Saturday, February 9, 2019

Ontology of JVM instructions continued

In the previous post we classified JVM instructions, largely by considering their nature as operations in a stack machine. Two classes of operations stood out in this process: value push operations and generic versions of the typed operations of the JVM. The nullary push instructions (constant push instructions, local access, getstatic) correspond to the operands of a combiner. The generic instruction classes group isomorphic operators that deal with different types, where the specific typed instruction can be determined by the compiler from the operand types. These two things can be used to help build a language from the JVM instruction set.
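
As a rough illustration of how a compiler might pick the typed instruction from a generic one, here is a minimal sketch. The names `type-prefix` and `typed-instruction` are hypothetical and not part of any existing library, and the sketch only covers arithmetic operations that exist for all four numeric types (add, sub, mul, div, rem, neg).

```clojure
;; Hypothetical sketch: resolve a generic operation to a typed JVM opcode.
;; `type-prefix` and `typed-instruction` are illustrative names only.

(def type-prefix
  "Opcode prefix letter for each numeric computational type."
  {:int "i" :long "l" :float "f" :double "d"})

(defn typed-instruction
  "Returns the typed opcode symbol for a generic arithmetic operation,
  e.g. 'add applied to :long operands gives 'ladd."
  [generic-op operand-type]
  (if-let [prefix (type-prefix operand-type)]
    (symbol (str prefix (name generic-op)))
    (throw (ex-info "No typed variant for operand type"
                    {:op generic-op :type operand-type}))))

(typed-instruction 'add :int)    ;; => iadd
(typed-instruction 'mul :double) ;; => dmul
```

The operand type is the only extra input needed, which is why the generic form can stand in for the whole class of typed instructions.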

One other detail worth considering is instruction parameters. A large share of the instruction set requires no instruction parameters: the aforementioned value push instructions, the primitive transformations, the return instruction, instructions which take an array as a parameter, and various procedures, among others. Others can have their instruction parameters eliminated through reflection. There are a few instructions, belonging to three different classes, whose parameters seem harder to eliminate. These instructions necessitate the use of extended prefix notation in any thin wrapper over the JVM.

Variable operations:
The variable operations require some variable information passed to them as an instruction parameter. The atomic variable modification operations require that the atomic variable be passed as an instruction parameter. The putfield instruction requires the field name as an instruction parameter. Array store is distinguished from the others by the fact that it doesn't require any instruction parameters, since the array and index are taken from the stack. I ultimately decided to solve this problem by introducing something like setf, which takes as an instruction parameter all the information about the generalized variable being modified.
(def generalized-variable-modification 
  '#{fstore_1 lstore_3 sastore bastore dastore 
    dstore_3 dstore_1 istore istore_0 astore_2 
    fastore astore istore_2 istore_1 iinc castore 
    lstore_2 istore_3 dstore_2 lstore dstore_0 
    putstatic lstore_1 fstore_2 astore_0 dstore 
    astore_1 putfield fstore_0 lastore iastore 
    fstore lstore_0 fstore_3 aastore astore_3})
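
To make the setf-like idea more concrete, here is a minimal sketch of how a description of a generalized variable could be turned into a store instruction. The place maps, their keys, and the function `store-instruction` are assumptions made up for illustration; they are not the actual design.

```clojure
;; Hypothetical sketch of a setf-like lookup. The place maps and the
;; function name are illustrative assumptions, not an existing API.

(def type-prefix
  {:int "i" :long "l" :float "f" :double "d" :reference "a"})

(def array-prefix
  "Array stores additionally distinguish byte, char and short elements."
  (merge type-prefix {:byte "b" :char "c" :short "s"}))

(defn store-instruction
  "Returns a vector of opcode plus instruction parameters that stores
  the value on top of the stack into the described place."
  [{:keys [kind index owner field descriptor] t :type}]
  (case kind
    ;; Local variables: slots 0 to 3 have dedicated opcodes.
    :local  (if (<= 0 index 3)
              [(symbol (str (type-prefix t) "store_" index))]
              [(symbol (str (type-prefix t) "store")) index])
    ;; Fields carry the owner class, field name and descriptor.
    :field  ['putfield owner field descriptor]
    :static ['putstatic owner field descriptor]
    ;; Array stores take the array and index from the stack instead.
    :array  [(symbol (str (array-prefix t) "astore"))]))

(store-instruction {:kind :local :type :int :index 1})    ;; => [istore_1]
(store-instruction {:kind :local :type :double :index 7}) ;; => [dstore 7]
(store-instruction {:kind :array :type :char})            ;; => [castore]
```

All of the variable information lives in the place description, so the caller only ever writes one form regardless of which store instruction ends up being emitted.
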
Type operations:
The type operations require a type passed to them as an instruction parameter. They come in two forms: the reference allocations and the reference type check operations. It is easy to see why the reference allocations, unlike the other reference operations, must have a type passed to them as an instruction parameter: there isn't a reference created yet to do reflection on. With the reference type check operations, the entire operation is defined by its type parameter, so they also belong to this category.
(def reference-allocation 
  '#{multianewarray new anewarray newarray})

(def reference-type-check
  '#{instanceof checkcast})
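
The same distinction is visible from Clojure's Java interop. The snippet below is only an illustration of the point about reflection; the var names `existing` and `fresh` are made up for the example.

```clojure
;; An existing reference carries its own type, recoverable by reflection.
(def existing (java.util.ArrayList.))
(class existing) ;; => java.util.ArrayList

;; Allocation has no reference to inspect yet, so the type must be named.
(def fresh (java.util.HashMap.))

;; A type check is likewise defined by the type it is given.
(instance? java.util.List existing) ;; => true
(instance? java.util.List fresh)    ;; => false
```
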
Jump operations:
The jump operations require some label as an instruction parameter. This includes all process control instructions except for those that deal with method call and return. As the instructions are defined by jumping to a label, the label must be passed as an instruction parameter.
(def jump
 '#{ifeq iflt ifne ifgt ifnull ifle ifge ifnonnull
    if_icmpge if_icmple if_icmplt if_icmpne
    if_acmpeq if_icmpeq if_icmpgt if_acmpne
    jsr ret goto lookupswitch tableswitch})
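
As a sketch of what passing a label as an instruction parameter could look like, the code below treats keywords in an instruction sequence as labels and replaces label parameters with instruction indices. The representation and the names `label-positions` and `resolve-labels` are hypothetical, and real bytecode uses byte offsets rather than instruction indices, so this is a simplification.

```clojure
;; Hypothetical sketch: code is a vector whose elements are either
;; keywords (labels) or instruction vectors such as [goto :start].

(defn label-positions
  "Maps each label to the index of the instruction that follows it."
  [code]
  (loop [items code, position 0, labels {}]
    (if-let [[item & more] (seq items)]
      (if (keyword? item)
        (recur more position (assoc labels item position))
        (recur more (inc position) labels))
      labels)))

(defn resolve-labels
  "Drops the label markers and replaces label parameters with indices."
  [code]
  (let [labels (label-positions code)]
    (->> code
         (remove keyword?)
         (mapv (fn [instruction]
                 (mapv #(get labels % %) instruction))))))

(resolve-labels
 '[:start
   [iload_0]
   [ifeq :done]
   [iinc 0 -1]
   [goto :start]
   :done
   [return]])
;; => [[iload_0] [ifeq 4] [iinc 0 -1] [goto 0] [return]]
```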
