Saturday, December 15, 2018

Lisp flavoured java control flow

Unconditional branches:
In order to implement a programming language isomorphic to the JVM instruction set, it is necessary to support goto statements, because that is how control flow is dealt with on a low level. Most high level languages on the JVM don't support this. To implement any language with branches, its good to be able to use symbols to define the branch targets.
(tag label)
(go label)
This is a trivial example, so it is not hard to see how this corresponds to bytecode.
label:
goto label
This syntax is mostly reminiscent of Common Lisp, much like the syntax for modification procedures except you can declare tags anywhere rather then in just a body. Common Lisp is influenced by the Lisp machine lisp so it is clearly best suited for low level programming.

Generalized conditionals:
The java virtual machine instruction set has several operations that deal with conditional branches. These require on certain conditions placed on operands on the stack, so they can be generalized by using appropriate logical operators. There are two types of logical operators that deal with single arguments operators that classify the sign of an argument and operators that determine rather or not a value is null or not. Then there are a variety of comparison operators. With the operations that determine sign it may not be obvious the to the Java programmer that is their actual purpose.
(isnonnegative num)
An operation like ispositive, isnegative, isnonnegative, isnonpositive, iszero, or isnotzero produces the correspond conditional branch instruction then as you would expect.
iload_0
ifge then
iconst_0
goto end_conditional
then:
iconst_1
end_conditional:
The builtin comparison operations which are limited to integers can be generalized with the appropriate comparison instructions for you.
(<= num1 num2)
The appropriate bytecode is then produced from this logical operation.
iload_0
iload_1
if_icmple then
iconst_0
goto end_conditional
then:
iconst_1
end_conditional:
If the values are compared are longs, then we can get the appropriate value with the proper comparison followed by a test on it. To do a conditional branch you can simply then use if with an appropriate conditional and it will generate the proper bytecode for you.
(if (<= num1 num2) (go label))
The presence of the if statement around the branch statement tells the compiler that you want to a conditional branch, so it can generate the proper bytecode for you, without increasing the complexity of the bytecode. Perhaps a different syntax could be created for this, but this works.
iload_0
iload_1
if_icmple label
This makes the instruction a bit more functional, which is a positive development, but it doesn't hide any functionality from the user or abstract anything anyway that you could use yourself in your programs.

Switches:
Lookupswitches can be dealt with by using the extend prefix notation as expected. The switch is passed as the first argument to the combiner followed by the operand to be pushed onto the stack.
(lookupswitch 
    (10 label1
     20 label2
     30 label3
    :else dftl)
    num)        
This operation could be dealt with instead by a case statement, but the purpose of this language is to hide nothing, and no operation in the instruction set from the programmer. This produces a complex instruction which looks something like this.
iload_0
lookupswitch 
 10 : label1
 20 : label2
 30 : label3
 default : dftl
As can be clearly seen here, the switch instruction still corresponds to the bytecode generated. Table switches can be automatically recognized and generated for you if you use an interval of keys or they can programmed directly.

Method calls:
The syntax to support method calls is not that interesting, Clojure already has a syntax that works well enough so there is no need to change it. The only issue is that the implementation will need to do a bit of reflection to determine the right kind of instruction to use, but once that is done the method call is translated directly to a sequence of instructions pushing values onto the stack followed by the correct invoke instruction.
(java.lang.Math/max 0 10)
(.charAt "string" 0)
The following produced assembly code demonstrates how these function calls correspond directly to invoke instructions on the java virtual machine.
iconst_0
bipush 10
invokestatic java.lang.Math/max(II)I
ldc "string"
iconst_0
invokevirtual java.lang.String/charAt(Ljava/lang/String;I)C
The benefit of using this notation is that this follows the same pattern of all the combiners defined so far, values are pushed onto the stack one by one and then the instruction is determined by the prefix. As mentioned previously, high level programmers might not realize that when programming Java the instance a method is being applied on is actually an operand pushed onto the stack.

No comments:

Post a Comment