Wednesday, January 9, 2019

Register allocation through graph coloring

Instruction set architectures can be either register based or stack based. In general, register based machines are more low level, and they have a certain set of registers that variables can be stored in, otherwise they will be spilled into memory. Register allocation is considerably important for at least two reasons, first of all because no significant high level programming system exposes registers to the programmer. Generally, a high level architecture is based upon stacks which abstract registers away, necessitating allocation. Secondly, even among register machines there is a great deal of differences between the register files they have, the number of registers, their size, and so on, so that register allocation is needed for portability. The ideal today is to have a portable high level stack machine architecture (like the JVM or the CLR) which has both properties.

The issue is really how to pass operands to functions. Registers place special conditions on how operands can be passed to functions, where as stacks do not. A register based system of operand passing makes it so that you have to constantly consider where things are being allocated. Lisp requires a high level abstraction of operand passing, which is why it has always been so effective on top of a stack machine architecture. Any stack machine architecture can be made immediately amenable to Lisp, by making push instructions into atomic forms, and making other instructions into functional combiners. Fortunately, the problem of register allocation has an effective solution.

In order to proceed with dealing with this problem, we will start with a set of variables and we will perform liveness analysis in order to form an interference graph. The problem of register allocation can therefore be reduced to a problem of graph coloring, which we can deal with effectively using our understanding of graph theory. A coloring of a graph in this context is the same thing as a partition of into independent sets. The independent sets are registers to which values can be allocated to without effecting the other variables. One special case is that of an SSA (like the LLVM) for which the interference graph is chordal. Chordal graphs are a familiar class of graphs with only polynomial time coloring, so that makes register allocation even more readily computable.

Since register allocation is a graph coloring problem necessary to program in any register machine, you generally shouldn't program in the assembly language of a register machine like X86 or ARM unless you are a compiler developer. The register allocator and instruction optimizer can produce assembly code for these architectures better then or at least as good any individual developer can. On the other hand, it is sometimes acceptable to program in a high level stack based assembly language, as the language is sufficiently high level to be easily dealt with. Then you could understand a program all the way down to its assembly.

No comments:

Post a Comment