Compiler Backend
Ola compiler backend bridge IR structure of module, function and Instruction Set Architecture(ISA) related callconv, registers and instructions. Its main features are code generation related lower and optimization related passes.
Target ISA mainly contains custom target instructions, registers which contain register class and register information, call convention and datalayout information.
Module in addition to inheritance IR is parsed out as Module structure, the description of its function and differ significantly with LLVM IR. Data information of Basic Block (BB) in the instructions target for the instruction, the register contains VRegs (Virtual Registers) and RegUnit (Register Unit) two categories, and contains a VRegs to Instructions (Insts) mapping. At the same time, instruction in data layout is referred to target instruction. Note that the structure of slot which contains stack base pointer and stack offset describe the memory access operations of parameters, local variables.
The lower provides the process of downgrading IR instruction to target instruction. Specifically it also requires copy parameters to VRegs for function call.
The pass module contains the register allocation (RegAlloc, RA) and spiller for analyzing the liveness of the pass and the function pass.
The backend of the Ola compiler compiles IR into target assembly code. It takes the standard LLVM IR generated by the frontend as input and the Ola assembly code as output.
Its pipeline process is as follows figure
LLVM IR Parser
The role of IR parser is to parse LLVM IR to Module Instruction. Its parsing briefly process is as follows:
Parser parse target DataLayout and Triple, the result is target data information.
Parser parse attribute group, the result is attribute information of module.
Parser parse local types in module, the result is registered type in module.
Parser parse global variables, the result is global variables symbol table.
Parser parse function which is mainly divided into arguments list and function body, the result is function structure in module instruction.
Parser parse metadata, the result is metadata map in module.
Optimizer: Optimization Passes on Parsed IR
Usually there are two kinds of compiler Optimization (Opt) passes, one is analysis passes and the other is transform passes. Currently our analysis pass is mainly Dominator Tree analysis pass, while transform passes contains Dead Code Elimination (DCE), Promote Memory to Register (Mem2Reg), Sparse Conditional Constant Propagation (SCCP).
Register and Instruction
The register description is as follows:
ABI Lower: Lowering Function Call
Ola Procedure Call Standard (OPCS) are as follows:
The stack initialization points to the first address of the frame stack after the
fp
register is loaded.The address will be increased when the
call
instruction is executed later.When the
ret
instruction is executed, thefp
register points to the address and falls back.
The Calling process is as follows:
call label
Caller uses
call
instruction to call a callee ascall functionLabel
, andfp
points to the new frame. Thepc
address returned by the callee is placed infp-1
which is detected by VM but not visible by the compiler backend. Its instructions pattern are as follows:function address
The address pointed to by
fp
before the function call is placed infp-2
asmstore [r8,-2] r8
. Its instructions pattern are as follows:passing arguments
Function parameter processing: the first three input parameters are placed in the three registers
r1
,r2
, andr3
. If there are more than 3 parameters, start with the fourth input parameter and descend accordingly infp-3
,fp-4
,...
. Its instructions pattern are as follows:local variables
Local variables inside the function start at old
fp
, and their addresses are stored incrementally.The single return value is stored in
r0
. If there are multi return values, it needs to be returned by a memory pointer that return the package data. Instruction pattern for single return value is as follows:
The call stack frames layout is as follows (due to the limitations of Markdown, you cannot directly embed images, but you can link to them):
For prophet library functions, its instructions pattern as:
First
.PROPHET
label binds to the prophet instance in assembly output.Then the program interacts with prophet read-only memory, get the return value from prophet pointer
[psp]
and write the result intor0
.At last, we use
r0
as indexed addressing to load return values from prophet memory.
Please note that Markdown doesn't support referencing figures with labels as LaTeX does, so you'd typically just describe the figure and provide a link to it or insert it directly if the platform allows for image embedding.
Instruction Selection Pattern
Conditional Branch Selection Pattern
Slot Elimination
This pass handles the stack slot for local variables.
Its pipeline is as follows:
Register Allocation and Coalescing
Register allocation use linear scan method, its briefly steps as follows:
we analyze liveness in function, for input and output find live in and live out.
we insert spill and reload code, push it to worklist.
we rewrite the virtual register for the target register.
While the steps for register coalescing is as follows:
We traverse the
movrr
target instructions at basic block of function on the module.If the two registers of operands are the same, then we push the instructions into the work list.
We can then remove the instructions in the work list from the function.
Assembly Printing
The basic format of the Ola assembly language is as follows:
Symbol indicates a symbol, which must start at the beginning of the line.
Instruction indicates an instruction, it is usually preceded by two spaces.
Directive indicates a pseudo operation.
Pseudo instruction means a pseudo instruction.
Directives, pseudo operations, and pseudo instruction helpers are all case-sensitive, but cannot be mixed.
Assembly Instructions
For simplicity, pseudo operations and pseudo instructions like .global
are not considered for now. Function entries that start with funcName
and end with :
are treated as a label. For example, main:
defines a label for a function named main.
Note: The symbols that usually start with .
indicate pseudo directives or pseudo operations, such as different segments. Symbols ending with :
indicate labels, such as function names and basic block numbers.
Instruction Format
The format of the internal assembly instruction is in the form of a three-address code:
Opcode
indicates the instruction helper, usually the instruction helper defined by OlaVM.Rd
indicates the instruction operation destination register, which is usually the register defined by OlaVM.Rn
indicates the first source operand of the instruction, usually a register defined by OlaVM.shifter_operand
indicates the instruction data processing operand, usually an immediate or register defined by OlaVM.
Memory Layout
After the program is loaded, pc
points to the zero address and the function stack frame is switched according to the hierarchy of function calls, and the memory address stack grows from a low address to a high address. When prophets are present in the program, an indexed addressing register is required to interact with the prophet memory.
Last updated