Machine code generation
Lecture 17
Machine code generation
- Convert intermediate code to assembly
- LLVM architecture
No longer need to worry about SimpleC semantics at all.
Intermediate code
- Closer to assembly, easier to translate
- What makes TAC different from assembly?
X86 assembly primer (AT&T style)
Operands
- Immediate: $1
- Register: %rax
- Memory: 0xdeadbeef
- Register indirect (pointers): (%rbp)
  - %rbp is a register that holds an address
- Register indirect plus offset: -32(%rbp)
  - Get value at address in %rbp minus 32 bytes
Memory layout for SimpleC
- One memory address per variable (including temps)
- Stored in function stack frame (will go over functions next time)
- https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64/
- %rbp is the base pointer
- Access stack frame memory with register indirect
- -32(%rbp) is the address 32 bytes below the base pointer, inside the stack frame
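The layout rule above can be sketched in a few lines of Python. This is a minimal sketch, not the course compiler: the helper name `allocate_slots`, the 8-byte slot size, and the ordering (built-in `true`/`false`, then locals, then temps) are assumptions, chosen because they reproduce the offsets used in this lecture's examples (`true` at -8, `x` at -24, `_t0` at -32).

```python
def allocate_slots(names, slot_size=8):
    """Map each variable name to an AT&T memory operand relative to %rbp.

    Hypothetical helper: every variable gets one 8-byte slot, and slots
    grow downward from the base pointer (-8, -16, -24, ...).
    """
    slots = {}
    for index, name in enumerate(names, start=1):
        slots[name] = f"{-slot_size * index}(%rbp)"
    return slots


# Assumed ordering: true/false first, then locals, then temps.
slots = allocate_slots(["true", "false", "x", "_t0", "_t1"])
print(slots["x"])    # -24(%rbp)
print(slots["_t0"])  # -32(%rbp)
```

With a table like this, every later translation step only needs to look up a name to get its operand string.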
Opcodes
CONST
CONST _t0 1
movq $1, -32(%rbp)
Assuming _t0 is already allocated in the stack frame
ASSIGN
ASSIGN true _t0
mov -32(%rbp), %rax
mov %rax, -8(%rbp)
Assuming true and _t0 are already allocated
How else can this be implemented?
A later optimization step can reduce the number of instructions, use faster instructions, etc.
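The CONST and ASSIGN translations can be sketched as two small emitter functions. This is only an illustration: the function names and the `slots` dictionary (variable name to operand string) are assumptions, not part of the course code. ASSIGN goes through %rax because x86 `mov` cannot take two memory operands.

```python
def emit_const(slots, dst, value):
    """CONST dst value: store an immediate directly into dst's stack slot."""
    # Assumes the value fits in a 32-bit signed immediate, which is all
    # that movq to memory can encode.
    return [f"movq ${value}, {slots[dst]}"]


def emit_assign(slots, dst, src):
    """ASSIGN dst src: copy src's slot to dst's slot through %rax."""
    return [
        f"mov {slots[src]}, %rax",
        f"mov %rax, {slots[dst]}",
    ]


# Offsets taken from the examples in this lecture.
slots = {"true": "-8(%rbp)", "_t0": "-32(%rbp)"}
for line in emit_const(slots, "_t0", 1) + emit_assign(slots, "true", "_t0"):
    print(line)
```

Each emitter returns a list of lines, so a driver can concatenate the results opcode by opcode.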
Arithmetic operators
SUB _t5 x _t4
mov -24(%rbp), %rax
mov -64(%rbp), %rcx
sub %rcx, %rax
mov %rax, -72(%rbp)
All temps and locals are in the stack frame in memory. In this example we load both operands into registers, operate on them there, and store the result back into memory afterwards.
Are there any ways to make this more efficient? Will they work for any given SimpleC program?
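The load, operate, store pattern above can be sketched as one emitter. The function name, the `slots` dictionary, and the opcode table are hypothetical; only SUB appears in this lecture, and the ADD entry is an assumed extension included to show how the table would grow.

```python
# TAC opcode -> x86 mnemonic. SUB is from the lecture; ADD is an assumption.
ARITH_OPS = {"SUB": "sub", "ADD": "add"}


def emit_arith(slots, opcode, dst, left, right):
    """OP dst left right: compute left OP right and store it in dst."""
    mnemonic = ARITH_OPS[opcode]
    return [
        f"mov {slots[left]}, %rax",   # load the left operand
        f"mov {slots[right]}, %rcx",  # load the right operand
        f"{mnemonic} %rcx, %rax",     # AT&T order: %rax = %rax OP %rcx
        f"mov %rax, {slots[dst]}",    # store the result
    ]


# Offsets taken from the SUB example in this lecture.
slots = {"x": "-24(%rbp)", "_t4": "-64(%rbp)", "_t5": "-72(%rbp)"}
for line in emit_arith(slots, "SUB", "_t5", "x", "_t4"):
    print(line)
```

Because every operand is reloaded from memory, the emitter needs no knowledge of what earlier instructions left in the registers.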
Labels and branching
GOTOLE _l2_main x _t2
...
LABEL _l2_main
...

mov -24(%rbp), %rax
mov -48(%rbp), %rcx
cmp %rcx, %rax
jle _l2_main
...
_l2_main:
...
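x86 has a flags register: cmp sets the flags, and the conditional jump (jXX) instructions test them. In AT&T syntax, cmp %rcx, %rax compares %rax against %rcx, so the jle here is taken when %rax <= %rcx.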
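The branching opcodes can be sketched the same way. The function names and the `slots` dictionary are assumptions for illustration; the instruction sequences for GOTOLE, GOTOZE, GOTO, and LABEL follow the ones used in this lecture's assembly.

```python
# Conditional-jump mnemonic for each two-operand compare opcode.
COMPARE_JUMPS = {"GOTOLE": "jle"}


def emit_branch(slots, opcode, label, *operands):
    """Emit assembly for GOTO, GOTOZE, or a compare-and-jump opcode."""
    if opcode == "GOTO":
        return [f"jmp {label}"]
    if opcode == "GOTOZE":
        (value,) = operands
        return [f"mov {slots[value]}, %rax", "cmp $0, %rax", f"jz {label}"]
    left, right = operands
    return [
        f"mov {slots[left]}, %rax",
        f"mov {slots[right]}, %rcx",
        "cmp %rcx, %rax",  # sets flags from %rax - %rcx
        f"{COMPARE_JUMPS[opcode]} {label}",
    ]


def emit_label(label):
    """LABEL name: an assembly label at the current position."""
    return [f"{label}:"]


slots = {"x": "-24(%rbp)", "_t2": "-48(%rbp)"}
lines = emit_branch(slots, "GOTOLE", "_l2_main", "x", "_t2") + emit_label("_l2_main")
for line in lines:
    print(line)
```

Other comparisons would only need more entries in `COMPARE_JUMPS`, since the cmp sequence stays the same and only the jXX mnemonic changes.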
Example program
SimpleC
main() {
  int x;
  input x;
  while(x > 0) {
    x = x - 1;
  }
  return 0;
}
Intermediate code
[main
CONST _t0 1
ASSIGN true _t0
CONST _t1 0
ASSIGN false _t1
INPUT x
LABEL _l0_main
CONST _t2 0
GOTOLE _l2_main x _t2
CONST _t3 1
GOTO _l3_main
LABEL _l2_main
CONST _t3 0
LABEL _l3_main
GOTOZE _l1_main _t3
CONST _t4 1
SUB _t5 x _t4
ASSIGN x _t5
GOTO _l0_main
LABEL _l1_main
CONST _t6 0
RETURN _t6
]
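To check what this intermediate code does before translating it, it can help to execute it directly. The following is a minimal sketch of a TAC interpreter, not part of the course tooling. The opcode semantics are assumptions inferred from the assembly in this lecture (GOTOLE jumps when the first operand is <= the second, GOTOZE jumps when its operand is zero), and the `[main` / `]` wrapper lines are left out of the program text.

```python
PROGRAM = """
CONST _t0 1
ASSIGN true _t0
CONST _t1 0
ASSIGN false _t1
INPUT x
LABEL _l0_main
CONST _t2 0
GOTOLE _l2_main x _t2
CONST _t3 1
GOTO _l3_main
LABEL _l2_main
CONST _t3 0
LABEL _l3_main
GOTOZE _l1_main _t3
CONST _t4 1
SUB _t5 x _t4
ASSIGN x _t5
GOTO _l0_main
LABEL _l1_main
CONST _t6 0
RETURN _t6
"""


def run_tac(source, inputs):
    """Interpret TAC text; return (return value, final variable values)."""
    code = [line.split() for line in source.strip().splitlines()]
    # Map each label to the index of its LABEL instruction.
    labels = {ins[1]: i for i, ins in enumerate(code) if ins[0] == "LABEL"}
    env, pending, pc = {}, iter(inputs), 0
    while pc < len(code):
        op, *args = code[pc]
        pc += 1
        if op == "CONST":
            env[args[0]] = int(args[1])
        elif op == "ASSIGN":
            env[args[0]] = env[args[1]]
        elif op == "INPUT":
            env[args[0]] = next(pending)
        elif op == "SUB":
            env[args[0]] = env[args[1]] - env[args[2]]
        elif op == "GOTO":
            pc = labels[args[0]]
        elif op == "GOTOLE":
            if env[args[1]] <= env[args[2]]:
                pc = labels[args[0]]
        elif op == "GOTOZE":
            if env[args[1]] == 0:
                pc = labels[args[0]]
        elif op == "RETURN":
            return env[args[0]], env
        elif op != "LABEL":
            raise ValueError(f"unknown opcode: {op}")
    return None, env


result, env = run_tac(PROGRAM, [3])
print(result, env["x"])  # 0 0
```

With input 3 the loop counts x down to 0 and the program returns 0, matching the SimpleC source.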
Assembly code
    .text
    .globl main
    .type main, @function
main:
    push %rbp
    mov %rsp, %rbp
    sub $96, %rsp
    movq $1, -32(%rbp)
    mov -32(%rbp), %rax
    mov %rax, -8(%rbp)
    movq $0, -40(%rbp)
    mov -40(%rbp), %rax
    mov %rax, -16(%rbp)
    call input_int64_t@PLT
    mov %rax, -24(%rbp)
_l0_main:
    movq $0, -48(%rbp)
    mov -24(%rbp), %rax
    mov -48(%rbp), %rcx
    cmp %rcx, %rax
    jle _l2_main
    movq $1, -56(%rbp)
    jmp _l3_main
_l2_main:
    movq $0, -56(%rbp)
_l3_main:
    mov -56(%rbp), %rax
    cmp $0, %rax
    jz _l1_main
    movq $1, -64(%rbp)
    mov -24(%rbp), %rax
    mov -64(%rbp), %rcx
    sub %rcx, %rax
    mov %rax, -72(%rbp)
    mov -72(%rbp), %rax
    mov %rax, -24(%rbp)
    jmp _l0_main
_l1_main:
    movq $0, -80(%rbp)
    mov -80(%rbp), %rax
    jmp _main_return
_main_return:
    mov %rbp, %rsp
    pop %rbp
    ret