Function Implementation
Compiler Implementation
COP-3402 COP-3402
What are functions?
How do functions implement local state?
Application Binary Interface (ABI)
Binary (machine code) conventions for implementing functions
Conventions
- Stack frame layout
- Parameters
- Return address
- Base pointer
- Locals
- Parameter passing
- Return values
System/architecture specific ABIs
System V AMD64 ABI
Eli Bendersky, Stack frame layout on x86 64
- Specific to System V (many *nix systems)
- Specific to AMD64 (x86 64) instruction set architecture (ISA)
Calling convention example
# assembly file metadata .file "stdin" .section .note.GNU-stack,"",@progbits .text
# function label .globl main .type main, @function main:
# prologue (update base pointer, save rbx) pushq %rbp movq %rsp, %rbp push %rbx
call func
# set return value mov $10, %rax
# epilogue (restore stack pointer, restore base pointer) pop %rbx mov %rbp, %rsp pop %rbp ret
Stack frame management
(Diagram)
Show what happens to the stack and register states changes after each instruction.
main: # prologue pushq %rbp # save old base ponter movq %rsp, %rbp # set new base pointer push %rbx # %rbx is callee-saved # call call func # return value mov $10, %rax # epilogue pop %rbx # restore rbx for the caller mov %rbp, %rsp # restore old stack pointer pop %rbp # restore old base pointer ret
func: # prologue pushq %rbp # save old base ponter movq %rsp, %rbp # set new base pointer push %rbx # %rbx is callee-saved # return value mov $5, %rax # epilogue pop %rbx # restore rbx for the caller mov %rbp, %rsp # restore old stack pointer pop %rbp # restore old base pointer ret
Language processing review
Syntax trees
- Explicit names for syntactic constructs, e.g.,
- Arithmetic expression
- For-loop
- etc.
Language implementation
- Tree traversal
- Apply rules to translate/interpret
- One for each syntactic construct
Infix-to-postfix translator
Assume we have string concat
function to join any number of strings.
E -> E + E { E.value = concat(E1.value, E2.value, "+") } E -> E - E { E.value = concat(E1.value, E2.value, "-") } E -> E * E { E.value = concat(E1.value, E2.value, "*") } E -> E / E { E.value = concat(E1.value, E2.value, "/") } E -> N { E.value = N.value } N -> 0 { N.value = "0" } N -> 1 { N.value = "1" } N -> 2 { N.value = "2" } ... N -> 9 { N.value = "9" }
Note that we no longer need to convert ascii to a machine integer, since we can just print the output program in ascii as well.
CodeGen with an ANTLR Listener
Syntax tree for SimpleIR
(Diagram)
Draw tree for
function main phonyvar := call func return 10
# assembly file metadata (enterUnit) .file "stdin" .section .note.GNU-stack,"",@progbits .text
# function label (enterFunction) .globl main .type main, @function main:
# prologue (enterFunction) pushq %rbp movq %rsp, %rbp push %rbx
# enterCall call func
# set return value (enterReturn) mov $10, %rax
# epilogue (exitFunction) pop %rbx mov %rbp, %rsp pop %rbp ret
Note that the return statement doesn't call return. That's done in/after the epilogue. Return sets the value of the return register, i.e., %rax
.
enterUnit
- Generate file pseudo-ops
- (Given for the project)
enterFunction
- Generate function label
- Generate prologue
enterCall
- Generate function call
exitFunction
- Generate epilogue (and return instruction)
Code generation
(Diagram)
- Use syntax tree for main.ir
- Apply
Example: enterUnit
- Function writes out assembly code
Tips
- Remember: we are writing a translator not trying to run the IR
- Code generator prints out strings (
self.outfile.write("...")
) - Use the text from the input program to fill in assembly language templates (
ctx.NAME()
) - The grammar gives names to the parts of the syntax tree