Code Generation: Variables, Assignment, and Expressions
Lecture 10
One approach to implementation
Program Development by Stepwise Refinement, Niklaus Wirth
- the per-component approach
- waterfall method leads to pain (in this and other large, complex software)
- if you try to code everything before you even build, much less test, you are setting yourself up for a lot of time and pain
- creating a minimal test
- checking reachability
- the stepwise refinement approach
- get the spec, use comments to plan your code (no time spent actually coding yet; when building a house, you wouldn't start with the roof; planning is key; take time up front to make the work easy: constructive laziness)
- take a step and break it down even more
- use references to figure out how to do each piece
- put the pieces together to solve the step
- move on to next step
- test with the minimal test, create more tests
- the paper: Program Development by Stepwise Refinement, by Wirth (who recently passed away)
static void check_unaryexpr(T_expr expr) {
  // 1. check the type of the subexpression (the thing being referenced or dereferenced)
  check_expr(expr->unaryexpr.expr);  // result is in expr->unaryexpr.expr->type
  switch (expr->unaryexpr.op) {
  case E_op_ref: {
    // 2. create new pointer type to the type of the subexpression, e.g., pointer<int>
    // (braces needed: a declaration cannot directly follow a case label)
    T_type subexpr_type = expr->unaryexpr.expr->type;
    expr->type = create_pointertype(subexpr_type);
    break;
  }
  case E_op_deref:
    // 2. make sure it's a pointer type (otherwise a type error)
    if (expr->unaryexpr.expr->type->kind == E_pointertype) {
      // 3. unbox the type, i.e., get the type that is being pointed to
      T_type unboxed_type = expr->unaryexpr.expr->type->pointertype;
      // 4. update the type of the unaryexpr that we just checked
      expr->type = unboxed_type;
    } else {
      type_error("tried to dereference something that wasn't a pointer");
    }
    break;
  }
}
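To experiment with the ref/deref typing rules outside the compiler, here is a minimal standalone sketch. The names Type, Kind, make_pointer, type_of_ref, and type_of_deref are made up for this example and are not the compiler's T_type or create_pointertype; NULL is used here in place of a call to type_error.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the compiler's type structures. */
typedef enum { K_INT, K_POINTER } Kind;

typedef struct Type {
  Kind kind;
  struct Type *pointee; /* only meaningful when kind == K_POINTER */
} Type;

/* Build pointer<t>; plays the role of create_pointertype in the notes. */
Type *make_pointer(Type *t) {
  Type *p = malloc(sizeof *p);
  if (p == NULL) return NULL;
  p->kind = K_POINTER;
  p->pointee = t;
  return p;
}

/* Type of &e, given the type of e. */
Type *type_of_ref(Type *operand) { return make_pointer(operand); }

/* Type of *e, given the type of e; NULL signals a type error. */
Type *type_of_deref(Type *operand) {
  if (operand == NULL || operand->kind != K_POINTER) return NULL;
  return operand->pointee; /* unbox the pointed-to type */
}
```

Taking a reference and then dereferencing it gives back the original type, while dereferencing a plain int yields the error value.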
X86 instructions
Generating assembly in our compiler
#define PUSH(arg1) fprintf(codegenout, "\tpush\t%s\n", arg1)
#define POP(arg1) fprintf(codegenout, "\tpop\t%s\n", arg1)
#define MOV(arg1, arg2) fprintf(codegenout, "\tmov\t%s, %s\n", arg1, arg2)
#define MOV_FROM_IMMEDIATE(arg1, arg2) fprintf(codegenout, "\tmov\t$%d, %s\n", arg1, arg2)
#define MOV_FROM_OFFSET(offset, reg) fprintf(codegenout, "\tmov\t-%d(%%rbp), %s\n", offset, reg)
#define MOV_TO_OFFSET(reg, offset) fprintf(codegenout, "\tmov\t%s, -%d(%%rbp)\n", reg, offset)
#define MOV_FROM_GLOBAL(reg, global) fprintf(codegenout, "\tmov\t%s(%%rip), %s\n", global, reg)
#define MOV_TO_GLOBAL(reg, global) fprintf(codegenout, "\tmov\t%s, %s(%%rip)\n", reg, global)
#define ADD(arg1, arg2) fprintf(codegenout, "\tadd\t%s, %s\n", arg1, arg2)
#define SUB(arg1, arg2) fprintf(codegenout, "\tsub\t%s, %s\n", arg1, arg2)
#define SUBCONST(arg1, arg2) fprintf(codegenout, "\tsub\t$%d, %s\n", arg1, arg2)
#define IMUL(arg1, arg2) fprintf(codegenout, "\timul\t%s, %s\n", arg1, arg2)
#define CDQ() fprintf(codegenout, "\tcdq\n")
#define IDIV(reg) fprintf(codegenout, "\tidiv\t%s\n", reg)
#define CALL(arg1) fprintf(codegenout, "\tcall\t%s\n", arg1)
#define RET fprintf(codegenout, "\tret\n")
#define COMMENT(arg1) fprintf(codegenout, "\t# %s\n", arg1)
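Each macro just prints one line of AT&T-syntax assembly to codegenout. A small sketch of how two of them expand is below; emit_copy_local is a hypothetical helper invented for this example, and declaring codegenout as a plain global FILE pointer is an assumption about how the compiler sets it up.

```c
#include <stdio.h>

/* Assumed to be a global output stream, as the macros imply. */
FILE *codegenout;

/* Two of the macros from the list above, copied unchanged. */
#define MOV_FROM_OFFSET(offset, reg) fprintf(codegenout, "\tmov\t-%d(%%rbp), %s\n", offset, reg)
#define MOV_TO_OFFSET(reg, offset) fprintf(codegenout, "\tmov\t%s, -%d(%%rbp)\n", reg, offset)

/* Hypothetical helper: copy the 8-byte local at from_offset to to_offset. */
void emit_copy_local(int from_offset, int to_offset) {
  MOV_FROM_OFFSET(from_offset, "%rax"); /* load the source slot into %rax */
  MOV_TO_OFFSET("%rax", to_offset);     /* store %rax into the target slot */
}
```

With codegenout set to stdout, emit_copy_local(8, 16) prints a load from -8(%rbp) into %rax followed by a store from %rax to -16(%rbp). Note that offsets are passed as positive numbers; the macros add the minus sign.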
Tips for each generator
codegen_func
Handling scope
See how main does it
Inserting offsets into the symbol table
insert_offset(current_offset_scope, func->paramlist->ident, 8);
Prologue
emit_prologue(current_offset_scope->stack_size);
Copying parameters to memory
int offset = lookup_offset_in_scope(current_offset_scope, func->paramlist->ident); MOV_TO_OFFSET("%rdi", offset);
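The offset bookkeeping used in the steps above could look roughly like the sketch below. This is a guess at the behavior, not the course's implementation: the OffsetScope struct, the fixed-size arrays, and the -1 error value are all assumptions; only the function names, argument order, and the stack_size field come from the calls shown in these notes.

```c
#include <string.h>

#define MAX_VARS 64 /* arbitrary limit for this sketch */

/* Hypothetical scope record: variable names and their distance below %rbp. */
typedef struct {
  const char *names[MAX_VARS];
  int offsets[MAX_VARS];
  int count;
  int stack_size; /* total bytes reserved so far */
} OffsetScope;

/* Reserve `size` bytes for `ident`; returns its offset, or -1 if full. */
int insert_offset(OffsetScope *scope, const char *ident, int size) {
  if (scope->count >= MAX_VARS) return -1;
  scope->stack_size += size;
  scope->names[scope->count] = ident;
  scope->offsets[scope->count] = scope->stack_size;
  scope->count++;
  return scope->stack_size;
}

/* Find the offset recorded for `ident`; returns -1 if it is not in scope. */
int lookup_offset_in_scope(const OffsetScope *scope, const char *ident) {
  for (int i = 0; i < scope->count; i++) {
    if (strcmp(scope->names[i], ident) == 0) return scope->offsets[i];
  }
  return -1;
}
```

Under these assumptions, the first 8-byte variable inserted gets offset 8, the second gets 16, and stack_size ends up holding the total the prologue needs to reserve.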
Generate code for declarations
Generate code for statements
Epilogue
emit_epilogue();
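For reference, a conventional x86-64 prologue and epilogue can be written with the macros from earlier in these notes. The bodies below are assumptions about what emit_prologue and emit_epilogue might emit, not the provided code; in particular, rounding the frame size up to a multiple of 16 is a choice made for this sketch.

```c
#include <stdio.h>

FILE *codegenout; /* assumed global output stream for generated assembly */

/* Macros copied from the list earlier in these notes. */
#define PUSH(arg1) fprintf(codegenout, "\tpush\t%s\n", arg1)
#define POP(arg1) fprintf(codegenout, "\tpop\t%s\n", arg1)
#define MOV(arg1, arg2) fprintf(codegenout, "\tmov\t%s, %s\n", arg1, arg2)
#define SUBCONST(arg1, arg2) fprintf(codegenout, "\tsub\t$%d, %s\n", arg1, arg2)
#define RET fprintf(codegenout, "\tret\n")

/* Assumed body: save the caller's frame pointer, then reserve the locals. */
void emit_prologue(int stack_size) {
  /* Round up to a multiple of 16 so %rsp is 16-byte aligned once the frame is set up. */
  int aligned = (stack_size + 15) / 16 * 16;
  PUSH("%rbp");
  MOV("%rsp", "%rbp");
  if (aligned > 0) SUBCONST(aligned, "%rsp");
}

/* Assumed body: release the locals, restore the frame pointer, return. */
void emit_epilogue(void) {
  MOV("%rbp", "%rsp");
  POP("%rbp");
  RET;
}
```

The local variable slots then live at negative offsets from %rbp, which is why MOV_TO_OFFSET and MOV_FROM_OFFSET address memory relative to %rbp.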
codegen_decllist
insert_offset(current_offset_scope, decllist->decl->ident, 8);
codegen_assignstmt
Popping from the stack
POP("%rax");
- Looking up the offset:
lookup_offset_in_scope
Moving result to stack
MOV_TO_OFFSET("%rax", offset);
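Putting the assignment steps in order, a sketch for a statement like x = 5 is shown below. emit_int_literal and emit_assign_literal are hypothetical helpers for this example; in the real compiler the right-hand side would be generated by codegen_expr and the offset would come from lookup_offset_in_scope.

```c
#include <stdio.h>

FILE *codegenout; /* assumed global output stream for generated assembly */

/* Macros copied from the list earlier in these notes. */
#define PUSH(arg1) fprintf(codegenout, "\tpush\t%s\n", arg1)
#define POP(arg1) fprintf(codegenout, "\tpop\t%s\n", arg1)
#define MOV_FROM_IMMEDIATE(arg1, arg2) fprintf(codegenout, "\tmov\t$%d, %s\n", arg1, arg2)
#define MOV_TO_OFFSET(reg, offset) fprintf(codegenout, "\tmov\t%s, -%d(%%rbp)\n", reg, offset)

/* Stand-in for codegen_expr on an integer literal: the result ends up on the stack. */
static void emit_int_literal(int value) {
  MOV_FROM_IMMEDIATE(value, "%rax");
  PUSH("%rax");
}

/* Hypothetical helper: code for `ident = value`, where ident lives `offset` bytes below %rbp. */
void emit_assign_literal(int value, int offset) {
  emit_int_literal(value);       /* 1. evaluate the right-hand side */
  POP("%rax");                   /* 2. pop its value off the stack */
  MOV_TO_OFFSET("%rax", offset); /* 3. store it in the variable's slot */
}
```

The push immediately followed by a pop is redundant here, but keeping it means every expression generator follows the same contract: its result is always on top of the stack.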
codegen_identexpr
- Look up offset in scope as usual
Moving variable value from stack
MOV_FROM_OFFSET(offset, "%rax");
Pushing intermediate values onto the stack
PUSH("%rax");
codegen_callexpr
Popping the argument off the stack into %rdi
POP("%rdi");
Calling the function
CALL(expr->callexpr.ident);
- Pushing an intermediate value onto the stack as usual
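The call sequence can be sketched end to end for a single-argument call such as f(3). emit_call_with_literal is a hypothetical helper for this example; the real generator would call codegen_expr on the argument instead of loading a literal.

```c
#include <stdio.h>

FILE *codegenout; /* assumed global output stream for generated assembly */

/* Macros copied from the list earlier in these notes. */
#define PUSH(arg1) fprintf(codegenout, "\tpush\t%s\n", arg1)
#define POP(arg1) fprintf(codegenout, "\tpop\t%s\n", arg1)
#define MOV_FROM_IMMEDIATE(arg1, arg2) fprintf(codegenout, "\tmov\t$%d, %s\n", arg1, arg2)
#define CALL(arg1) fprintf(codegenout, "\tcall\t%s\n", arg1)

/* Hypothetical helper: code for `callee(arg)` with one integer literal argument. */
void emit_call_with_literal(const char *callee, int arg) {
  MOV_FROM_IMMEDIATE(arg, "%rax"); /* evaluate the argument ... */
  PUSH("%rax");                    /* ... leaving its value on the stack */
  POP("%rdi");                     /* first integer argument goes in %rdi */
  CALL(callee);                    /* the callee leaves its result in %rax */
  PUSH("%rax");                    /* push the result as an intermediate value */
}
```

This matches the callee side in codegen_func, which copies %rdi into the parameter's stack slot.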
codegen_intexpr
Setting an immediate value:
MOV_FROM_IMMEDIATE((int) expr->intexpr, "%rax");
- How can we push the intermediate value onto the stack?
codegen_charexpr
Characters are equivalent to integers in our implementation.
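One way to handle both literal kinds is sketched below: load the immediate into %rax, push it, and treat a character as its integer code. emit_intexpr and emit_charexpr are hypothetical names taking plain C values; the real generators take a T_expr.

```c
#include <stdio.h>

FILE *codegenout; /* assumed global output stream for generated assembly */

/* Macros copied from the list earlier in these notes. */
#define PUSH(arg1) fprintf(codegenout, "\tpush\t%s\n", arg1)
#define MOV_FROM_IMMEDIATE(arg1, arg2) fprintf(codegenout, "\tmov\t$%d, %s\n", arg1, arg2)

/* Hypothetical helper: leave an integer literal on top of the stack. */
void emit_intexpr(int value) {
  MOV_FROM_IMMEDIATE(value, "%rax");
  PUSH("%rax");
}

/* Hypothetical helper: a character literal is emitted as its integer code. */
void emit_charexpr(char c) {
  emit_intexpr((int) c);
}
```

On an ASCII system, emit_charexpr('A') emits the same two instructions as emit_intexpr(65).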
codegen_unaryexpr
Only needed for project 4
codegen_binaryexpr
How do we generate code for the left and right operands?
Recursively call codegen_expr on them, e.g.,
codegen_expr(expr->binaryexpr.left);
How do we pop the value off the stack?
Be sure to use two different registers, e.g., %rax and %rbx
Pick the right operation
Addition is ADD("%rbx", "%rax");
Other operators: SUB, IMUL
Division and mod are slightly different
case E_op_divide:
  COMMENT("do the division");
  CDQ();
  IDIV("%rbx");
  // quotient is in %rax
  break;
case E_op_mod:
  COMMENT("do the remainder");
  CDQ();
  IDIV("%rbx");
  // remainder is in %rdx
  MOV("%rdx", "%rax");
  break;
Simple register allocation: every result goes to %rax and then gets pushed onto the stack.
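The whole scheme for binary expressions fits in a short recursive function. The sketch below uses a made-up Expr tree (not the course's T_expr) and a made-up function name, codegen_sketch, and covers only addition, subtraction, and multiplication.

```c
#include <stdio.h>

FILE *codegenout; /* assumed global output stream for generated assembly */

/* Macros copied from the list earlier in these notes. */
#define PUSH(arg1) fprintf(codegenout, "\tpush\t%s\n", arg1)
#define POP(arg1) fprintf(codegenout, "\tpop\t%s\n", arg1)
#define MOV_FROM_IMMEDIATE(arg1, arg2) fprintf(codegenout, "\tmov\t$%d, %s\n", arg1, arg2)
#define ADD(arg1, arg2) fprintf(codegenout, "\tadd\t%s, %s\n", arg1, arg2)
#define SUB(arg1, arg2) fprintf(codegenout, "\tsub\t%s, %s\n", arg1, arg2)
#define IMUL(arg1, arg2) fprintf(codegenout, "\timul\t%s, %s\n", arg1, arg2)

/* Hypothetical, simplified expression tree (not the course's T_expr). */
typedef enum { X_INT, X_ADD, X_SUB, X_MUL } ExprKind;

typedef struct Expr {
  ExprKind kind;
  int value;                /* used when kind == X_INT */
  const struct Expr *left;  /* used by the binary kinds */
  const struct Expr *right;
} Expr;

/* Emit code that leaves the value of `e` on top of the stack. */
void codegen_sketch(const Expr *e) {
  if (e->kind == X_INT) {
    MOV_FROM_IMMEDIATE(e->value, "%rax");
    PUSH("%rax");
    return;
  }
  codegen_sketch(e->left);  /* left value is pushed first */
  codegen_sketch(e->right); /* right value ends up on top */
  POP("%rbx");              /* right operand */
  POP("%rax");              /* left operand */
  switch (e->kind) {
    case X_ADD: ADD("%rbx", "%rax"); break;  /* rax = rax + rbx */
    case X_SUB: SUB("%rbx", "%rax"); break;  /* rax = rax - rbx */
    case X_MUL: IMUL("%rbx", "%rax"); break; /* rax = rax * rbx */
    default: break;
  }
  PUSH("%rax"); /* every result goes back on the stack */
}
```

The pop order matters for subtraction: the right operand was pushed last, so it comes off first into %rbx, and the left operand lands in %rax, giving left minus right.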
Using GDB
Use gdb to step through your simplec output program. First, install it with
sudo apt install gdb
Clone and install peda, a useful gdb assistant. Make sure you have already compiled your simplec program as shown in "Using your compiler" above. Then step through the program like so:
gdb a.out
set disassembly-flavor att   # once inside of gdb
b main                       # set a breakpoint at main
run                          # start the program. it will wait at main
si                           # step through each assembly instruction
# continue stepping through to track the behavior
If you've downloaded and installed peda, you will see the assembly code, registers, and stack displayed after each step.