Project 4
Overview
In this project, implement support for pointer variables and for control-flow constructs.
This project only needs to implement a subset of the simplec language:
- there is only one level of indirection for pointers, no need for pointers to pointers.
- there is no need to support type casting.
- no need to support arrays
- only need to support and &&, or ||, not !
- equality == and less than < are bonus exercises
It is left as a challenge for bonus points to implement more features of the language, such as multiple levels of pointers, array allocation and indexing, etc.
Projects 3 and 4 can be done separately, except for the assignment statement, which supports both assignment to variables (project 3) and assignment to dereferenced variables (project 4).
Getting started
- Accept the GitHub classroom assignment in webcourses to create your git repository for the project.
- Submission will be via GitHub: commit and push all of your changes to your repository for the course GitHub assignment, which graders will use for grading.
In your virtual machine, clone your repository as described in lecture, replacing USERNAME with your GitHub username.
git clone https://github.com/cop3402/simplec-compiler-4-USERNAME
- Run make in your source folder (in your virtual machine) to build the project, which should create the simplec program.
  - If the build fails, double-check that you are in the repository directory and on a correctly-configured virtual machine.
- Run cat tests/example.simplec | ./simplec
  - Running ./simplec without providing input will cause the program to stop and wait for input on stdin. Use Ctrl-D to end the input (or Ctrl-C to terminate the program).
  - If ./simplec is not found, be sure to prefix the program with ./, and be sure the make step above succeeded.
Submitting your project
Submit your project by committing your changes and pushing them to your GitHub repository.
Using your compiler
# this builds your compiler
make
# this runs your compiler to compile a simplec program
cat if.simplec | ./simplec > if.s
# this assembles and links the simplec program into an executable
gcc if.s # produces a.out. use -o NAME to give your executable a better name
./a.out
echo $? # prints the return value of the program
Debugging your compiler
- Narrow down the problem by crafting a smaller version of the test case
- Go to the part of your code that is related to the problem
- Trace step-by-step what your code really does (not what you think/hope/feel/guess/divine/intuit/reckon it does)
Testing
Be sure to unit test as you go.
Use gdb to step through your simplec output program. First, install it with
sudo apt install gdb
Clone and install this useful gdb assistant called peda. Make sure you have already compiled your simplec program as shown in "Using your compiler" above. Then step through the program like so:
gdb a.out
set disassembly-flavor att # once inside of gdb
b main # set a breakpoint at main
run # start the program. it will wait at main
si # step through each assembly instruction
# continue stepping through to track the behavior
If you've downloaded and installed peda, you will see the assembly code, registers, and stack displayed after each step.
Examples
Examples of these constructs are available in the tests/ folder of the project repo.
Pointers
- pointer.simplec
- pointer.s
- pointer2.simplec
- pointer2.s
- pointer_arithmetic.simplec
- pointer_arithmetic.s
Control-flow
- if.simplec
- if.s
- ifelse.simplec
- ifelse.s
- while.simplec
- while.s
- while2.simplec
- while2.s
Code generation rules
Type cast
For this project, just pass through any value on the stack, i.e., just do codegen_expr(expr->castexpr.expr);.
It is left as a bonus challenge to implement different bitwidths for different types and do the appropriate conversion for the type cast.
Unary reference operator &
This operation gets the address of the variable rather than its contents. To do so, get the variable's offset, subtract it from the base pointer, and save the result in another register.
- only allow an identexpr to be referenced, i.e., make sure that the expr->unaryexpr.expr->kind is an E_identexpr
- save the base pointer to another register, e.g., rax
- like codegen_identexpr, lookup the identifier's offset in scope
- compute the address by subtracting the offset from the base pointer copy in rax
- save the result by pushing it onto the stack
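The steps above can be sketched in C as follows. This is only an illustration under stated assumptions: struct expr, lookup_offset, and codegen_addressof are hypothetical stand-ins rather than the project's actual AST, scope, or codegen API, and the sketch writes assembly into a buffer instead of to the compiler's real output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the project's AST types. */
enum expr_kind { E_identexpr, E_otherexpr };

struct expr {
    enum expr_kind kind;
    const char *name; /* identifier name, used when kind == E_identexpr */
};

/* Hypothetical scope lookup: returns the variable's offset from the
 * base pointer, or -1 if the identifier is unknown. */
int lookup_offset(const char *name) {
    if (strcmp(name, "x") == 0) return 8;
    if (strcmp(name, "y") == 0) return 16;
    return -1;
}

/* Writes the code for &operand into out. Returns 0 on success, or -1 if
 * the operand is not an identifier or is not found in scope. */
int codegen_addressof(const struct expr *operand, char *out, size_t size) {
    if (operand->kind != E_identexpr) return -1; /* only identifiers */
    int offset = lookup_offset(operand->name);
    if (offset < 0) return -1;
    snprintf(out, size,
             "\tmov\t%%rbp, %%rax\n"  /* copy the base pointer into rax */
             "\tsub\t$%d, %%rax\n"    /* subtract the variable's offset */
             "\tpush\t%%rax\n",       /* save the address on the stack */
             offset);
    return 0;
}
```

With the assumed offset of 8 for x, the sketch produces mov %rbp, %rax, then sub $8, %rax, then push %rax.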
Reading from dereferenced pointer with a unary dereference operator
- generate code for the expression in the dereference expression, expr->unaryexpr.expr
- retrieve the value by popping it from the stack
- use memory indirection addressing to mov the contents of the memory address computed in the dereference expression to a register, e.g., mov (%rax), %rbx
- save the result onto the stack
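As a rough sketch, the instructions emitted after the operand's code might look like the output of the function below. The name emit_deref_read is hypothetical, and the sketch assumes the operand's code has already run and left an address on top of the stack.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the instruction sequence for reading
 * through a pointer into out and returns the number of characters
 * written. Assumes the operand's code already pushed an address. */
int emit_deref_read(char *out, size_t size) {
    int n = snprintf(out, size,
                     "\tpop\t%%rax\n"          /* address computed by the operand */
                     "\tmov\t(%%rax), %%rbx\n" /* load the value stored at that address */
                     "\tpush\t%%rbx\n");       /* save the loaded value on the stack */
    assert(n >= 0 && (size_t)n < size);        /* the buffer must be large enough */
    return n;
}
```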
Assigning to dereferenced pointer in an assignment statement
- generate code for the deref on the left side of the assignment so that the address to store to (the pointer's value) ends up on the stack
- generate code for the right-hand side of the assignment
- pop the result of the right-hand side
- pop the result of the left-hand side
- move the right-hand side's value into the address pointed to by the left-hand side using memory indirect addressing, e.g., mov %rax, (%rbx)
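The pop order matters: the right-hand side was pushed last, so it comes off the stack first. The toy model below illustrates this in C. It is not part of the project; struct machine, push, pop, and store_through_pointer are hypothetical names, and a small array stands in for addressable memory.

```c
#include <assert.h>
#include <stdint.h>

/* A toy model of the runtime state the generated code works with. */
struct machine {
    int64_t stack[16];
    int sp;            /* number of values currently on the stack */
    int64_t memory[8]; /* stand-in for memory, indexed by "address" */
};

void push(struct machine *m, int64_t v) {
    assert(m->sp < 16);
    m->stack[m->sp++] = v;
}

int64_t pop(struct machine *m) {
    assert(m->sp > 0);
    return m->stack[--m->sp];
}

/* Models the steps for storing value through a pointer holding address. */
void store_through_pointer(struct machine *m, int64_t address, int64_t value) {
    push(m, address);      /* result of the left-hand side code */
    push(m, value);        /* result of the right-hand side code */
    int64_t rax = pop(m);  /* pop %rax: right-hand side value comes off first */
    int64_t rbx = pop(m);  /* pop %rbx: left-hand side address */
    assert(rbx >= 0 && rbx < 8);
    m->memory[rbx] = rax;  /* mov %rax, (%rbx) */
}
```

After the store, the stack is back to its original depth, which matches the expectation that an assignment statement leaves nothing behind on the stack.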
Control-flow constructs and Boolean operators
See Lecture 12
Relational operators
Operators: == and <
These work just like Boolean operators, except using different jump instructions.
See the reference to pick the right one. Note that the code may be easier to write by testing for the opposite case, e.g., to implement <, test for >= and jump when the condition does not hold (just like for if statements).
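One possible shape for this, sketched in C, is shown below. It is an assumption, not the pattern from lecture: the names enum relop, inverse_jump, and emit_relop are hypothetical, the label names are made up, and the choice to push 1 or 0 as the result is only one way to represent a truth value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical operator tags for the two relational operators. */
enum relop { OP_EQ, OP_LT };

/* Returns the jump that is taken when the comparison does NOT hold. */
const char *inverse_jump(enum relop op) {
    switch (op) {
    case OP_EQ: return "jne"; /* opposite of == */
    case OP_LT: return "jge"; /* opposite of <  (signed) */
    }
    return NULL;
}

/* Writes code that compares the two values on top of the stack and
 * pushes 1 if the relation holds, 0 otherwise. label keeps the
 * generated label names unique. Returns 0 on success, -1 on error. */
int emit_relop(enum relop op, int label, char *out, size_t size) {
    const char *jump = inverse_jump(op);
    if (jump == NULL) return -1;
    snprintf(out, size,
             "\tpop\t%%rbx\n"          /* right operand (pushed last) */
             "\tpop\t%%rax\n"          /* left operand */
             "\tcmp\t%%rbx, %%rax\n"   /* sets flags from rax - rbx */
             "\t%s\t.Lfalse%d\n"       /* jump when the relation fails */
             "\tpush\t$1\n"
             "\tjmp\t.Lend%d\n"
             ".Lfalse%d:\n"
             "\tpush\t$0\n"
             ".Lend%d:\n",
             jump, label, label, label, label);
    return 0;
}
```

In AT&T syntax, cmp %rbx, %rax compares rax against rbx, so jge is taken when the left operand is greater than or equal to the right one, which is exactly when < does not hold.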