COP 3402 meeting -*- Outline -*-

* Code Generation

     We'll concentrate on generating code for the SSM,
     as that is what we are using in class this semester

** overview
     These notes are based on
     Appel's book "Modern Compiler Implementation in Java",
     chapter 7 (Cambridge, 2002).

------------------------------------------
         OVERVIEW OF CODE GENERATION

 .. ASTs...-> [ Static Analysis ]
                     |
                     | IR
                     v
              [ Code Generation ]
                     |
                     | Machine Code
                     |
                     v
               SSM Virtual Machine
                  Execution

The IR (= Intermediate Representation)
    records


------------------------------------------
         ... information from static analysis
             including attributes of names used

*** IR (Intermediate Representation)

         We're going to focus on the translation from IR to Machine Code
         (circle that) and the differences between ASTs and IR

         Q: What kind of information is needed from a name's use
            in order to generate code?
            Its lexical address

         Q: Should the parser create the lexical address of a name's use
            during paring?
            No, that needs information that is more readily available
            during static analysis (from the symbol table).

         Q: Is the symbol table unchanging (immutable)?
            No, it is updated as scopes are entered and left...
            So is it convenient to recreate during each pass?
            No, want to store the information for each name in the IR


------------------------------------------
            IR TREES

An IR is a tree structure,


Helps in modularizing compilers
and code generation

   WITHOUT IR               WITH IR

Java ------>x86       Java        ->x86
     \/ |||                \     /
   C ------>MIPS         C \\   /-->MIPS
     \\/ ||                 ->IR
 C++ ------>Sparc      C++ //   \-->Sparc
      \\\/                 /     \
  C# ------>A1          C#/       \>A1

------------------------------------------
    ... somewhat like an AST,
        but a kind of abstract machine code,
        with information needed
        for code generation

    ... draw lines on left from each language to each machine
        and on right from each language to IR and from IR to each machine

        The advantage is that with the IR, each language compiles to
        the same IR and there only has to be one code generator built
        for each machine architecture

------------------------------------------
          OUR CHOICES FOR AN IR

To keep things simple, we will use
   a modified AST type as an IR

Parser:
   - records
   - provides


Static analysis:
   - records


------------------------------------------
    Making the IR like the ASTs puts more work on the code generator,
     an IR that is more like Machine Code would be translated partway
     by the static analysis phase
     (or that idea could be used internally in code generation)

    ... structure of programs
    ... placeholders for attributes
        needed for code generation

    ... attributes of names used (after finding them)
          in the IR
        We add id_use pointers to all places in the AST where an
          identifier can be used
            (or declared, since the attributes are also in the id_use d.s.)
            

*** General strategy
------------------------------------------
   GENERAL STRATEGY FOR CODE GENERATION

Don't try to optimize!


Follow the grammar of


Think about the invariants!


Trust the recursion!

------------------------------------------
      ... (there would be a separate pass for that)
         instead of optimizing,
         look for simplest translation that could work

      ...
        the ASTs (abstract syntax)
        doing a tree walk,
        generating IRs

        since our IR is basically like ASTs with
          some symbol table information
          we'll do this in scope checking.

      ...
        invariants are properties that should always be true.

        1. after expression evaluation, the size of the stack has
        increased by just 1 word and the "top" of the stack has the
        expression's result.

        2. Every AR has a specified structure

        Trust the recursion!
              but keep the recursion simple
------------------------------------------
       FOLLOWING THE GRAMMAR

Code resembles the grammar that


When


------------------------------------------
     ...  describes the input data
          (in this case, ASTs)

     ... the grammar is recursive,  (i.e., when the data structure is recursive)
         the code is recursive

     ... the grammar has alternatives,
         the code has conditionals (or a switch)

     Q: How does this relate to the parser?
        Our code for parsers followed the grammar in this way.
     Q: Why is this useful?
        - can see that all possible inputs are covered (all cases)
        - coding responsibilities are clear
          (e.g., which functions process which inputs)

** Translation target: code sequences
------------------------------------------
         TARGET: CODE SEQUENCES

Need lists of machine code

Why?


------------------------------------------
        ... instructions

        ... because most constructs in the language
            need more than one instruction to accomplish
            (and that is a good thing!)

------------------------------------------
           THE CODE TYPE

// file code.h
#include "instruction.h"

// machine code instructions
typedef struct code_s {
    struct code_s *next;
    bin_instr_t instr;
} code;

// Code creation functions below
// with the named mnemonic and parameters
extern code *code_nop();
extern code *code_add(reg_num_type t,
                      offset_type ot,
                      reg_num_type s,
                      offset_type os);
extern code *code_sub(reg_num_type t,
                      offset_type ot,
		      reg_num_type s,
                      offset_type os);
// ...

------------------------------------------
        The bin_instr_t type is from the VM implementation's
             instruction module

        There are functions to create code structs and allocate them
        on the heap, for every instruction in the SSM's ISA

        Q: Why use that instead of just bin_instr_t?
           So we can put code objects in lists...

------------------------------------------
     REPRESENTING CODE SEQUENCES IN C

// file code_seq.h
#include "code.h"

// code sequences
typedef code *code_seq;

extern code_seq code_seq_empty();

extern code_seq code_seq_singleton(
                                 code *c);

extern bool code_seq_is_empty(
                            code_seq seq);

// Requires: !code_seq_is_empty(seq)
// Return the first element...
extern code *code_seq_first(code_seq seq);

// Requires: !code_seq_is_empty(seq)
// Return the rest of the given sequence
extern code_seq code_seq_rest(
                            code_seq seq);

// Return the size (number of words)
extern unsigned int code_seq_size(
                            code_seq seq);

// ...

// Requires: c != NULL && seq != NULL
// Modify seq to add the given
//   code *c added to its end
extern void code_seq_add_to_end(
                 code_seq *seq, code *c);

// Requires: s1 != NULL && s2 != NULL
// Modifies s1 to be the concatenation
// of s1 followed by s2
extern void code_seq_concat(code_seq *s1,
                            code_seq s2);

// ...
------------------------------------------
        Q: Why are code sequences needed?
        To be run by the VM, executing one instruction after another
        (and this is sequential execution)

        I'm using a linked list to represent code sequences,
           but other data structures would be possible.

           typedef struct {
               code *first;
               code *last;
           } code_seq;


** Designing Code Sequences
*** Overall strategies
------------------------------------------
 STRATEGIES FOR DESIGNING CODE SEQUENCES

Work backwards


------------------------------------------
     ... Work backwards,
        starting with the ultimate instruction
             you want to use,
        and then figure out how to produce that
        and code that prepares for that
        and then (optionally)
           code to compensate for that

------------------------------------------
       EXAMPLE: EXPRESSION EVALUATION

Example: (E1 + E2) - (E3 / E4).


Constraints:
 - Expressions have a result value
 - Binary operations (+, -, *, /)
   in the SSM


Where should the result be stored?

  Can it be a register?


------------------------------------------
... need one operand on the stack

    Suppose we try to always keep the result of an expression, E1, in a register,
       that means we reserve one register, say r1,
       for the result of E1, and r1 can't be used for other expression
       evaluation, since that would destroy E1's value (in r1).

... no, there are only a finite number of registers
    (7 in the SSM),
    and an expression (like E2 op E3)
    can have arbitrarily many subexpressions.

    So can't reserve 1 register per subexpression

    Q: Can expressions have (side-)effects in the programming language?
       No! (not in SPL)
    Q: Can the order of evaluating expressions matter for the language?
       No! (not in SPL)
       So, we can evaluate them in any order.
       
    Thus, we want to use the runtime stack to store expression values.
    (So no register is ever reserved for an expression's value.)
    Every expression's result goes on the stack,
        e.g., to evaluate E2 op E3:
               [evaluate E3, pushing its value, v3, on the stack]
               [evaluate E2, pushing its value, v2, on the stack]
               [instruction to compute v2 op v3,
                putting its value on top of stack]
       Since this stack-based evaluation
           reserves no registers for expression values,
           it can evaluate arbitrarily complex expressions.

    Q: What would a compiler do for a register-based ISA?
       pop the values off the stack, put them in registers,
       do the computation, storing the result in a register,
       then push the result onto the top of the stack.

    (Another alternative, used in production compilers and LLVM,
       imagine there are an infinite number of registers,
          r1, r2, r3, ...
       evaluate each expression, storing its result in a (reserved) register.
       Suppose this evaluation uses M registers (one for each subexpression):
               and the machine has N registers available
       Then:
          - reserve 2 registers for binary operation evaluation (say r1 and r2),
          - assign the remaining N-2 registers to the first N-2 subexpressions,
             and for the rest use the stack, as above.)
       (There can be various ways to do this, e.g.,
         if there are constants, keep those where they are in storage
         until needed for binary operations.
         A compiler may have a separate pass (walk over the IR) to do this.)

------------------------------------------
         ADDRESSING VARIABLES

Consider an expression

          x

where x is a variable

How to get x's value on the top of stack?


------------------------------------------

     ... want to use CPW instruction
             so we need to use the AR's base address + an offset
         Q: Would this be different for constants?
            No, same as variables!
            
         How do we get the offset?
             From the identifer's offset in its AR
                    (available in the id_attrs
                      and we laid out the AR so that FP points to offset 0)
         Why do we want the AR's base address?
             that is where offets are computed from
         Where does the AR's base address need to be stored?
             in a register, as that's where CPW uses it from
           Which register?
             we should pick one, but it can't be FP, SP, or GP,
             (so, arbitrarily) let's use $r3 to start with.
         How do we compute the AR's base address?
                (in general, in SPL this is needed for nested blocks)
             Use the number of levels out from x's id_use
             start with the FP, that's base for 0 levels out,
                fetch the static link from each AR
                    for the number of levels out
             but we also need to decide what base register to use
               for this process. Let's say we'll use $r3 for now...

*** use of registers (omit)
------------------------------------------
 USE OF REGISTERS IN A REGISTER-BASED ISA

For a register-based ISA:

What if the target register
is already in use?
   e.g., in   x := y + z

Strategies:
 - use a different register


 - save and restore


------------------------------------------
     ... but will eventually run out of registers,
         so using a different register only works a bit

     ... save the register's value for when will need it later,
         after the other use is done, restore it, and continue
           (This works in general, as no code ever reserves
            a register)

*** strategy for expression evaluation
------------------------------------------
    GENERAL STRATEGY FOR EXPRESSIONS

Each expression's value goes


To operate on an expression's value


------------------------------------------
        ... on top of the runtime stack

        use the code module's function
            use instructions from the SSM

        ... may need to copy it to
              the top of the stack

*** Background on SSM instructions
------------------------------------------
      BACKGROUND: SSM INSTRUCTIONS

 ADD t,ot,s,os
            "M[GPR[t]+ot]
              = M[GPR[SP]] + M[GPR[s]+os]"

 SUB t,ot,s,os
            "M[GPR[t]+ot]
              = M[GPR[SP]] - M[GPR[s]+os]"

 MUL s,o    "(HI,LO)
             = M[GPR[SP]] * M[GPR[s]+ o]"

 DIV s,o    "HI
             = M[GPR[SP]] % M[GPR[s] + o]"
            and
            "LO
             = M[GPR[SP]] / M[GPR[s] + o]"

 CPW t,ot,s,os
            "M[GPR[t+ot] = M[GPR[s]+os]"

 CPR t,s    "GPR[t] = GPR[s]"

 ADDI r,o,i "M[GPR[r]+o]
             = M[GPR[r]+o] + sgnExt(i)"

What limitations on immediate operands?


What if the literal doesn't fit?


------------------------------------------
        Q: What does M[GPR[SP]] mean?
            the contents of the memory that the SP register points to
              (i.e., the top value on the runtime stack).

        Q: If the numbers are small enough, where is the result of a
           multiplication as a 32 bit integer located: HI or LO?
           in LO
           
        Q: How would you copy the value into location GPR[t]+ot
                from location GPR[s]+os?
           use CPW t,ot,s,os

        Q: Are there limitations on the immediate operands for ADDI?
           Yes immediate operands must fit into 16 bits
               (a short int in C, see table 2 on p. 8 of the SSM Manual),
           and since it's in 2's compliment format for the SRM
                it must be between -65536 and 65535 (inclusive).

        Q: What can you do if you want a constant value that doesn't fit?
               you can save it as global data (in the BOF) and copy it,

               so that is what we do in general, with the literal
               table.

** Literal Table
------------------------------------------
            LITERAL TABLE IDEA

- Store literal values in


- Keep mapping from


- Initialize


------------------------------------------
       ... data section (above GP) of BOF

       literal's text or value to a (word) offset in data section

       ... output code to initialize the data section
                       from the literal table.
           Those data go in the memory above GP
           from the BOF's data section (before running code)

        The idea is useful for data that doesn't fit into an immediate
        operand in the instructions needed
------------------------------------------
  LITERAL TABLE IN EXPRESSION EVALUATION

Idea for code for numeric expression, N:

    1. Look up N in literal table,
    2. Receive N's 


    3. generate a copy (CPW) instruction
         to copy that to 


------------------------------------------
     Q: What's our goal for expression code?
        Get the value onto the top of the runtime stack

     ... word offset (from $gp)
         in the global data section,
         call this offset

     ... the top of the stack (address in $sp)
           CPW $sp, 0, $gp, offset
         this can be done by:
           code_cpw(SP, 0, GP, offset)

------------------------------------------
    LITERAL TABLE AND BOF DATA SECTION

How to get the literals into memory
   with the assumed offsets?


------------------------------------------
        ... put them in the BOF file's data section
              in increasing order of the (word) offsets
            use bof_write_word() for each literal (in offset order).

** Activation Record (AR) Layout
    need to do this so can know how to address constants and variables
        in an AR.
        This is a convention, but code generation will always use it.

    Q: Where should constants and variables for a block be stored?
       on the runtime stack, so can handle recursion.

------------------------------------------
   LAYOUT OF AN ACTIVATION RECORD

Must save SP, FP, static link, RA
   and register $r3

Can't have offset of static link
    at a varying offset from FP

Layout 1:
                                   offset
  FP --> [  saved     SP          ] 0
         [  registers FP          ]-1
         [            static link ]-2
         [            RA          ]-3
         [ local constants        ]-4
         [      ...               ]
         [ local variables        ]
         [      ...               ]
         [ temporary storage      ]
   SP -->[      ...               ]


Layout 2:
                                   offset
         [      ...               ]
         [ local variables        ]
         [      ...               ]
   FP -->[ local constants        ] 0
         [  saved     SP          ]-1
         [  registers FP          ]-2
         [            static link ]-3
         [            RA          ]-4
         [ temporary storage      ]
   SP -->[      ...               ]

                                   offset
Layout 3:
         [ saved     SP           ] 4
         [ registers FP           ] 3
         [           static link  ] 2
         [           RA           ] 1
   FP -->[ local constants        ] 0
         [      ...               ]
         [ local variables        ]
         [      ...               ]
         [ temporary storage      ]
   SP -->[      ...               ]

Advantages of layout 1:


Advantages of layout 2:


Advantages of layout 3:


------------------------------------------
      Remember that the stack grows down towards lower addresses!

      Q: Why can't we put the constants and variables in front of
         the saved registers as an alternative to layout 1?
         Becuase the static link has to be at a statically known
         offset that is the same for all ARs!
      
      Q: What does the FP register address?
          a well-known place in each AR (see the different layouts)
          
       For simplicity, assume that offsets are determined by
       declaration order,
       Note that offsets are in numbers of words

       Q: What are the advantages of layout 1?
       ... - straightforward, fixed size subtracted
              from (negated) offset in symbol table
           - tracing easy for the VM,
              as it can show memory between FP and SP

      Q: What are the advantages of layout 2?
           - simplified offset calculations
              for declared constants and variables
           - variable addresses grow upwards (with positive offsets)
               (this would be better for arrays, although none in SPL)
                and that corresponds to a programmer's notions about layout
                  (e.g., C overflows of arrays would work as expected)
           - it could be easily expanded to save more registers
               (but that's may not be useful in the SSM)
           - offsets for most things are smaller than with layout 1
               (in absolute value) Why is that an advantage?
         How should VM do tracing?
           show everything between original FP base (from BOF file)
               and SP (the whole stack), which is good for nested scopes

      Q: Any disadvantages?
          The tracing already in HW1's VM would not show the constants
          and variables declared in the current AR
          (but that could be adjusted by showing everything from the
          original FP to the current SP)

      Q: What are the advantages of layout 3?
           - simplified offset calculations
              for declared constants and variables (just negate the offset)
           - it could be easily expanded to save more registers
               (but that's may not be useful in the SSM)
           - offsets for most things are smaller than with layout 1
               (in absolute value)
         How should VM do tracing?
           it's already mostly good,
           but it wouldn't show the saved registers
        How should VM do tracing?
           show everything between original FP base (from BOF file)
               and SP (the whole stack), which is good for nested scopes

      Q: Any disadvantages?
           - array element calculations would need to be negated
               (but no arrays in SPL,
                but if had arrays, would need to check for overflows,
                but that is good for security anyway)

      Q: Which layout should we use?
          I'll use layout 2 for lectures,
          (and the VM provided for HW4 has its tracing adjusted to show more)

** Declarations
   Q: Where are constants and variables stored?
     On the runtime stack, in the current activation's stack frame,
     (Should we want to do something different for global ones,
       like use the data section? -- no, simplest not to,
       but that could be an optimization.
       Or, all constants could be in the data section as an optimization.)

------------------------------------------
 TRANSLATION SCHEME FOR SPL DECLARATIONS

   const c = n;
   var x;

When do blocks start executing?


What should be done then?


How do we know how much space to allocate?


How to initialize constants?


How to initialize variables?


------------------------------------------
      Q: When are blocks executed in SPL?
         when the running procedure (or the main program)
         starts executing them

      Q: When starting to execute a block, what should be done?
         [allocate and initialize
          all declared variables and constants]
         [save the the necessary registers
          (SP, FP, static link, RA)]

      Q: Which should be allocated first: constants or variables?
         Want them in reverse order of declaration,
           so that the offsets saved in scope checking work,
           so the variables first, then the constants!

      Q: How do we know how much space to allocate?
         Have each declaration's code allocate its own space,
         and then process each declaration in sequence,
         so don't really need to know the total space
         (but that could be an optimization)

      Q: How to initialize constants?
           use the literal table's offset for the literal
           from the $gp register, store that into the stack.
           I.e., for a const_def of the form x = L, where L is at offset ofst
           from $gp
               [allocate one word on the stack]
               [L's value is copied to top of stack]
           that would be the following sequence:
                (SRI $sp, 1)
                (CPW $sp, 0, $gp, ofst)

      Q: How to compute the value of the constant?
         we'll always use the literal table, as that will always work
         (For a language with constant expressions, as in C/C++,
          static analysis would check the restrictions on such expressions,
          then in code generation, evaluate the expression
          and save the result as if it were a literal)

      Q: How to initialize variables?
           variables are always initialized to 0, so can use:
           [allocate one word on the stack]
           [zero out the top of the stack]
         this would be:
             (SRI $sp, 1)
             (LIT $sp, 0, 0)
         (Other code sequences might be faster,
         and could be used in optimization)

** Compiling Expressions
*** deciding where to start

    The best way to start is to handle the simplest cases first
        and expressions are already some of the simplest parts of the grammar

    Q: What are the simplest cases for expressions?
        literals and then variables
        (which we already discussed)

*** numeric literals
------------------------------------------
 TRANSLATION SCHEME FOR NUMERIC LITERALS


------------------------------------------

  ...

    - we will always use the literal table,
      suppose we do literal_table_lookup
      and it returns the offset of ofst
    - want to put it on top of the stack, so
        [allocate a word on the stack] 
        [copy value to top of stack]
      that can be as follows, if ofst is the literal's offset:
        (SRI $sp, 1)
        (CPW $sp, 0, $gp, ofst)

      we could optimize this if it fits into an immediate operand
         [allocate a stack location]
         [write the value into it]
       but we assume that a (peephole) optimizer does that,
          so we don't bother in code generation
          
      How much would that "optimization" save?
         (1 instruction, so 1 machine cycle)
     
    Does this mean we need to track offsets in the global data?
        Yes, that is the job of the literal table

*** variables as expressions

------------------------------------------
  TRANSLATION SCHEME FOR VARIABLE NAMES
           (AND CONSTANTS)


------------------------------------------

    want to use CPW to push the value onto the stack,
    but we need a base + offset for that
    - we will use $t3 as a frame register, so as not to disturb $fp

    # suppose lexical address of x is (levelsOut, ofst)

    # get base of x's stack frame into $r3
    [load FP into $r3 (??? $r3,$fp)]   
    [load next static link in $r3    }
      i.e., (LWR $r3, $t3, -2)]      }
      ..                             }     "levelsOut times"
    [load next static link in $r3]   }

    [push x's value using $r3 as base of its AR
      i.e., (CPW $sp, 0, $r3, ofst)]
    
    Q: How to load FP into $r3?
       Can write FP onto the stack, then read it into $r3.
       Use CPR 3, $fp

    Q: How would you generate code
       to repeat the loading of the next static link levelsOut times?
         Use a for loop in the compiler to build a code sequence
         That is done by code_utils_compute_fp
              (look at that in the provided code_utils.c file)


*** binary operator expressions
        binary operator expressions have subexpressions (recursive)
------------------------------------------
        TRANSLATING EXPRESSIONS

Abstract syntax of expressions in SPL

  E ::= E1 o E2 | x | n
  o ::= + | - | * | /


Simplest cases are:


------------------------------------------
     ... numeric literals (n)
         and variable and constant names (x, c)

       already discussed above,
       idea: evaluate the subexpressions onto the stack,
       operate on them with the appropriate instruction,
       putting the result on top of the stack
       
    Q: So, for E1 - E2 what needs to be done?
       [code to evaluate E2 onto the top of the stack] # recursive!
       [code to evaluate E1 onto the top of the stack] # recursive!
       [SUB $sp, 0, $sp, 1]
    Q: Why can we evaluate E2 first?
       Because there are no (side-)effects in expression in the language.

** Statements
*** Basic Statements

   Q: What are the base cases in the grammar for statements?
   I.e., what statements don't contain other statements?

     - (an empty block statement)
     - assignment
     - read
     - print
     - call  (we will not require this in hw4)

------------------------------------------
 TRANSLATION SCHEME FOR BASIC STATEMENTS


    begin end


    x := E


    read x


    print E


------------------------------------------
            ... the static link for a block will be the current FP
                 so
                   [(CPR, 3, FP)]
                   code_utils_save_registers_for_AR()
                   code_utils_restore_registers_from_AR()
                 Do we need a NOP instruction in this code sequence? No...
                   
            ... x := E
                  (Suppose lexical address of x is (levelsOut,ofst),
                     that information should be in the AST,
                     so need the scope_check module to put an id_use
                     in x's AST.)
                  (Overall we put E's value on the top of the stack,
                   and then save that using $r3 for the frame pointer
                   so it's:)

                [code to eval E onto the top of the stack]
                [code to load base of x's scope's frame into $r3]
                [CPW $r3, ofst, $sp, 0] # store E's value into x

                Does it need to be in that order?  Yes, consider x := y+1

            Q: For testing, want to know: What are the simplest cases?
                x is declared in the current scope (0 levels out)
                E is a literal
                   (so implement that case for expressions,
                    using the literal table)

            Q: In general, can the "levels outwards" part of
                the lexical address be determined when the variable
                is declared?
                No, it depends on the nesting of the use,
                  but can get it from the variable's id_use,
                  so the scope_check function will put that in the ASTs
                  (Thus the ASTs will have that compared to HW3)

            Q: Does the same thing work for constants?
               Yes, they are just initialized variables that don't change

            ... read x
                  (Suppose x is a variable at lexical address (lvls, ofst)

                [code to load x's scope's frame pointer into $r3]
                [RCH $r3, ofst]

            ... print E
                Q: Should we write a character with code E or the digits of E?
                    Probably want both, and we added a
                         PINT instruction to VM to print integer value
                            (and PFLT to print a float value in FSRM)
                    Should also add a new kind of statement, say writeChar E
                         to SPL (and FLOAT) to write a single character

                    For now we will leave PINT as writing an integer value,
                     We want the result of E to be on top of the stack
                     So we would use a design like:

                   [code to eval E onto top of the stack]
                   [allocate 1 word to hold the result of PINT]
                   [PINT $sp, 0]
                   [deallocate the 2 allocated words]

            Another basic statement is
                 call p
               (but for hw4 we aren't handling that,
                and in any case we would worry about procedures later)

** Conditions
*** Overall conditions

    Conditions are somewhat like expressions,
      and can contain expressions,
      so can't reserve registers for their values
    Instead, like expressions, they should always store
      their (truth) value
      on top of the runtime stack

    How should we represent truth values?
      I'll use 1 for true and 0 for false
      
------------------------------------------
        GRAMMAR FOR CONDITIONS

<condition> ::= divisible <expr> by <expr>
              | <expr> <rel-op> <expr>
<rel-op> ::= == | != | < | <= | > | >=

So the recursion structure of the code is?


Code looks like:


------------------------------------------
          // return a code sequence to put the truth value
          // (1 for true, 0 for false) of the condition on top of stack
          gen_code_condition(condition_t cond)
             switch (cond.cond_kind)
              and call either
                  gen_code_db_condition(cond.data.db_cond)
               or gen_code_rel_op_condition(cond.data.rel_op_cond)

         Q: What should these functions return?
            code sequences (code_seq)
            Write the code for gen_code_condition...

code_seq gen_code_condition(condition_t cond)
{
    switch (cond.cond_kind) {
    case ck_db:
	return gen_code_db_condition(cond.data.db_cond);
	break;
    case ck_rel:
	return gen_code_rel_op_condition(cond.data.rel_op_cond);
	break;
    default:
	bail_with_error(/* ... */);
	break;
    }
    // never happens, but suppresses a warning from gcc
    return code_seq_empty();
}
  
**** Relational operator conditions
------------------------------------------
     RELATIONAL OPERATOR CONDITIONS

<condition> ::= <expr> <rel-op> <expr>

A design for rel-op conditions:

 Goal: put true of false on top of stack
       for the value of the condition

 One case for each condition:

 Consider case op is !=

  [Evaluate E2 to top of stack]
  [Evaluate E1 to top of stack]
  # What does the stack look like? (1)
  # jump ahead 3 instrs,
  # if memory[GPR[$sp]]
  #              != memory[GPR[$sp]+1]
  BNE $sp, 1, 3
  # put 0 (false) at SP+1
  LIT $sp, 1, 0
  # jump over next instr
  JREL 2
  # put 1 (true) at SP+1
  # What does the stack look like (2)?
  # deallocate one word from stack
  ARI $sp, 1
  # now top of stack has truth value

 Consider E1 >= E2
  [Evaluate E2 to top of stack]
  [Evaluate E1 to top of stack]
  # What does the stack look like? (3)
  SUB $sp, 0, $sp, 1  # SP = E1 - E2
  # jump ahead 3 instrs, if geq
  BGEZ $sp, 1, 3      # skip 2 instrs
  # put 0 (false) at SP+1
  LIT $sp, 1, 0
  # jump over next instr
  JREL 2
  # put 1 (true) at SP+1
  LIT $sp, 1, 1
  # What does the stack look like (4)?
  # deallocate one word from stack
  ARI $sp, 1
  # now top of stack has truth value
------------------------------------------
          explain all of this.
             Note that E1 >= E2
                is true just when E1 - E2 >= 0
                    (subtract E2 from both sides)

          Q: What would work for ==?
               Use BEQ instead of BNE

          Q: What would you do for < ?
               use BLTZ instead of BGEZ,
               similarly for <= and >.

------------------------------------------
     CODE FOR BINARY RELOP CONDITIONS

// file ast.h
typedef struct {
    file_location *file_loc;
    AST_type type_tag;
    expr_t expr1;
    token_t rel_op;
    expr_t expr2;
} rel_op_condition_t;


// file gen_code.c

// Generate code for cond,
// putting its truth value
// on top of the runtime stack
// May also modify SP,HI,LO, and $r3
code_seq gen_code_rel_op_condition(
              rel_op_condition_t cond)
{


}

------------------------------------------
    ...

    [code to push E2's value on the stack]
    [code to push E1's value on the stack]
    [code to push the truth value of E1 rel_op E2
     on top of stack, as above]

    which can be generated as

    // push expr2's value on the stack
    code_seq ret = gen_code_expr(cond.expr2);
    // push expr1's value on the stack
    code_seq_concat(&ret, gen_code_expr(cond.expr1));
    // put the truth value of the relationship on top of the stack
    code_seq_concat(&ret, gen_code_rel_op(cond.rel_op));
    return ret;
    

** Control Flow Statements (Compound Statements)
     These are the compound statements
     Q: Why is it useful to write the base cases first?
        Testing is easier

------------------------------------------
 ABSTRACT SYNTAX FOR COMPOUND STATEMENTS

S ::= begin S*
    | if C S1* S2*
    | while C S*

So what is the code structure?

   based on the recursive structure of the ASTs
   (like the unparser)

Source and generated code look like:


  begin S1 S2 ... end


           [NOP]  # in case the list is empty
           [code for S1]
           [code for S2] # concat them all!


  if C S1*

           [code to push C's truth value on stack]
           code_seq thenstmts = gen_code_stmts(S1*);
           # skip around S1* if C is false
           # What does the stack look like? (1)
           [LIT $sp, -1, 0]
           [BEQ $sp, -1, length(thenstmts)+2]
           [code for S1*]
           [deallocate the condition's value]

  if C S1* S2*

          [code to push C's truth value on stack]
           code_seq thenstmts = gen_code_stmts(S1*);
           code_seq elsestmts;
           if (/* has else part */) {
                    elsestmts = [NOP]
           } else { elsestmts = gen_code_stmts(S2*); }
           # skip around S1* if C is false
           # What does the stack look like? (1)
           [LIT $sp, -1, 0]
           [BEQ $sp, -1, length(thenstmts)+2]
           [code for S1*]
           # skip else part (finish)
           [JREL length(elsestmts)+1]  
           [code for S2*, elsestmts]
           [deallocate the condition's value]


  while C S

 cond:     [code to push C's truth value on top of stack]
           [LIT $sp, -1, 0]
           [BEQ $sp, -1, [length(S*)+1]] # goto exitLoop if false
           [deallocate the truth value from the stack (ARI, $sp, 1)]
           [code for S*]

           JREL -(length(S*)+length(C)+3) #jump back (goto cond)
 exitLoop: [deallocate the truth value from the stack (ARI, $sp, 1)]


------------------------------------------
       ... # begin S1 S2 ...
           [NOP]  # in case the list is empty
           [code for S1]
           [code for S2] # concat them all!
           ...

       ... if C S1* S2*
           [code to push C's truth value on stack]
           code_seq thenstmts = gen_code_stmts(S1*);
           code_seq elsestmts;
           if (/* has else part */) {
                    elsestmts = [NOP]
           } else { elsestmts = gen_code_stmts(S2*); }
           # skip around S1* if C is false
           # What does the stack look like? (1)
           [LIT $sp, -1, 0]
           [BEQ $sp, -1, length(thenstmts)+2]
           [code for S1*]
           # skip else part (finish)
           [JREL length(elsestmts)+1]  
           [code for S2*, elsestmts]
           [deallocate the condition's value]

           Why add 2 to length of S1*?
              to jump past the instruction that
                   skips over the else part

           Note this requires the computation of the code
             sequences for S1* and S2* first,
             so the compiler "knows" how long they are

       ... # while C do S*
 cond:     [code to push C's truth value on top of stack]
           [LIT $sp, -1, 0]
           [BEQ $sp, -1, [length(S*)+1]] # goto exitLoop if false
           [deallocate the truth value from the stack (ARI, $sp, 1)]
           [code for S*]

           JREL -(length(S*)+length(C)+3) #jump back (goto cond)
 exitLoop: [deallocate the truth value from the stack (ARI, $sp, 1)]

           Q: Why deallocate the truth value in the loop and at the end?
              (since the condition allocates a word,
               the stack would keep growing if not deallocated each time,
               and when the loop exits that word also needs to be deallocated)