SUPPORTING PROCEDURES AND CALLS

Main issues:
   - storing their code
     Why? - not executed until called


   - knowing exactly where each starts
     Why?  - because instruction needs address


Another issue:
   - sending the right static link
       to the procedure (in $a0)


       WHERE TO PUT PROCEDURE CODE?

Possible layouts in VM's code array:

1. [code for procedures]
   [code for main program]

2. [code for main program]
   [code for procedures]

Implementation idea:
   A. track starting address of each procedure
       (as an attribute)
   B. Procedure code is written out at end
       of code generation, so adjust addresses
       of call instructions then

Advantages of layout 1:
  + offsets of the procedures are their addresses
  + if there are no procedures, can still test

Disadvantage of layout 1:
  - if file format doesn't allow specification
    of where main program starts, need to put
    in a jump instruction to jump around procs

Disadvantage of layout 2:
  - need to adjust offsets of procedures
    by adding size of main program
    (but only known at end of compilation)
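
The adjustment for layout 2 can be sketched as below. This is only
an illustration: the helper name adjust_for_layout2 and the idea of
keeping procedure addresses in a plain array are assumptions, not
part of the VM described in these notes.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper for layout 2: turn procedure offsets
// (relative to the start of the procedure code) into absolute
// addresses by adding the size of the main program.
// main_size is only known once the main program is compiled.
void adjust_for_layout2(unsigned int *proc_addrs, size_t n,
                        unsigned int main_size)
{
    assert(proc_addrs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        proc_addrs[i] += main_size;
    }
}
```

With layout 1 and no jump instruction, no such adjustment is
needed, since each offset already is the address.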


      NESTED PROCEDURES ARE A PROBLEM

  procedure A;
    procedure B;
      begin # B's body code...
            call A # ...
            # ...
      end
  begin
     # A's body code
     call B # ...
     # ...
  end

If we lay out the code as

   [ code for A ]
   [ code for B ]

How do we know the address of B
    to compile the call to B?


What about the other direction?
   also a problem


   RECURSIVE PROCEDURES, SIMILAR PROBLEM

  procedure R;
    begin
      # R's body code ...
      call R
      # ...
    end

Before storing code for R,
  how do we know where it starts?

   Hard to do that...


        MUTUAL RECURSION
        
  procedure O;
    begin # O's body code...
      call E
      # ...
    end

  procedure E;
    begin
      # E's body code ...
      call O
      # ...
    end

One of these must come before the other in
  the code area of the VM...

   so the call address won't be known


       SOLUTION STRATEGIES FOR CALLS

[Multiple passes]:
  1. Generate code for each procedure
     (+ store offsets in symbol table,
      + layout procedure code in memory)
  2. Gather table of addresses
     (map from names to addresses,
      using offsets and beginning address)
  3. Patch up code addresses for calls
     (+ output code)

[Lazy evaluation, labels]:
  1. Generate code for each procedure
     with calls to labels
     (+ store or update
        labels in symbol table)
  (+ output code)

      GENERAL SOLUTION: MULTIPLE PASSES

Problem: where does each procedure start?

Solution idea:
  1. Compile all procedure code
     (now know how big each procedure is)
  2. Lay out procedure code in memory
     (now know where each starts)
  3. Change each call instruction
     (to use the callee's starting address)
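
Steps 2 and 3 might look like the sketch below. The IR here is
hypothetical: the types opcode and instr, and the functions
layout_procs and patch_calls, are invented for illustration. It
assumes each call instruction temporarily stores the callee's index
in a procedure table until its address is known.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical IR: a call first holds the callee's index in
// the procedure table; pass 3 overwrites it with an address.
typedef enum { OP_OTHER, OP_CALL } opcode;
typedef struct { opcode op; unsigned int target; } instr;

// Pass 2: given each procedure's size in words, compute where
// each one starts when laid out contiguously from base.
void layout_procs(const unsigned int *sizes, size_t n,
                  unsigned int base, unsigned int *starts)
{
    unsigned int next = base;
    for (size_t i = 0; i < n; i++) {
        starts[i] = next;
        next += sizes[i];
    }
}

// Pass 3: replace each call's procedure index by its address.
void patch_calls(instr *code, size_t len,
                 const unsigned int *starts, size_t n)
{
    for (size_t i = 0; i < len; i++) {
        if (code[i].op == OP_CALL) {
            assert(code[i].target < n);
            code[i].target = starts[code[i].target];
        }
    }
}
```

Because patching happens only after every procedure has a start
address, nested, recursive, and mutually recursive calls are all
handled the same way.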


         GENERAL SOLUTION: LABELS

Use "labels" to allow
  the IR to specify a target (address)
  that is determined later

Term "label" is from assembly language

    ;  ...
          jmp L
    ; ...
    L: ; ...


        APPROACHES TO FIXING LABELS

Problem: convert labels to addresses

 (1) Use multiple passes
       a. Generate code with labels
       b. Lay out memory for procedures
          (determine starting addresses)
       c. Change labels to addresses

     advantages:
       + easy to understand/code

     disadvantages:
       - may be a bit slower
          (extra pass; time linear in size of code)

 (2) Use shared mutable data (lazy eval.)
       a. labels are unique placeholders,
          shared by all uses (calls)
       b. when address is determined,
          update the placeholder
          (and all uses are updated)

     advantages:
       + can debug some code early
          (when declarations before uses)
     disadvantages:
       - tricky to code: need to ensure labels
          are unique (never copies)
       - harder to understand
       - still need multiple passes for mutual
          recursion
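
Why approach (2) depends on sharing rather than copying can be seen
in a small sketch. The types placeholder and call_instr and the
function call_address are hypothetical names used only here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical placeholder for a not-yet-known address.
typedef struct { bool known; unsigned int addr; } placeholder;

// Hypothetical call instruction: holds a pointer to the shared
// placeholder, never a copy of it.
typedef struct { placeholder *target; } call_instr;

// Resolve a call's target address at output time.
// Requires: the shared placeholder has been filled in.
unsigned int call_address(const call_instr *c)
{
    assert(c != NULL && c->target != NULL && c->target->known);
    return c->target->addr;
}
```

If two calls point at the same placeholder, filling it in once
updates both. If a call stored the placeholder by value instead,
it would keep the stale "unknown" copy, which is the bug the
uniqueness requirement guards against.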


    LABEL DATA STRUCTURE FOR LAZY EVAL

// file label.h
// ...
#include <stdbool.h>
#include "machine_types.h"

typedef struct {
    bool is_set;
    unsigned int word_offset;
} label;

// Return a fresh label that is not set
extern label *label_create(void);

// Requires: lab != NULL
// Set the address in the label
extern void label_set(label *lab,
               unsigned int word_offset);

// Is the given label set?
extern bool label_is_set(label *lab);

// Requires: label_is_set(lab)
// Return the word offset in lab
extern
unsigned int label_read(label *lab);
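
One possible implementation of this interface is sketched below.
It is not taken from the course code: the bodies, the use of
malloc, and aborting on allocation failure are assumptions. The
type is repeated so that the block compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    bool is_set;
    unsigned int word_offset;
} label;

// Return a fresh, heap-allocated label that is not set.
label *label_create(void)
{
    label *lab = malloc(sizeof(label));
    if (lab == NULL) {
        abort();  // assumed policy: give up if out of memory
    }
    lab->is_set = false;
    lab->word_offset = 0;
    return lab;
}

// Requires: lab != NULL
void label_set(label *lab, unsigned int word_offset)
{
    assert(lab != NULL);
    lab->is_set = true;
    lab->word_offset = word_offset;
}

// Requires: lab != NULL
bool label_is_set(label *lab)
{
    assert(lab != NULL);
    return lab->is_set;
}

// Requires: label_is_set(lab)
unsigned int label_read(label *lab)
{
    assert(label_is_set(lab));
    return lab->word_offset;
}
```

Because label_create returns a pointer, every call instruction
that refers to a procedure can hold the same label, and a single
label_set makes the address visible to all of them.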