SUPPORTING PROCEDURES AND CALLS

Main issues:
  - storing their code
      Why? - not executed until called
  - knowing exactly where each starts
      Why? - because the call instruction needs the address

Another issue:
  - sending the right static link to the procedure (in $a0)


WHERE TO PUT PROCEDURE CODE?

Possible layouts in VM's code array:

  1. [code for procedures] [code for main program]
  2. [code for main program] [code for procedures]

Implementation idea:
  A. track starting address of each procedure (as an attribute)
  B. procedure code is written out at end of code generation,
     so adjust addresses of call instructions then

Advantages of layout 1:
  + offsets of the procedures are their addresses
  + if there are no procedures, can still test

Disadvantage of layout 1:
  - if file format doesn't allow specification of where the
    main program starts, need to put in a jump instruction
    to jump around procs

Disadvantage of layout 2:
  - need to adjust offsets of procedures by adding size of
    main program (but only known at end of compilation)


NESTED PROCEDURES ARE A PROBLEM

  procedure A;
    procedure B;
    begin
      # B's body code...
      call A
      # ...
      # ...
    end
  begin
    # A's body code
    call B
    # ...
    # ...
  end

If we lay out the code as

  [ code for A ] [ code for B ]

How do we know the address of B to compile the call to B?

What about the other direction? Also a problem:
with [ code for B ] [ code for A ], the call to A inside B
is compiled before A's address is known.


RECURSIVE PROCEDURES, SIMILAR PROBLEM

  procedure R;
  begin
    # R's body code ...
    call R
    # ...
  end

Before storing code for R, how do we know where it starts?
Hard to do that...


MUTUAL RECURSION

  procedure O;
  begin
    # O's body code...
    call E
    # ...
  end

  procedure E;
  begin
    # E's body code ...
    call O
    # ...
  end

One of these must come before the other in the code area of the VM...
so the address for the call to the later one won't be known
when the earlier one is compiled


SOLUTION STRATEGIES FOR CALLS

[Multiple passes]:
  1. Generate code for each procedure
     (+ store offsets in symbol table,
      + lay out procedure code in memory)
  2. Gather table of addresses
     (map from names to addresses,
      using offsets and beginning address)
  3. Patch up code addresses for calls
     (+ output code)

[Lazy evaluation, labels]:
  1. Generate code for each procedure with calls to labels
     (+ store or update labels in symbol table)
     (+ output code)


GENERAL SOLUTION: MULTIPLE PASSES

Problem: where does each procedure start?

Solution idea:
  1. Compile all procedure code
     (now know how big each procedure is)
  2. Lay out procedure code in memory
     (now know where each starts)
  3. Change each call instruction
     (to use the starting address of its target)


GENERAL SOLUTION: LABELS

Use "labels" to allow the IR to specify a target (address)
that is determined later

Term "label" is from assembly language

      ; ...
      jmp L
      ; ...
  L:  ; ...


APPROACHES TO FIXING LABELS

Problem: convert labels to addresses

(1) Use multiple passes
    a. Generate code with labels
    b. Lay out memory for procedures
       (determine starting addresses)
    c. Change labels to addresses

    advantages:
      + easy to understand/code
    disadvantages:
      - may be a bit slower
        (extra pass takes time linear in size of code)

(2) Use shared mutable data (lazy eval.)
    a. labels are unique placeholders,
       shared by all uses (calls)
    b. when address is determined, update the placeholder
       (and all uses are updated)

    advantages:
      + can debug some code early
        (when declarations come before uses)
    disadvantages:
      - tricky to code: need to ensure labels are unique
        (never copied)
      - harder to understand
      - still need multiple passes for mutual recursion


LABEL DATA STRUCTURE FOR LAZY EVAL

  // file label.h
  // ...
  #include <stdbool.h>  // for bool, in case machine_types.h does not provide it
  #include "machine_types.h"

  typedef struct {
      bool is_set;
      unsigned int word_offset;
  } label;

  // Return a fresh label that is not set
  extern label *label_create();

  // Requires: lab != NULL
  // Set the address in the label
  extern void label_set(label *lab, unsigned int word_offset);

  // Requires: lab != NULL
  // Is the given label set?
  extern bool label_is_set(label *lab);

  // Requires: label_is_set(lab)
  // Return the word offset in lab
  extern unsigned int label_read(label *lab);
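

SKETCH: IMPLEMENTING THE LABEL INTERFACE

A minimal sketch of how the functions declared in label.h might be implemented, assuming labels are heap-allocated so that every call site can share one pointer to the same placeholder. The struct is repeated here only to keep the example self-contained. The call_instr type, the function demo_recursive_call, and the word offset 12 are hypothetical, made up for illustration; they are not part of the interface above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    bool is_set;
    unsigned int word_offset;
} label;

// Return a fresh label that is not set
label *label_create(void) {
    label *lab = malloc(sizeof(label));
    if (lab == NULL) {
        abort();  // assumption: out of memory is treated as fatal
    }
    lab->is_set = false;
    lab->word_offset = 0;
    return lab;
}

// Requires: lab != NULL
// Set the address in the label
void label_set(label *lab, unsigned int word_offset) {
    assert(lab != NULL);
    lab->is_set = true;
    lab->word_offset = word_offset;
}

// Requires: lab != NULL
// Is the given label set?
bool label_is_set(label *lab) {
    assert(lab != NULL);
    return lab->is_set;
}

// Requires: label_is_set(lab)
// Return the word offset in lab
unsigned int label_read(label *lab) {
    assert(label_is_set(lab));
    return lab->word_offset;
}

// Hypothetical call instruction: it stores a pointer to the
// shared label, not a copy of the label or of its address.
typedef struct {
    label *target;
} call_instr;

// Hypothetical demo of the recursive case: the call to R is
// generated before R's starting address is known.
unsigned int demo_recursive_call(void) {
    label *r_start = label_create();   // placeholder for R's address
    call_instr call_r = { r_start };   // compile "call R" inside R's body
    label_set(r_start, 12);            // later: layout puts R at offset 12
    unsigned int resolved = label_read(call_r.target);
    free(r_start);
    return resolved;                   // the call site sees the update
}
```

Design note: the call instruction holds a pointer rather than a copy of the struct, which is what makes "update the placeholder and all uses are updated" work. Copying a label by value would break the sharing, which is the "labels must be unique" hazard listed among the disadvantages.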