HOW A VM WORKS runtime inputs | v [--------] [------------] | BOF | --> | VM | -> trace [--------] [------------] output | v runtime outputs KINDS OF INSTRUCTION SET ARCHITECTURES CISC = Complex Instruction Set Computer (x86) - hundreds of instructions - instructions do several things - hard to manufacture to be fast - easier to program in assembler - compilers need to pick best instruction RISC = Reduced Instruction Set Computer (MIPS) - fewer instructions - each instruction does a simple job - easy to manufacture to be fast - harder to program in assembly language - compilers must do a lot ISA = INSTRUCTION SET ARCHITECTURE Accumulator-based (Montagne's Tiny VM): Instructions work on accumulator and memory SUB addr means accum <- accum - memory[addr] Register-based (MIPS): Computation in the registers SUB $r1,$r2,$r3 means GPR[$r1] <- GPR[$r2] - GPR[$r3] SSM MEMORY AND REGISTERS addrs memory GPRs Special 32768 [---------------] | ... | { [ ] <- FP AR{ | ... | { [ ] <- SP |vvvvvvvvvvvvvvv| | ... ] | | |---------------| global | | data [ ] <- GP | ... | | | instrs [ ] <----- PC | | 0 [---------------] HI LO MACHINE CYCLE 1. Fetch instruction at PC 2. Increment PC (PC <- PC + 1) 3. Execute that instruction ISA OF THE SSM Words are 32 bits (ints) Word-addressible: LWR t,s,o is GPR[t] <- memory[GPR[s]+o] Dedicated registers: Assembly Num Role Name ============================= 0 globals pointer $gp 1 stack pointer $sp 2 frame pointer $fp ... $3-$6 7 return address $ra Stack grows downwards, towards lower addresses Location-based instructions: ADD t,ot,s,os means memory[GPR[t]+ot] <- memory[GPR[SP]] + memory[GPR[s] + GPR[os]] # add the top of the stack to itself ADD $sp,0,$sp,0 ADDI r,o,i means memory[GPR[r]+o] <- memory[GPR[r] + o] + i SPECIAL REGISTERS Special registers: PC program counter HI high bits of multiplication LO low bits of multiplication PC is changed by: - jump instructions - branch instructions - CALL, CSI, and RTN instructions GOALS FOR SUBROUTINES subroutine = function or procedure (abstraction of expressions or commands) Want: WHAT A SUBROUTINE CALL DOES What are the steps in a subroutine call like: x = f(E1,E2) where E1 and E2 are expressions? SUBROUTINE CALLS Call done by "CALL" or CSI instruction CALL addr means GPR[$ra] <- PC PC <- addr RTN means PC <- GPR[$ra] These instructions don't do anything else no saving or restoring of other registers CALLING CONVENTION Agreement between all callers and callees Callee never changes the GP register Caller saves all other registers needed after the call Caller restores those registers, if any, upon return Arguments (values) are pushed on the stack Result is returned on the top of the stack IT'S A CONVENTION / AGREEMENT Saving and restoring not enforced by the hardware VM FEATURES TO SUPPORT SUBROUTINES - call and rtn instructions the $ra is used by them - for recursive subroutines - need local storage for each call on the runtime stack ($sp register) - manage storage for activation records (ARs) on the runtime stack - support for static scoping by convention More modern features (not in this class): - exception handling - passing and returning subroutines as values - objects and inheritance - multi-threading VM DESIGN FOR SUBROUTINES For each call: - storage for a subroutine's variables (local storage) organized as a stack of ARs - storage for a single call is called an AR: def: An *activation record* (AR) a portion of the runtime stack that holds local storage for one call of a subroutine RUNTIME STACK ORGANIZATION In the code: P calls Q, Q calls R # code in PL/0 procedure P; call Q; procedure Q; call R; end; end; end. Initially: [ 0 ] Call of subroutine P: [ AR for P ] After P calls Q: [ AR for P ] [ AR for Q ] After Q calls R: [ AR for P ] [ AR for Q ] [ AR for R ] After R returns: [ AR for P ] [ AR for Q ] After Q returns: [ AR for P ] After P returns: [ ] STACK IMPLEMENTATION AR delimited by two indexes: - fp: frame pointer, holds address of the base of the AR - sp: stack pointer, holds address of the top of the AR Notes, assuming stack is word-addressed, and grows towards lower addresses (downwards) STACK HEADER FILE #ifndef STACK_H #define STACK_H #include #include "machine_types.h" // size of the stack in words #define MAX_STACK_HEIGHT 4096 // Initialize the stack data structure // This should be called first extern void stack_initialize(); // the stack's invariant extern void stack_okay(); // Return the stack's total size extern uword_type stack_size(); // Return the current AR's size extern uword_type stack_AR_size(); // Return the address of the base // of the current AR (FP value) extern address_type stack_AR_base(); // Return the address of the top word // element in the current AR // (i.e., return the value of SP) extern address_type stack_top_AR_address(); // I deleted this function, as it was poorly named // Is the stack empty? (has one word allocated) extern bool stack_empty(); // Is the stack full? extern bool stack_full(); // Requires: there is space for the given // number of words // Allocate the given number of words extern void stack_allocate(unsigned int words); // Requires: !stack_full() // Push the word val onto the stack extern void stack_push(word_type val); // Requires: stack_size() >= words // Decrease the size of the stack by // the given number of words extern void stack_deallocate(unsigned int words); // Requires: !stack_empty() // pop the stack and return the top word. extern word_type stack_pop(); // Requires: !stack_empty() // Return the top word's value // without popping extern word_type stack_peek(); // Requires: 1 <= words // Requires: there is space for allocating // the given number of words // Start a new AR, allocating // the given number of words, // putting (the old) FP // in the first allocated word, // and setting FP to be the old SP-1. extern void start_AR(unsigned int words); // Requires: sp+words <= memory[fp] // the given number of words // Finish the AR, by setting FP to // memory[FP], and deallocating // the given number of words. extern void finish_AR(unsigned int words); #endif STACK CODE IN C #include #include #include #include "utilities.h" #include "stack.h" // the stack's storage word_type memory[MAX_STACK_HEIGHT]; // stacks will grow downwards #define ORIGINAL_FP (MAX_STACK_HEIGHT - 1) // first index of current AR static int fp; // index of top element of current AR static int sp; // Initialize the stack data structure void stack_initialize() { } // the stack's invariant // this is for stacks that grow downwards! void stack_okay() { } // Return the stack's total size uword_type stack_size() { } // Return the current AR's num. uword_type stack_AR_size() { } // Return the address of the base // of the current AR (FP value) address_type stack_AR_base() { return fp; } // Return the address of the top word // element in the current AR // (i.e., return the value of SP) address_type stack_top_AR_address() { return sp; } // Is the stack empty? bool stack_empty() { return stack_size() <= 0; } // Is the stack full? bool stack_full() { return stack_size() >= MAX_STACK_HEIGHT; } // Requires: there is space for the given // number of words // Allocate the given number of words void stack_allocate(unsigned int words) { } // Requires: !stack_full() // Push the word val onto the stack void stack_push_word(word_type val) { } // Requires: stack_size() >= words // Decrease the size of the stack by // the given number of words void stack_deallocate(unsigned int words) { assert((sp+words) <= fp); sp += fp; stack_okay(); } // Requires: !stack_empty() // pop the stack and return the top word. word_type stack_pop() { } // Requires: !stack_empty() // Return the top word's value // without popping word_type stack_peek() { stack_okay(); return memory[sp]; } // Requires: 1 <= words // Requires: there is space for allocating // the given number of words // Start a new AR, allocating // the given number of words, // putting (the old) FP // in the first allocated word, // and setting FP to be the old SP-1. void start_AR(unsigned int words) { } // Requires: sp+words <= memory[fp] // the given number of words // Finish the AR, by setting FP to // memory[FP], and deallocating // the given number of words. void finish_AR(unsigned int words) { } STACKS AND SUBROUTINES IN THE VM Calling convention: - push arguments on the stack - return result on top of stack STACK IMPLEMENTATION IN SSM ASSEMBLER # The stack is word addressed # The stack grows down, # towards lower addresses # The virtual machine puts address, # from BOF, say STKB, in $sp and $fp. # Assume that STKB is greater than # value of $gp. addrs memory GPRs 32768 [---------------] | ... | { [ old FP ] <- FP AR{ | ... | { [ ] <- SP |vvvvvvvvvvvvvvv| | ... ] | | |---------------| global | | data [ ] <- GP | ... | | | instrs [ ] <----- PC | | 0 [---------------] INITIALIZATION FOR THE DEMO # file stack_demo.asm .text main # start at "main" # put initial stack size # in memory[$gp] stack_initialize: SWR $gp,0, $fp RTN # ... .data 1024 WORD InitFP # ... more data here ... .stack 4096 .end INVARIANT CHECK stack_okay: SRI $sp, 2 # allocates 2 words # check that 0 <= $sp SWR $sp, 1, $sp # save the value of SP BGEZ $sp, 1, +2 JMPA failSP # check that $sp <= $fp, i.e., 0 <= $fp - $sp SWR $sp,0, $fp SUB $sp,1, $sp,1 BGEZ $sp, 1, +2 JMPA failSPFP # check that $fp <= InitFP, i.e., $fp - InitFP <= 0 CPW $sp,0, $sp,1 SUB $sp,1, $gp,0 BLEZ $sp,1, +2 JMPA failFP ARI $sp, 2 # restore SP to its old value RTN # invariant passed failSP: PSTR $gp, 1 EXIT 1 failSPFP: PSTR $gp, 8 EXIT 1 failFP: PSTR $gp, 14 EXIT 1 # ... more code could go here ... .data 1024 WORD InitFP STRING[7] SPError = "Error: SP must be >= 0!\n" STRING[6] SPFPError = "Error: not(SP <= FP)!\n" STRING[7] FPError = "Error: FP must be <= Initial FP!\n" STRING[2] Wrong = "Wrong!\n" STRING[3] Passed = "Passed!\n" MORE STACK FUNCTIONS IN ASSEMBLER # allocate a word stack_allocate: SRI $sp, 1 RTN # deallocate a word stack_deallocate: ARI $sp, 1 RTN # return size of stack stack_size: SRI $sp, 3 CPW $sp, 0, $gp, 0 SWR $sp, 1, $sp SUB $sp, 2, $sp, 1 ADDI $sp, 2, 4 # 3 for the allocation and 1 for the arith. ARI $sp, 2 # restore the stack, but leave result RTN # return size of current AR (i.e., $fp - $sp + 1) stack_AR_size: SRI $sp, 3 SWR $sp,0, $fp SWR $sp,1, $sp SUB $sp,2, $sp,1 ADDI $sp,2, 1 # add 1 to that ARI $sp, 2 # deallocate 2 locations RTN # return current AR base (i.e., value of $fp) stack_AR_base: SRI $sp, 1 SWR $sp,0, $fp RTN # return address in $sp stack_top_AR_addr: SRI $sp, 1 SWR $sp,0, $sp RTN # is the stack full? stack_full: CALL stack_top_AR_address BLEZ $sp,0, +3 false: LIT $sp,0, 0 # put false in top of stack JREL +2 true: LIT $sp,0, 1 # put true in top of stack RTN TESTING IT $ make stack_demo.bof $ make stack_demo.myp $ make stack_demo.myo