INTEGRATIVE QUESTION

How would you organize the code for an assembler?
    1. tokenize the input, using a lexer
    2. have a parser that builds ASTs
    3. build a symbol table for identifiers
       (e.g., labels, and/or for data)
       attributes: offset or address
    4. translate statements into instructions
       mnemonics to op codes
       literals (numbers) into binary

COMPILER CONCEPTS

What is a scope?
    an area of a program's text where a declaration has effect
What syntactic features start scopes in C?
    { opens a scope; the matching } closes it
What starts a scope in SPL?
    begin ... to ... end
How is a symbol table used?
    to check:
      - no name is declared twice
      - every use of a name has a declaration
    each use has attributes (needed in code generation)
Is there just one symbol table?
    in the implementation, one per potential scope
    for the entire program, varies
    conceptually just with varying mappings

CODE GENERATION

What did we use for an IR in HW4?
    ASTs with id_use pointers
What information from a name's use is needed to generate code in SPL?
    lexical address = (levelsOutwards, offset)
    generated code follows levelsOutwards static links
    might need other attributes
In SPL, why not have the parser determine the lexical address of each variable?
    we'd like the parser to just make the AST
    but it is static
    and the lexical address is computed during scope checking

CODE GENERATION FOR STATEMENTS

How is the code generator written in C?
    walk the ASTs
What does the C code look like for generating code for SPL statements?
    recursive, using the pattern of the AST grammar,
    like the unparser (and the scope checker)
What does the C code look like for generating code for an SPL while statement?
    cond:  [code to evaluate the condition and push that condition's value on the stack]
           [code to jump to loop body (body), if it's true]
           [jump past the loop body (to after), if it's false]
    body:  [deallocate the condition (pop stack)]
           [code for the loop body stmts]
           [code to jump back to start of condition (to cond)]
    after: [deallocate the condition (pop stack)]

GENERATING CODE FOR IF STATEMENTS

In C there are 2 kinds of if-statements, with syntax
    if (Exp) Stmt
and
    if (Exp) Stmt1 else Stmt2
How would these be compiled?
           [code to evaluate the Exp]
           [code to jump to then part (then), if it's true]
           [code to jump to the else part (else), if it's false]
    then:  [code for the then part]
           [code to jump around the else part (to after)]
    else:  [code for the else part]
    after: [deallocate space for condition]
What would the generated code look like for a C switch statement?
    make a branch table (an array of addresses)
    compute the offset,
    jump indirectly using some offset into the table

ACTIVATION RECORDS

Could a compiler allocate all named constants and variables at the outermost scope? Why or why not?
    No, because some variables or constants might have the same name.

ACTIVATION RECORDS (ARs)

What are ARs used for on the runtime stack?
    store the constants and variables in the AR,
    along with the saved registers
Why not have all calls of a procedure share the same storage? (the same AR)
    scopes around a call may have different variable values,
    esp. in recursion
In C, how many levels of static scopes can there be in a program?
    an arbitrary number of them
In C, what causes that nesting of scopes?
    blocks { ... }

PROCEDURE CALLS

What does the CALL instruction do?
What static link should be saved by a called procedure?
In C, when the function main returns, what is it returning to?

PROCEDURE CALLS AND RETURNS

How is the starting address of a procedure obtained (for a CALL instruction)?
Why does the stack need to be trimmed before a procedure returns?
With static scoping, what does the BP register point to on the stack?
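The while-statement layout given under CODE GENERATION FOR STATEMENTS can be sketched in C roughly as follows. This is only an illustration, not the course's code generator: the type while_stmt_t, the function gen_while, and the mnemonics JTRUE, JMP, and POP are all hypothetical, and the condition and body are passed in as already-generated text (standing in for the recursive calls on the sub-ASTs). JTRUE is assumed to test the top of the stack without popping it, which is why both the body and after labels start with a POP.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mini-AST node: a while statement whose condition and body
   are already-generated code fragments (stand-ins for recursive calls). */
typedef struct {
    const char *cond_code;  /* code that pushes the condition's value */
    const char *body_code;  /* code for the loop body statements */
} while_stmt_t;

/* Emit code for a while statement into buf, following the layout in the
   notes: cond / body / after. label_id keeps the labels unique when loops
   nest. The mnemonics are made up for this sketch; JTRUE is assumed to
   jump if the top of the stack is true, without popping it.
   Returns the number of characters written (not counting the terminator),
   or -1 if buf is too small. */
int gen_while(const while_stmt_t *w, int label_id, char *buf, size_t size)
{
    assert(w != NULL && buf != NULL);
    int n = snprintf(buf, size,
        "cond%d:\n"
        "%s"                    /* evaluate condition, push its value */
        "    JTRUE body%d\n"    /* to the loop body, if it's true */
        "    JMP after%d\n"     /* past the loop body, if it's false */
        "body%d:\n"
        "    POP\n"             /* deallocate the condition */
        "%s"                    /* the loop body statements */
        "    JMP cond%d\n"      /* back to the start of the condition */
        "after%d:\n"
        "    POP\n",            /* deallocate the condition */
        label_id, w->cond_code, label_id, label_id,
        label_id, w->body_code, label_id, label_id);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

A caller would fill in a while_stmt_t with the text produced for the condition and the body, pick a fresh label_id, and pass a buffer large enough for the result. A real generator would more likely build instruction sequences and patch relative jump offsets rather than print labels.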
SCOPING AND ADDRESSING

If some variables were dynamically scoped, how would they be addressed?

GENERATING CODE FOR DECLARATIONS

How would the C declaration
    const double e = 2.718281828459;
be compiled for a stack machine?
How would the C declaration
    int i;
be compiled for a stack machine?
How would the C declaration
    void incI() { i += 1; }
be compiled?

CODE GENERATION FOR EXPRESSIONS

Where does the compiled code put the result of an expression (in a stack machine)?
Would that be different for a register machine?
Why are identifier uses important?

CODE GENERATION FOR EXPRESSIONS 2

How is a constant's lexical address used to load the constant's value?
How is a variable's lexical address used in an assignment statement?

ASSEMBLY LANGUAGE

What is an assembly language?
What features does assembly language provide to help programmers?
How does assembly language differ from C?

ASSEMBLERS

How does an assembler translate forward jumps to a label?
How would you organize an assembler as a program?

LINKERS

What does a linker do?
How many passes would a linker need?
Does a linker treat user program code differently than library code?

LOADERS

What does a loader do?
What is a boot loader?
How is the boot loader loaded?
Does a boot loader do relocation?

OS

What are the goals of an OS?
What is the basic technique for running programs efficiently used by an OS?
Would it be better if an OS were just a collection of libraries that worked with the hardware?

PROCESSES

What is a process?
How is a process different from a program?
How does a program become a process?
Why is the process concept useful for programmers?

PROCESS TRANSITIONS

What states can a process be in?
What causes transitions between states?

RESOURCE SHARING

How does an OS facilitate sharing of resources?
Why are interrupts needed?
What are the different kinds of interrupts?

INTERRUPT PROCESSING

What is an interrupt handler?
What mode does an interrupt execute in?
What does an interrupt handler do to save the state of the running process?

TRAP TABLES (INTERRUPT VECTORS)

What is a trap table?
What software controls the trap table?

LIMITED DIRECT EXECUTION

What is limited direct execution?
How does it work?
What mode does your program run in?
What kinds of instructions are privileged?

SYSTEM CALLS

What is a system call?
Why are system calls needed?

THREADS

What is a thread?
Does each thread have its own stack?
Why are threads useful?

SYNCHRONIZATION

What is a race condition?
Why are race conditions a problem?
What is a critical section?

CORRECTNESS FOR THREADS

When is an execution serializable?
When is execution atomic?
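One way to explore the questions on race conditions and serializability is a small deterministic simulation rather than real threads. The sketch below assumes that counter = counter + 1 is carried out as three separate steps (load, add, store), and it uses the common working definition that an execution is serializable when its result matches some one-thread-at-a-time order. The names sim_thread_t, step, and run_schedule are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A simulated thread performing counter = counter + 1 as three
   machine-level steps: load, add, store. */
typedef struct {
    int reg;  /* the thread's private register */
    int pc;   /* next step: 0 = load, 1 = add, 2 = store, 3 = done */
} sim_thread_t;

/* Execute the next step of thread t against the shared counter. */
static void step(sim_thread_t *t, int *shared)
{
    assert(t != NULL && shared != NULL);
    switch (t->pc) {
    case 0: t->reg = *shared; break;  /* load the shared counter */
    case 1: t->reg += 1;      break;  /* increment the private copy */
    case 2: *shared = t->reg; break;  /* store the result back */
    default: return;                  /* thread already finished */
    }
    t->pc++;
}

/* Run two simulated threads under a schedule: schedule[i] is 0 or 1 and
   names the thread that takes step i. Returns the final counter value. */
int run_schedule(const int *schedule, size_t len)
{
    sim_thread_t threads[2] = { {0, 0}, {0, 0} };
    int counter = 0;
    for (size_t i = 0; i < len; i++) {
        assert(schedule[i] == 0 || schedule[i] == 1);
        step(&threads[schedule[i]], &counter);
    }
    return counter;
}
```

With the serial schedule 0,0,0,1,1,1 the final counter is 2. With the interleaved schedule 0,1,0,1,0,1 both threads load 0 before either stores, so the final counter is 1 and one update is lost. Treating the load/add/store sequence as a critical section would rule out the second schedule.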