COP 3402 meeting -*- Outline -*- * The Tiny Machine ** Computer Organization, von Neumann architecture Q: What are the main parts of a computer CPU? Memory, ALU, PC, registers ------------------------------------------ VON NEUMANN ARCHITECTURE (HARDWARE) /------> [ PC ] | | | v |------> [ MAR ] | | ^ | v \---------------\ | [ MEMORY ] | | [ (RAM) ] <----------| | [ ] | | | | | v | IR [OP|ADDR] <-> [ MDR ] <--> [ ACCUM ]| || ^ | ^ | | | | \ | v v v \ | /---------\ \-----------------/ \| / DECODER \ \ ALU / || \-----------/ > \-------------/ || | / > | || v /--/ / v || [Control] / \------------/| [Unit ] -----/ | \----------------------------/ ------------------------------------------ Q: What is a register? It's hardware that holds some information that is: - easily accessible by the rest of the machine and - typically (very) fast to read and write (in a VM it is modeled as a variable) Q: What does the PC do? It (is a register that) holds the address of the next instruction to be executed Q: What does an address mean? If the machine is byte-addressible, then it's the byte number If the machine is word-addressible (say words are 4 bytes), then it's the word number Q: What does an address in the PC mean? The address of the next instruction to fetch (depending on whether the machine is byte or word addressible) Q: What does the MAR do? The Memory Address Register holds the address to be read or written to in the memory Q: What is stored in the MEMORY? Both programs and data (this is von Neumann's idea) Q: Can the program be altered once it is in MEMORY? Yes, that is needed for linking/loading and is also a security problem! Q: What does the MDR do? The Memory Data Register stores data that is being sent to the memory or received from it. (This could be either code or data.) Q: What does the ALU do? The execution of arithmetic and logical instructions (e.g., ADD, SUB, AND) Q: What does the decoder do? 
        It communicates with the Control unit to tell the ALU
        and other parts of the CPU what to do,
        based on the op-code from the IR.
        (It translates op codes, which are bit patterns,
         into control signals on wires in the processor.)

     Q: What is the ACCUM?

        It's an accumulator, a user-visible register.

** Tiny Machine ISA

   ISA = Instruction Set Architecture
       essentially it's the design of a computer CPU (or a VM)

------------------------------------------
               WARNING!

   This is not a description of HW1's ISA!
        Different instructions!

   It's the "Tiny Machine", NOT for HW1
------------------------------------------

   The instructions we will discuss are NOT the same instructions as in HW1!

*** Instruction Cycle

    (refer to the diagram above)

------------------------------------------
           INSTRUCTION CYCLE

1. Fetch:
      IR <- MEMORY[PC]

2. Execute:
      ; Advance the PC
      ; decode and
      ; execute instruction in IR
------------------------------------------
      ... (For step 1, need to)
          Copy the contents of PC into MAR:
              MAR <- PC
          Then copy the contents of that memory location into MDR:
              MDR <- MEMORY[MAR]
          Then copy the contents of MDR into IR:
              IR <- MDR

      These statements, like IR <- MDR, are in a simple
      hardware description language (HDL).
      Compilers for HDLs can output instructions for building
      physical CPUs (lithography masks, etc.)

      For step 2, to advance the PC do
              PC <- PC+1   // assumes the machine is word addressable
      (If the instructions have varying length,
       or if the machine is addressed in units smaller than an instruction,
       e.g., bytes, then the PC has to be advanced by an appropriate amount,
       which may depend on the size of the instruction, as in the x86.)

*** Execution of Instructions for the Tiny Machine

    After each execution, we start the fetch cycle again

     Q: How will we know if we have enough instructions?
        When the set of instructions is "Turing Complete"
        (i.e., can write a Turing machine simulation);
        As a practical matter, when we can use it to compile/run any program
        (written in a Turing complete language)

**** Load

------------------------------------------
            EXECUTING LOD

  ; put IR's ADDR field into MAR
  ; fetch location into MDR
  ; move MDR's contents into ACCUM
------------------------------------------
     The formalism is a simple hardware description language
     ... MAR <- IR.ADDR
     ... MDR <- MEMORY[MAR]
     ... ACCUM <- MDR

**** Store

------------------------------------------
            EXECUTING STO

  ; put IR's ADDR field into MAR
  ; move ACCUM into MDR
  ; put MDR into MEMORY at address in MAR
------------------------------------------
     The formalism is a simple hardware description language
     ... MAR <- IR.ADDR
     ... MDR <- ACCUM
     ... MEMORY[MAR] <- MDR

**** Subtraction

------------------------------------------
            EXECUTING SUB

  ; put IR's ADDR field into MAR
  ; fetch location into MDR
  ; Compute ACCUM - MDR
------------------------------------------
     ... MAR <- IR.ADDR
     ... MDR <- MEMORY[MAR]
     ... ACCUM <- ACCUM - MDR

     Addition, multiplication, etc. would be similar,
     but subtraction shows the ordering of the operands better

**** Halt

------------------------------------------
            EXECUTING HLT

  ; Stop execution, program ends normally
------------------------------------------
     ... STOP

     Q: Does HLT need any address (operands)?

        No

**** Other instructions?

     Q: Are we missing anything to run programs?

        Yes, jumps (for loops)
        and conditional jumps (for stopping loops and if-then-else)
        and I/O

***** Jump

------------------------------------------
            EXECUTING JMP

  ; put IR's ADDR field into PC
------------------------------------------
     ... PC <- IR.ADDR

***** Conditional Jump and if-then-else

------------------------------------------
            EXECUTING SKZ

  ; If ACCUM is 0 then skip next instr.
------------------------------------------
     ... if (ACCUM == 0) PC <- PC + 1

     Q: Does SKZ need any operands?
        No, data is in ACCUM already, address is implicit

     Q: What happens to the PC if the condition is false?

        It has already been advanced to the next instruction,
        so that instruction executes normally.

     Q: How would SKZ be used to program an if-then-else?

        use JMP instructions to manage the flow of control...
        put 0 in ACCUM if the condition is true

           if (COND) then S1 else S2
           S3

        ==>
                 [code for COND]
                 SKZ
                 JMP s2
                 [code for S1]
                 JMP s3
           s2:   [code for S2]
           s3:   [code for S3]

     Q: What if there is no else part?

        Then just jump to S3 instead of S2

     Q: What if the condition is supposed to be zero?

        Can negate the test using logical not, i.e.,

                 NOT
                 SKZ
                 JMP [label]
                 ...

***** Loops

     Q: Will this be enough to do while loops?

        Yes, ...

------------------------------------------
        TRANSLATING WHILE LOOPS

   while (COND) {
       S
   }
   S2

 is translated into:

------------------------------------------
        ...
           top:  [code for COND]
                 SKZ
                 JMP s2
                 [code for S]
                 JMP top
           s2:   [code for S2]

     Q: Do we need to evaluate the condition (COND) each time?

        Yes, otherwise it loops forever!

     Q: What about for loops?

        Yes, those can be first translated into while loops,
        the same for do-while loops

     Q: What about a break statement?

        Translated as a JMP instruction (past end of loop)

     Q: What about continue statements?

        Also translated as a JMP, to start of loop

***** Input/Output

     Q: What kind of data should be input?

        Not integers, as there is no good way to indicate EOF or error
        Characters are small and suitable, use -1 for EOF as in getc()

------------------------------------------
            EXECUTING CIN

  ; read a single char from input
------------------------------------------
     ... ACCUM <- getc(stdin)
         so result is -1 if EOF or error

     Q: How can a program test for EOF or error?

        Need comparison operators,
        or write them using subtraction and literals

        How do we get a literal into the program?
        Put them at some standard address (e.g., 0 in address x200, ...)

     Q: Could we use the address field for something?

        Yes, it could indicate the device to read from (screen, disk, ...)

------------------------------------------
            EXECUTING COU

  ; write ACCUM to output as a character
------------------------------------------
     putchar(ACCUM)  ; outputs the char in ACCUM

     Q: Could we use the address field for something?

        Yes, again could indicate the device to write to (screen, disk, ...)

***** Logical operations

------------------------------------------
            EXECUTING OR

  ; put IR's ADDR field into MAR
  ; fetch location into MDR
  ; Compute ACCUM | MDR
------------------------------------------
     ... MAR <- IR.ADDR
     ... MDR <- MEMORY[MAR]
     ... ACCUM <- ACCUM | MDR

     NOT and AND would be similar
     Note that NOT just needs to work on the accumulator itself
         ACCUM <- ! ACCUM

**** Condition codes

------------------------------------------
    REFLECTING CONDITIONS IN HARDWARE

 Use a register to indicate value in ACCUM

    Z is 1 when ACCUM is 0
    G is 1 when ACCUM is positive
    L is 1 when ACCUM is negative

 x86 architecture has a
 program status word containing:
    Interrupt flags
    Supervisory mode flag
    condition codes
------------------------------------------

     Q: If we have these condition codes, how to efficiently test them?

        Add new instructions: SKG and SKL
        (skip the next instruction when G is 1 or when L is 1, like SKZ for Z)

*** Summary of Tiny Machine ISA

------------------------------------------
     SUMMARY OF TINY MACHINE ISA

  OP CODE   MNEMONIC   ADDR?
     1        LOD        Y
     2        STO        Y
     3        ADD        Y
     4        SUB        Y
     5        CIN        ?
     6        COU        ?
     7        HLT
     8        JMP        Y
     9        SKZ
    10        SKG
    11        SKL
    12        OR         Y
    13        AND        Y
    14        NOT

  FOR YOU TO DO

  Write a program in machine code to:
    - input two chars
    - print Y if they are the same
    - print N otherwise
------------------------------------------
     (all numbers in decimal, but really we should be using bits...)

     Where should we put the program's data (e.g., the characters)?
     Let's say location 103 holds the first character read,
               location 89 holds the ASCII code for 'Y' (89),
               location 78 holds the ASCII code for 'N' (78),
               location 100 holds 10 (\n)

        0 89    ; LIT 89 ('Y' in ASCII)
        2 89    ; store Y in location 89
        0 78    ; LIT 78 ('N' in ASCII)
        2 78    ; store N in location 78
        0 10    ; LIT 10 ('\n')
        2 100   ; store \n in location 100
        5 0     ; CIN
        2 103   ; STO 103
        5 0     ; CIN
        4 103   ; SUB 103
        9 0     ; SKZ
        8 17    ; JMP n
        1 89    ; LOD 89 (Y)
        6 0     ; COU
        1 100   ; LOD 100 (\n)
        6 0     ; COU
        7 0     ; HLT
        1 78    ; LOD 78 (N)   (label n = address 17)
        6 0     ; COU
        1 100   ; LOD 100 (\n)
        6 0     ; COU
        7 0     ; HLT

     Q: What's hard about this?

        Tracking op codes, locations for jumps,
        putting values in the right locations in memory

     Q: What do we need to assume (some values in locations for chars)?

     To run this test, use echo to put 2 characters in stdin
     (typing them at the terminal is problematic)
     e.g.:

        echo aa | ./vm ../../lectures/systems-software/tm-test-read-2-chars.txt

** Assembly language

     Q: So, why do we need assembly language?
        To automate all the difficult bookkeeping we have for machine code:
           op codes, locations for jumps, memory initialization

------------------------------------------
       ASSEMBLY LANGUAGE FEATURES

 - Mnemonics for opcodes

 - Names for locations

 - Initialization of data

 - Comments
------------------------------------------
     ... names for locations are both for data and jumps

*** Our example in assembly language

------------------------------------------
      EXAMPLE IN ASSEMBLY LANGUAGE

.begin            ; text (code) section
                  ; read a char into c1
start:  CIN
        STO c1
        CIN       ; read char into ACCUM
        SUB c1
        SKZ
        JMP n     ; jump to n if different
        LOD yc    ; output "Y\n"
        COU
        LOD nl
        COU
        HLT
n:      LOD nc    ; output "N\n"
        COU
        LOD nl
        COU
        HLT
.end start
                  ; data section
.data c1 1 0
.data yc 1 89
.data nc 1 78
.data nl 1 10
------------------------------------------

     Q: What does an assembler need to do?

        - translate mnemonic opcodes into numeric ones (bit patterns)
        - track names for locations and
          translate those names into location numbers when they are used
          (either in jumps or in loads/stores)
        - arrange for initialization of named data locations

     Q: What kind of data structures would help with those tasks?

        Hash tables for mapping names to locations

     Q: What kind of guarantees should an assembler make?

        - That locations used for data are not overwritten with code
        others?
          At least it initially stores data away from the program

------------------------------------------
      ASSEMBLY LANGUAGE DIRECTIVES

 Help structure the program:

------------------------------------------
     ... .begin  ; start of code section
         .end    ; end of a section
         .data   ; data section directives
         ...

** tracing (as in a debugger)

     Q: Should we trace how that executes?
(I'm assuming that instructions are all 1 word long and PC addresses words) ------------------------------------------ DEBUGGER TRACE Addr OP ADDR 0 LIT 89 1 STO 89 2 LIT 78 3 STO 78 4 LIT 10 5 STO 100 6 CIN 0 7 STO 103 8 CIN 0 9 SUB 103 10 SKZ 0 11 JMP 17 12 LOD 89 13 COU 0 14 LOD 100 15 COU 0 16 HLT 0 17 LOD 78 18 COU 0 19 LOD 100 20 COU 0 21 HLT 0 Tracing ... PC: 0 ACCUM: 0 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 100: 0 ... ==> addr: 0 LIT 89 PC: 1 ACCUM: 89 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 100: 0 ... ==> addr: 1 STO 89 PC: 2 ACCUM: 89 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 89: 0x59 90: 0x0 ... 100: 0 ... ==> addr: 2 LIT 78 PC: 3 ACCUM: 78 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 89: 0x59 90: 0x0 ... 100: 0 ... 
==> addr: 3 STO 78 PC: 4 ACCUM: 78 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 0 ... ==> addr: 4 LIT 10 PC: 5 ACCUM: 10 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 0 ... ==> addr: 5 STO 100 PC: 6 ACCUM: 10 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... ==> addr: 6 CIN 0 PC: 7 ACCUM: 97 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... ==> addr: 7 STO 103 PC: 8 ACCUM: 97 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... 103: 97 104: 0 ... 
==> addr: 8 CIN 0 PC: 9 ACCUM: 97 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... 103: 97 104: 0 ... ==> addr: 9 SUB 103 PC: 10 ACCUM: 0 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... 103: 97 104: 0 ... ==> addr: 10 SKZ 0 PC: 12 ACCUM: 0 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... 103: 97 104: 0 ... ==> addr: 12 LOD 89 PC: 13 ACCUM: 89 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... 103: 97 104: 0 ... 
==> addr: 13 COU 0 YPC: 14 ACCUM: 89 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... 103: 97 104: 0 ... ==> addr: 14 LOD 100 PC: 15 ACCUM: 10 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... 103: 97 104: 0 ... ==> addr: 15 COU 0 PC: 16 ACCUM: 10 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... 103: 97 104: 0 ... ==> addr: 16 HLT 0 PC: 17 ACCUM: 10 memory: 0: 0x59 1: 0x2000059 2: 0x4e 3: 0x200004e 4: 0xa 5: 0x2000064 6: 0x5000000 7: 0x2000067 8: 0x5000000 9: 0x4000067 10: 0x9000000 11: 0x8000011 12: 0x1000059 13: 0x6000000 14: 0x1000064 15: 0x6000000 16: 0x7000000 17: 0x100004e 18: 0x6000000 19: 0x1000064 20: 0x6000000 21: 0x7000000 22: 0x0 ... 78: 0x4e 79: 0x0 ... 89: 0x59 90: 0x0 ... 100: 10 101: 0 ... 103: 97 104: 0 ... ------------------------------------------ Note we start the program at instruction (word) 0 Here we assume we input the character 'a' both times What happens if they are different? ** Register Machine with Load/Store Architecture Q: Is the Tiny Machine Turing complete? Yes, we believe so... 
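
     The instruction cycle and the machine-code program above can be
     sketched as a small simulator. This is a minimal sketch in Python,
     not the course's ./vm. It assumes that an instruction word is
     (op code * 2^24) + address, which matches the hex words in the trace
     (e.g., 0x2000059 for STO 89), and that op code 0 is the LIT
     instruction that the trace shows (ACCUM <- IR.ADDR). The names
     encode, run, and PROGRAM, the memory size of 256 words, and the use
     of Python strings for input and output are all made up for
     illustration.

```python
# Hypothetical Tiny Machine simulator; every name here is illustrative.
LIT, LOD, STO, ADD, SUB, CIN, COU, HLT, JMP, SKZ = range(10)


def encode(op, addr=0):
    """Pack an instruction word (assumed layout: op code in the top byte)."""
    return (op << 24) | addr


def run(program, input_text="", mem_size=256):
    """Run from address 0 until HLT; return what COU wrote, as a string."""
    memory = list(program) + [0] * (mem_size - len(program))
    chars = iter(input_text)
    output = []
    pc = accum = 0
    while True:
        ir = memory[pc]                      # fetch: IR <- MEMORY[PC]
        pc += 1                              # advance the PC (word addressed)
        op, addr = ir >> 24, ir & 0xFFFFFF   # decode IR into OP and ADDR
        if op == LIT:
            accum = addr                     # ACCUM <- IR.ADDR
        elif op == LOD:
            accum = memory[addr]             # ACCUM <- MEMORY[ADDR]
        elif op == STO:
            memory[addr] = accum             # MEMORY[ADDR] <- ACCUM
        elif op == ADD:
            accum += memory[addr]
        elif op == SUB:
            accum -= memory[addr]            # ACCUM <- ACCUM - MEMORY[ADDR]
        elif op == CIN:
            ch = next(chars, None)
            accum = -1 if ch is None else ord(ch)   # -1 on EOF, as with getc()
        elif op == COU:
            output.append(chr(accum))
        elif op == HLT:
            return "".join(output)
        elif op == JMP:
            pc = addr                        # PC <- IR.ADDR
        elif op == SKZ:
            if accum == 0:
                pc += 1                      # skip the next instruction
        else:
            raise ValueError(f"unknown op code {op} at address {pc - 1}")


# The 22-word program from the notes, as (op code, address) pairs.
PROGRAM = [encode(op, addr) for op, addr in [
    (0, 89), (2, 89), (0, 78), (2, 78), (0, 10), (2, 100),
    (5, 0), (2, 103), (5, 0), (4, 103), (9, 0), (8, 17),
    (1, 89), (6, 0), (1, 100), (6, 0), (7, 0),
    (1, 78), (6, 0), (1, 100), (6, 0), (7, 0),
]]

print(run(PROGRAM, "aa"), end="")   # same characters: prints Y and a newline
print(run(PROGRAM, "ab"), end="")   # different characters: prints N and a newline
```

     With input "ab" the SUB leaves 1 in ACCUM, so SKZ does not skip and
     the JMP to address 17 runs the code that outputs N. SKG, SKL, and
     the logical operations are left out to keep the sketch short.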

     Q: Is the tiny machine easy to program?

        No...

     Q: Would the tiny machine be efficient?

        No...

     Efficiency of programming and efficiency of execution are different
     from having enough instructions to be Turing complete.

     We first turn to efficiency of execution...

*** memory hierarchy

------------------------------------------
   BACKGROUND: SPEED OF COMPUTER MEMORY

  Speed     Cost/bit
  =======   =========
  Fastest   Expensive   Registers
                        Cache Memory (SRAM)
                        RAM (Main Memory, DRAM)
                        Spinning Magnetic Disk (HDD)
                        External Drives (Tape, Optical)
  Slowest   Cheapest
------------------------------------------

     The most expensive storage has the smallest capacity
     (because one can't afford more of it)

     There are often several levels and types of cache memory
     (for instructions and data)

     Below from
     https://www.extremetech.com/extreme/188776-how-l1-and-l2-cpu-caches-work-and-why-theyre-an-essential-part-of-modern-chips
     See also:
     https://www.intel.com/content/www/us/en/developer/articles/technical/memory-performance-in-a-nutshell.html

        register latency: 1 clock cycle (at ~ 3 GHz so ~ .3 nanosec)
        L1 cache latency: 3-4 clock cycles ~ 1 nanosec
        L2 cache latency: ~ 10 clock cycles ~ 3 nanosec
        L3 cache latency: ~ 90 clock cycles ~ 30 nanosec ~ 0.03 microsec

        L4 cache ~ 40 GB/s
        DRAM     ~ 10 GB/s
        HDDs     ~ .2 GB/s (200 MB/s)

     Q: Do programs need to move data from main memory to caches?

        No, the hardware takes responsibility for that

     Q: Do programs need to move data from memory to registers?

        Yes, that is part of the ISA
        Also programs need to move data from other storage to main memory

     Q: Would programs execute faster if we used faster memory more?

        Yes, perhaps greatly...
        which leads us to the

*** Register Machine

**** Basics of the Architecture

     (this is more like homework 1, and based on the MIPS processor)

------------------------------------------
     REGISTER MACHINE ARCHITECTURE

 - Word-oriented (4 bytes)
   but byte-addressed

 - All instructions 32 bits long
   must be aligned on word boundary

 - 32 registers
------------------------------------------

**** Instructions (ISA)

------------------------------------------
        REGISTER MACHINE ISA

 Arithmetic and logic instructions
   - work with 3 registers
   - format: [op:6|rs:5|rt:5|rd:5|shift:5|func:6]
   - example: ADD s t d
       means GPR[d] <- GPR[s] + GPR[t]

 Immediate operand instructions
   - work with 2 registers
   - format: [ op:6 | rs:5 | rt:5 | immed: 16 ]
   - example: ADDI s t i
       means GPR[t] <- GPR[s] + sgnExt(i)
       where sgnExt(i) is i sign extended

 Jump instructions
   - work with a 26 bit address
   - format: [ op:6 | addr: 26 ]
   - example: JMP a
       means PC <- formAddress(a)

 System call instructions
   - work with a code and function
   - format: [ op:6 | code:20 | func:6 ]
   - example: SYS 10
       means stop the program's execution

 where sgnExt(0xFFFF) = 0xFFFFFFFF
       sgnExt(0x0000) = 0x00000000
       sgnExt(0x0001) = 0x00000001

 formAddress concatenates the 4 high bits of PC
   with the 26 bit address
   + 2 bits of 0 (to align on a word)
   so if PC = 0xFACADE, then
   formAddress(0x00DECADE) = 0x037B2B78
------------------------------------------

*** Other kinds of ISAs

     Q: What other options are there for designing an ISA?

        Stack-machine, memory treated as a single stack with push/pop
           ADD will add the top two elements on the stack

     Q: What features of normal programming are hard with these ISAs?

        Subroutines! (but MIPS supports them with a kind of jump instruction)
        They should be well supported (to make them common)

        Why are subroutines a good thing?
           They support libraries, where experts can write reusable code
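
     Returning to the sgnExt and formAddress helpers used in the register
     machine ISA: a minimal sketch in Python, assuming 32-bit words held
     as non-negative Python ints and assuming (MIPS-style) that
     formAddress keeps exactly the 4 high-order bits of the PC. The
     function names sgn_ext, form_address, and split_immediate_format are
     made up for illustration, and the op code 9 in the example word is an
     arbitrary value, not the real ADDI op code.

```python
def sgn_ext(i):
    """Sign-extend a 16-bit field to a 32-bit word."""
    i &= 0xFFFF
    # If bit 15 (the sign bit) is set, fill the upper 16 bits with ones.
    return (i | 0xFFFF0000) if (i & 0x8000) else i


def form_address(pc, a):
    """Top 4 bits of pc, then the 26-bit field a, then two 0 bits."""
    return (pc & 0xF0000000) | ((a & 0x03FFFFFF) << 2)


def split_immediate_format(word):
    """Split [ op:6 | rs:5 | rt:5 | immed:16 ] into its four fields."""
    op = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    immed = word & 0xFFFF
    return op, rs, rt, immed


# A made-up immediate-format word: op 9, rs 1, rt 2, immed 0xFFFF.
word = (9 << 26) | (1 << 21) | (2 << 16) | 0xFFFF
op, rs, rt, immed = split_immediate_format(word)
print(op, rs, rt, hex(sgn_ext(immed)))            # 9 1 2 0xffffffff
print(hex(form_address(0x00FACADE, 0x00DECADE)))  # 0x37b2b78
```

     Shifting the 26-bit field left by 2 is what supplies the two 0 bits,
     so every jump target is word aligned.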