Announcements
- NEW Dec 12 Grades
of Project1, Project2 and Project3
- Nov 30 Grades of Project1 and
Project2
- Nov 29 Cache implementation project
posted. Due 11:59 pm on Dec. 1st. You may do either the previously
posted Project 3 or this one.
- Submission: Please compress your project and send it to droygao@gmail.com
with a README file.
- Some hints from CDA4150 2004:
- 10/30/04 - These words of Lab 3 debugging advice come
from your classmate Carlos Sierra:
While simv is executing, Ctrl-c interrupts the simulator. Type
help or "?" for a list of commands. Use the print
or trace commands to debug. Print lets you print any
signal value (the syntax requires module_name.module_name.module_name.signal,
just like virsims). Trace gives a list of the lines where
the code is looping. Read those lines and follow the loop
in your code. Add some $display statements to your code for
suspicious signals.
10/28/04 - Lab 3 Hints:
You don't have to, but I would suggest doing the I$ accesses
first and don't worry about the D$ at all until the I$
works. The boot code has about 20 instructions before
you even get to the first load or store (a store) so you
will have ample opportunity to test the hits and misses
to the Icache before you even need to worry about the
data cache. Once you have the CPU properly missing, fetching
8 instructions, hitting for 7 more instructions, missing,
etc...you will have completed a significant portion of
the lab. The data cache code is strikingly similar to
the instruction cache code, with the added complication
of writebacks to main memory.
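The miss/hit pattern described above (miss, fetch 8 instructions, hit for 7 more, miss again) can be checked with a small behavioral model. This is only a Python sketch, not the lab's Verilog: the direct-mapped organization, the NUM_LINES value, and the names ICacheModel and trace are assumptions made for illustration. Only the 8-instruction line size comes from the hint.

```python
WORDS_PER_LINE = 8   # from the hint: a miss fetches 8 instructions
BYTES_PER_WORD = 4   # 32-bit MIPS instructions
NUM_LINES = 64       # assumption: the hint does not give the cache size
LINE_BYTES = WORDS_PER_LINE * BYTES_PER_WORD


class ICacheModel:
    """Direct-mapped (an assumption) cache with one tag and Valid bit per line."""

    def __init__(self):
        self.valid = [False] * NUM_LINES
        self.tag = [0] * NUM_LINES

    def access(self, iaddr):
        """Return True on a hit; on a miss, fill the line and return False."""
        line_number = iaddr // LINE_BYTES
        index = line_number % NUM_LINES
        tag = line_number // NUM_LINES
        hit = self.valid[index] and self.tag[index] == tag
        if not hit:
            self.valid[index] = True
            self.tag[index] = tag
        return hit


def trace(start, count):
    """Fetch `count` sequential instructions; record M (miss) or H (hit)."""
    cache = ICacheModel()
    return "".join(
        "H" if cache.access(start + BYTES_PER_WORD * i) else "M"
        for i in range(count)
    )


pattern = trace(0x0, 18)
print(pattern)
```

For straight-line code starting at a line-aligned address, every eighth fetch should be a miss and the seven fetches after it should hit.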
You should read the caches combinationally, and you can
assume that like other combinational logic in our designs,
the access can happen in "zero time". For example,
the I$ is addressed by Iaddr. Whenever Iaddr changes,
your code may re-read the tag array of the I$ and assign
the result to a variable called iTag. This will happen
with 0-delay in our behavioral simulator. This allows
the I$ access to happen after the negedge of the clock
and still allow Iin to be flopped with the result (on
a hit) at the next posedge clock.
You should not use any delays in your code other than
the standard `TICK we have been using with flops. Aside
from having no physical meaning, your design wouldn't
work if I changed the number of ticks in a clock cycle.
You do not wait a certain amount of time by adding random
delays in your logic. You build a finite state machine
(FSM) that drives the appropriate outputs when it sees
the appropriate inputs. Your submitted design should work
if I change the MEM_DELAY or change the number of Verilog
ticks in the CLK cycle time. It will automatically work
as long as you never use any delays in your logic aside
from `TICK for the flop delay.
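As a rough illustration of the FSM approach, here is a Python sketch of a miss-handling state machine that waits on an input instead of a fixed delay. The state names (IDLE, FILL, RETRY) and the signals miss and mem_done are hypothetical, not taken from the lab files. The point is that the same next-state logic works for any memory delay.

```python
IDLE, FILL, RETRY = "IDLE", "FILL", "RETRY"  # hypothetical state names


def next_state(state, miss, mem_done):
    """Pure next-state function: depends only on the current state and inputs."""
    if state == IDLE:
        return FILL if miss else IDLE
    if state == FILL:
        return RETRY if mem_done else FILL
    return IDLE  # RETRY: re-access the cache once, then resume


def run_miss(mem_delay):
    """Simulate one miss; return the cycles spent before returning to IDLE."""
    state = next_state(IDLE, miss=True, mem_done=False)  # miss detected
    cycles = 0
    cycles_in_fill = 0
    while state != IDLE:
        cycles += 1
        if state == FILL:
            cycles_in_fill += 1
            # Memory model (assumption): asserts mem_done after mem_delay cycles.
            mem_done = cycles_in_fill >= mem_delay
        else:
            mem_done = False
        state = next_state(state, miss=False, mem_done=mem_done)
    return cycles


for delay in (1, 3, 10):
    print(delay, run_miss(delay))
```

In this model the controller spends mem_delay cycles in FILL plus one RETRY cycle, and nothing in next_state has to change when the delay does.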
As some of you have noticed, the Valid bits are initialized
to 0 but the Dirty bits are not. Consequently, on the
first miss to a line, your miss logic might find the existing
line to be Dirty but not Valid and mistakenly try to write
the line back to main memory. Be sure that your cache
controller only writes lines back that are Dirty and Valid.
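The writeback condition can be stated in a couple of lines. The following Python sketch uses a hypothetical helper name, needs_writeback, and models the uninitialized Dirty bit as simply being 1 on a line whose Valid bit is 0.

```python
def needs_writeback(valid, dirty):
    """Write the victim line back only if it is both Valid and Dirty."""
    return bool(valid and dirty)


# First miss to a line after reset: Valid was initialized to 0,
# Dirty was not initialized and may happen to read as 1.
first_miss = needs_writeback(valid=0, dirty=1)
# A line that was filled and later written by a store.
modified_line = needs_writeback(valid=1, dirty=1)
# A line that was filled but never written.
clean_line = needs_writeback(valid=1, dirty=0)

print(first_miss, modified_line, clean_line)
```

Checking Dirty alone would report a writeback in the first case, which is the mistake the hint warns about.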
Be sure that you are not combinationally reading the cache
all the time. You don't want to read during a miss (the
cache has only one port and part of the miss time you
will be writing to it). So you should read the tag and
state from the cache combinationally only if you are not
stalled or if it is the retry cycle.
mips.v expects that Addr is driven during the entire fill
cycle for I$ and D$ fills.
- Nov 22 slides of matrix multiplication
using systolic arrays.
- Nov 19 Project3 posted. Due
Dec. 1st.
- Nov 16 homework
posted. Due Nov. 22nd 11:59 p.m.
- Nov 15 The deadline for project2 is extended to 23:59
on Nov 16th. If you submit after the deadline but before
23:59 on Nov 17th, you will get a 50% late penalty. No submission
after 23:59 on Nov 17th will be accepted.
- Nov 11 Because the monroe server may be down Friday and Saturday
for a replacement power supply installation, the deadline for project2
is extended to 23:59 on Nov 15th.
If you submit after the deadline but before 23:59 on Nov 16th,
you will get a 50% late penalty. No submission after 23:59 on Nov
16th will be accepted.
- Nov 9 For those of you who got the "Missing README file" error,
please change the access permissions of your project with these commands:
- cd ~/
- chmod -R 755 cda4150
- Then submit your project
- After submission, use "chmod -R 700 cda4150" to
remove read permission from your project. Otherwise, others
can read your project and copy it!
- Nov 7 Submission of Project2
- First, the deadline has been
postponed to 23:59 on Nov 12th.
If you submit after the deadline but before 23:59 on Nov 13th,
you will get a 50% late penalty. No submission after 23:59 on
Nov 13th will be accepted.
- Please run "make clobber"
first. Then run "~hgao/cda4150/submit4150
lab2" to submit your project. If you have any problem
submitting it, please compress it and send the compressed file
to "droygao@gmail.com" before
the deadline.
- Nov 6 IMPORTANT INFORMATION
for Project2
- Please copy the new mips.v from ~hgao/cda4150 to your lab2
directory. Depending on subtle timing in your Verilog code it
turns out there was still a race condition that could generate
spurious writes to the memory system. I changed the memory system
to only examine the write signals halfway through the cycle,
to give time for everything to settle to their stable values.
For those of you who thought you had everything working but
your simulations were just looping forever, please grab this
file and give it a try. It will likely fix your problems.
- Hints
- The signal inDelaySlot should be entirely removed from
the design. The delay slot in lab2 falls out of the pipeline
structure, and unlike lab1 we do not need to do anything
special to create one. In addition the signals savedBranchPC
and instIsBranch should also be removed.
- savedBranchPC, inDelaySlot, and instIsBranch all go the
way of the State always block in the pipelined processor.
Bye Bye.
- If you get an "Illegal Instruction" error, your
PC fetch logic is most likely messed up. Your execution
actually starts in boot.o (you can disassemble it) and then
jumps to your main program. If you do not see the first
instructions in boot.o as the instructions you are executing,
there are problems with your fetch logic. Check to make
sure PC is negedge flopped. Be sure that PC1, PC2, PC3,
and PC4 are posedge flopped. Also be sure that PC1, PC2,
IR1, IR2, and PC do not change when decodeStall is high.
- Make sure PC chooses 1 of 4 possible values based on PCsel.
- Make sure you are fetching from the PC. Iaddr is the instruction
address sent to the memory system. Make sure you are incrementing
PC as well.
- Branches and jumps are resolved in EX. Therefore PC calculations
involving branches or jumps should use the PC of the instruction
in the EX stage (PC2) in their calculations.
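The PC hints above can be summarized as a 4-way mux with a stall hold. The Python sketch below is an illustration under stated assumptions: the PCsel encoding, the choice of the four sources (sequential, branch target, jump target, jump register), and every function and parameter name are hypothetical and not taken from the lab files. The branch and jump target arithmetic follows the standard MIPS definition and is computed from PC2, the PC of the instruction in EX.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical PCsel encoding; the lab's actual encoding may differ.
PC_PLUS_4, PC_BRANCH, PC_JUMP, PC_JUMP_REG = range(4)


def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def next_pc(pcsel, pc, pc2, imm16=0, target26=0, rs_value=0,
            decode_stall=False):
    """Select the next PC; pc2 is the PC of the instruction in EX."""
    if decode_stall:
        return pc  # PC must not change while decodeStall is high
    if pcsel == PC_PLUS_4:
        return (pc + 4) & MASK32
    if pcsel == PC_BRANCH:
        # Standard MIPS: target is relative to the delay slot (PC2 + 4).
        return (pc2 + 4 + (sign_extend16(imm16) << 2)) & MASK32
    if pcsel == PC_JUMP:
        upper = (pc2 + 4) & 0xF0000000
        return upper | ((target26 & 0x3FFFFFF) << 2)
    if pcsel == PC_JUMP_REG:
        return rs_value & MASK32
    raise ValueError("PCsel must select one of the four sources")


sequential = next_pc(PC_PLUS_4, pc=0x100, pc2=0xF8)
backward_branch = next_pc(PC_BRANCH, pc=0x108, pc2=0x100, imm16=0xFFFF)
jump = next_pc(PC_JUMP, pc=0x10000008, pc2=0x10000000, target26=0x40)
held = next_pc(PC_PLUS_4, pc=0x100, pc2=0xF8, decode_stall=True)
print(hex(sequential), hex(backward_branch), hex(jump), hex(held))
```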
- Nov 2 Please check these slides from Prof. Heinrich to see
the block diagram mentioned in project2.
- Nov 2 The submission command was wrong in the project2 pdf file.
Please check the latest file when you submit.
- Oct 22 --- Project2 posted. Due
11/8/05 11:59pm.
- Oct 18 Slides of "I/O performance"
posted.
- Oct 17 The deadline for Project1 is postponed
to 23:59:00 on Oct 21st.
- Please add "~heinrich/myusr/local/bin" in your PATH
of .cshrc. Edit your .cshrc accordingly, "source
.cshrc" and try it again.
- Oct 17 Here are some references from Prof. Heinrich's material
of CDA4150 Spring04:
- Oct 16 Project1 Submission
Please follow these instructions to submit your project1:
- First cd into your lab1 directory (cd ~yourid/cda4150/lab1).
- Edit your README file as described in the project file.
Here is an example:
GROUP: hgao1 hgao2
HONGLIANG GAO1
HONGLIANG GAO2
Lab1
- Then run "make clobber"
- At last run "~hgao/cda4150/submit4150 lab1"
- The due time is 23:59:00 on Oct 18th. You can resubmit your
project up to 5 times.
- If you have any problem submitting it, please compress your
files with "tar -czf yourid.tgz *", then email yourid.tgz
to droygao@gmail.com with your id as the subject.
- Oct 12
Quote from Prof. Heinrich's material:
Project 1 Hints -- How do I write an assembly language program
to test my processor (versus a C program)? If you like, you can
write assembly directly, but for this lab you will need to update
a few files. To write assembly language programs for Lab 1, follow
these steps:
cp ~hgao/cda4150/{Makefile,sample.s} ~/cda4150/lab1/test
That's it. To see a sample assembly file, look at the sample.s
file you just copied into your lab1/test directory. To assemble
and run it, in the test/ directory type make sample. Then you
can run test/sample through vcs. At this point you can create
whatever .s files you want, using sample.s as a template and assemble
and run them at will.
What does the following Verilog line do?:
loadData = (Din >> ((~Daddr & 32'h3) << 3)) & 32'hff;
Follow the logic through and think about what LBU needs to do given
the data address Daddr. Din is the aligned word value containing
the byte addressed by Daddr. This is a big-endian machine, so
the code above takes the inverse of the bottom two bits of
Daddr (the byte offset within the word) and shifts Din to
the right by that amount * 8 bits (that is, by 0,
1, 2, or 3 bytes). After isolating the byte you
want in the low-order 8 bits, we AND the result with 0xff so that
the final result has 24 leading zeroes and the correct low-order
8 bits of the proper byte (based on Daddr) within the word Din.
This is what LBU does! Note how LB is slightly different. LHU
and LH will also be similar....but different.
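To see the byte selection with concrete numbers, the Verilog line above can be mirrored in Python. This is a sketch for checking your understanding, not lab code: the function names lbu and lb are made up here, and lb shows one plausible reading of "slightly different" (sign extension of the selected byte, as the MIPS LB instruction is defined).

```python
MASK32 = 0xFFFFFFFF


def lbu(din, daddr):
    """Mirror of: loadData = (Din >> ((~Daddr & 32'h3) << 3)) & 32'hff"""
    shift = (~daddr & 0x3) << 3  # byte offset 0 -> shift 24, offset 3 -> shift 0
    return (din >> shift) & 0xFF


def lb(din, daddr):
    """Same byte selection, then sign-extend bit 7 into the upper 24 bits."""
    value = lbu(din, daddr)
    if value & 0x80:
        value |= 0xFFFFFF00
    return value & MASK32


word = 0x11F23344  # big endian: byte 0 is 0x11, byte 3 is 0x44
for offset in range(4):
    print(offset, hex(lbu(word, 0x1000 + offset)), hex(lb(word, 0x1000 + offset)))
```

Byte offset 0 selects the most significant byte of the word, which is what big-endian byte ordering requires.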
If for whatever reason you decide to edit the comparisons in
qc.v be cautioned that you should NOT use > or >= or <=
comparisons. This instantiates quite a bit of hardware (think
of what that function would do). In qc.v you should use tests
of equality (or inequality) and checks of the sign bit only. I
will discuss in class why the branch comparison logic is especially
latency-sensitive.
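The claim that equality tests and the sign bit are enough can be checked against the MIPS conditional branches, which compare two registers for equality or compare one register against zero. The Python sketch below is illustrative only: the function names are hypothetical and nothing here is taken from qc.v. Register values are modeled as unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF


def sign_bit(x):
    """Bit 31 of a 32-bit register value."""
    return (x >> 31) & 1


def is_zero(x):
    return (x & MASK32) == 0


def beq(rs, rt):
    return rs == rt


def bne(rs, rt):
    return rs != rt


def bltz(rs):
    return sign_bit(rs) == 1


def bgez(rs):
    return sign_bit(rs) == 0


def blez(rs):
    return sign_bit(rs) == 1 or is_zero(rs)


def bgtz(rs):
    return sign_bit(rs) == 0 and not is_zero(rs)


def to_signed(x):
    """Reference interpretation used only to check the results."""
    return x - (1 << 32) if x & 0x80000000 else x


samples = [0x00000000, 0x00000001, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
all_match = all(
    bltz(v) == (to_signed(v) < 0)
    and bgez(v) == (to_signed(v) >= 0)
    and blez(v) == (to_signed(v) <= 0)
    and bgtz(v) == (to_signed(v) > 0)
    for v in samples
)
print(all_match)
```

None of the branch functions uses a magnitude comparison; the reference check with to_signed is there only to confirm they agree with the signed interpretation.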
Here is an updated link to details on the origins of Big Endian
and Little Endian from Danny Cohen's famous article "On
Holy Wars and a Plea for Peace".
- Oct 10 --- sample .cshrc and .login
files from David M. Lyle. Based on the project's pdf file,
these two files contain the following changes:
change .cshrc to add "~heinrich/myusr/local/bin" and "/mcad/synopsys/vcs6.0/bin"
to PATH
change a line in .login (then you don't need to source .cshrc anymore):
In the test block for SunOS_ver == 5, change this:
setenv PATH .:/usr/local/bin:/usr/openwin/bin:/usr/dt/bin:/usr/local/hosts:/usr/ucb:/usr/bin:/bin:/usr/ccs/bin:/usr/etc
to this:
setenv PATH .:/usr/local/bin:/usr/openwin/bin:/usr/dt/bin:/usr/local/hosts:/usr/ucb:/usr/bin:/bin:/usr/ccs/bin:/usr/etc:$PATH
Thanks David!
- Oct 08 --- Tip for a common problem for
Project1
- Sep 28 --- Instructions on Cygwin and
virsims posted. virsims is a tool to show waveforms and help you
debug your verilog code.
- Sep 28 --- Project1 posted. Due
10/18/05 11:59pm.
- Sep 28 --- Solution of Project0 posted.
- Sep 28 --- Slides of Lecture 4 - "Vector Processing CRAY
like machines" posted.
- Sep 26 --- Notes on Computer Organization and Architecture by
Dr. Barry Wilkinson (read Lecture 6) posted.
- Sep 25 --- Tip for a common problem for
Project0
- Sep 21 --- Project0 posted.
- Sep 20 --- TA's office hours changed.
- Sep 19 --- Midterm exam is on Oct. 4th.
- Sep 19 --- Slides of Bus Architectures are posted.
- Aug 25 --- Syllabus and handout of interrupt handling are posted.
Syllabus
Handouts:
- Interrupt handling
- Flynn's Taxonomy
- Bus Architectures
- verilog_view, verilog_print
- Computer Organization and Architecture by
Dr. Barry Wilkinson (read Lecture 6)
- Lecture 4 - "Vector Processing
CRAY like machines"
- I/O performance
Projects
Page