Introduction
COP-3402

About me

  • Paul Gazzillo
  • Associate professor
  • Research areas
    • Software engineering
    • Programming languages
    • Security
  • https://paulgazzillo.com

Why study systems software?

Know your tools

newton_grinding.png

Newton's Notebook on Grinding Lenses

All technical workers need to be experts with their tools. Pilots go to flight school, and carpenters may even build their own tools.

When studying optics, Newton learned to create his own lenses. Here is an image from Newton's notebook on grinding lenses.

https://cudl.lib.cam.ac.uk/view/MS-ADD-04000/56

"A poor workman blames his tools." - proverb

"Tools are made for man, not man for tools." - me (?)

"Technology is made for man, not man for technology" - Aldous Huxley 1961, Thomas Merton 1966

"Don't use your [bare] hands, use a tool!" - my dad

  • Need to learn why the tools are useful to be effective
  • Don't just use a tool because you think it will magically make you do a better job
  • Modify your tools! We'll use open-source, configurable tools in this class

Course goals

Learn to (1) use and (2) create systems software

What is systems software?

The bridge between kernel and applications

applications.jpeg

Applications
Systems software
Kernel
Hardware

hardware_named.jpeg

Debates on what an operating system is

  • The Netscape/Microsoft antitrust lawsuits
  • The GNU/Linux naming argument (Stallman)
  • The shared history of GNU and Linux

https://developer.ibm.com/articles/l-linux-kernel/

https://makelinux.github.io/kernel/map/intro

Example systems software

  • Libraries: stdio, malloc, etc.
  • Programming toolchains: compilers (gcc), linkers (ld), etc.
  • Programming environments: building (make), versioning (git), etc.

Course goal: (1) use systems software

bash.png

Be conversant with the command line

The command line opens up the full power of the operating system

You can download and run many more applications that lack a GUI

You can perform powerful automation. For instance, in our lab we script experiments so that others can replicate our results for themselves.

The Linux kernel is over 20 million lines of code, has thousands of developers and maintainers, and is used in billions of devices.

You can develop for it, including submitting patches, entirely on the command-line, and many do

Among other things, the command line lets you build up sophisticated automation from simple tools. A *nix system gives you access to tons of tools, many of them standard, so you don't have to reinvent the wheel.

Basics of the Unix Philosophy

Lisp: Good News, Bad News, How to Win Big

Git-Logo-2Color.svg

Know how to use version control

If you are doing software development, you will almost certainly be using version control.

Version control not only makes large-scale collaboration on a single codebase possible, it also helps you as an individual to organize your code, debug, focus, and more.

file_system.svg

Understand the file system

("Where are my files?")

Too many students go through CS without a basic understanding of hierarchical file systems.

Modern graphical OSes hide much of it for convenience. But nearly all devices have some sort of file system.

The hierarchical file system was designed by people (you can see an explanation in the reading for next time), so don't take for granted that everyone just knows it. You need to learn it.

gnu-linux.png

Be familiar with GNU/Linux development toolchains

This classic toolchain is used for thousands of software tools: there are well-made tools for writing, building, testing, deploying, and sharing software.

Development can be done entirely on the command-line with this toolchain.

Working with command-line tools shows you the fundamental aspects of software development. If you can master these, you'll only be that much better at the GUIs and IDEs.

Together with the command line, this opens up a vast, powerful world of computing that allows you to assemble complex applications from existing pieces, like Legos.

Course goal: (2) build systems software

File-handling systems software

  • Working with the file abstraction
  • Manipulating the file system
  • Reading/writing files
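
A minimal sketch of these three ideas, written in Python only for brevity (the course projects may use a different language). The `os` functions used here are thin wrappers over the operating system's file calls. The helper names `write_then_read` and `demo`, and the file names, are made up for illustration.

```python
import os
import tempfile


def write_then_read(directory, name, text):
    """Create a file in `directory`, write `text` to it, and read it back."""
    path = os.path.join(directory, name)

    # The file abstraction: os.open() returns a file descriptor, a small
    # integer that names the open file for later read/write/close calls.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # Assumption: a short write to a regular file completes in one call.
        os.write(fd, text.encode("utf-8"))
    finally:
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size          # ask the file system for metadata
        data = os.read(fd, size)
    finally:
        os.close(fd)
    return data.decode("utf-8")


def demo():
    """Create a directory, write and read a file, then rename the file."""
    with tempfile.TemporaryDirectory() as top:
        subdir = os.path.join(top, "notes")
        os.mkdir(subdir)                      # manipulate the file system
        text = write_then_read(subdir, "hello.txt", "hello, files\n")
        before = sorted(os.listdir(subdir))
        os.rename(os.path.join(subdir, "hello.txt"),
                  os.path.join(subdir, "renamed.txt"))
        after = sorted(os.listdir(subdir))
    return text, before, after


if __name__ == "__main__":
    print(demo())
```

The sketch works inside a temporary directory so it leaves nothing behind. Real code would also need to handle partial writes and missing files.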

Process management

  • Creating processes
  • Executing programs
  • Manipulating I/O
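
A rough sketch of how the three pieces fit together, assuming a Unix-like system (it relies on `os.fork`, which is not available everywhere). It is written in Python only for brevity, and the function name `run_and_capture` is made up for illustration. The parent creates a child process, the child redirects its standard output into a pipe and then executes another program, and the parent reads what the child wrote.

```python
import os
import sys


def run_and_capture(argv):
    """Run `argv` in a child process; return (exit_status, captured_stdout)."""
    read_end, write_end = os.pipe()

    pid = os.fork()                      # creating a process
    if pid == 0:
        # Child: only reached in the new process.
        try:
            os.close(read_end)
            os.dup2(write_end, 1)        # manipulating I/O: stdout -> pipe
            os.close(write_end)
            os.execvp(argv[0], argv)     # executing a program (no return on success)
        finally:
            # Only reached if something above failed. 127 is the exit
            # status shells conventionally use for "command not found".
            os._exit(127)

    # Parent: close the unused write end so read() can see end-of-file.
    os.close(write_end)
    chunks = []
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_end)

    _, status = os.waitpid(pid, 0)       # wait for the child to finish
    exit_status = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return exit_status, b"".join(chunks)


if __name__ == "__main__":
    print(run_and_capture([sys.executable, "-c", "print('hello from child')"]))
```

Closing the write end in the parent matters: if the parent kept it open, the read loop would never see end-of-file.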

Compilers

  • Work with language processors
  • Generate machine code
  • Interoperate with existing machine code standards

Ulterior educational motive

  • Make everyone a better programmer

The tortoise and the hare

tortoise_and_hare.jpeg

tortoise.jpeg

The tortoise and the hare are both given a challenging programming project.

hare.jpeg

The hare immediately starts coding.

After all, we are finished when the coding is done, so the faster we write code, the faster we finish.

tortoise.jpeg

The tortoise starts with comments, planning, and tests before even a single line of code gets written.

hare.jpeg

The hare finishes coding before the tortoise even starts.

But now the real work begins….

hare.jpeg

The hare runs a large test given by the professor.

It breaks the program of course.

hare.jpeg

The hare starts guessing and changing code that might be broken.

hare.jpeg

But now another test is breaking.

Fixing that bug breaks the first test.

hare.jpeg

If the hare is lucky, all given tests work eventually.

hare.jpeg

But the hare is surprised at a bad grade, because new tests written for grading also break.

tortoise.jpeg

The tortoise, meanwhile, starts by breaking the problem down into simpler pieces.

tortoise.jpeg

He writes some simple examples to illustrate the problem.

tortoise.jpeg

He takes an easy piece of the problem and writes comments about what the code should do.

Only then does he code, little by little, testing along the way.

tortoise.jpeg

The tortoise makes sure simple things work properly before moving on.

tortoise.jpeg

He doesn't take for granted that the code just works.

He knows debugging is really hard, especially when there are lots of bugs together.

This is what the hare found out the hard way when new fixes broke old tests.

tortoise.jpeg

The tortoise took way longer to get a line of code written.

But when he got there, the code was much easier to write and write correctly.

  • illustration, race track (end goal, correct program)
  • hare: coding phase short, debugging and testing very long
  • tortoise: planning and testing are long, coding very short, brief new testing

tortoise.jpeg

While the hare felt like he was "coding" fast, the tortoise was calmly

  • writing tests
  • writing comments on what he was doing, and
  • writing a few lines of carefully-thought-through code at a time.

tortoise.jpeg

When he runs all the tests from the professor, they almost all work.

For the ones that don't, he just isolates the problem with a minimized test.

tortoise.jpeg

His code is well-commented, simple, and organized, so he has no issue finding where in the code the problem is and he fixes it.

His grade is good, because even unseen tests are likely to work.

Be the tortoise.

Caveat: there are absolutely times to be the hare, such as hacking a simple bash script, getting something done fast, or writing small or inconsequential programs. But really hard problems, where the code has to be correct (so a plane doesn't crash), need you to be the tortoise.

How to be the tortoise

  • Know how your programming language works, don't guess what your program is doing
  • Know your programming environment well, develop fast workflows
  • Break the problem down before coding, use abstractions (like functions) to organize your code into pieces
  • Make sure simple stuff works before moving on
  • Write your own tests, and save them (if you overwrite your test file, you've lost an opportunity to retest!)
  • Develop good debugging skills

We'll have lots of opportunities to practice these principles during this class.

kernighan.jpeg

Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?

Brian Kernighan, The Elements of Programming Style, 2nd edition, chapter 2

What's a bug?

What do you think?

What a bug is not (usually)

  • The program does something wrong

What a bug is

  • A program that does exactly what its code tells it to do, but not what we think it does

A bug is the gap between what the programmer thinks the program does and what it actually does

Debugging schema

  1. Narrow down the problem by crafting a smaller version of the test case
  2. Go to the part of your code that is related to the problem
  3. Trace step-by-step what your code really does (not what you think/hope/feel/guess/divine/intuit/reckon it does)

Don't start hacking code! First understand the problem.

Once you see the discrepancy between actual code behavior and what you want the code to do, the fix will likely be readily apparent (at least in this class's projects).
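
The first step of the schema can be sketched in code. This is an illustration only: `buggy_max` is a made-up function with a planted bug, and `minimize` is a simple greedy reduction (drop one element at a time while the failure still reproduces), not a full test-case minimization algorithm.

```python
def buggy_max(values):
    """Hypothetical buggy function: meant to return the largest value."""
    best = 0                     # planted bug: assumes some value is >= 0
    for v in values:
        if v > best:
            best = v
    return best


def fails(values):
    """True when buggy_max disagrees with the built-in max."""
    return buggy_max(values) != max(values)


def minimize(values, fails):
    """Greedily drop elements while the failure still reproduces."""
    current = list(values)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            # Keep the smaller input only if it is non-empty and still fails.
            if candidate and fails(candidate):
                current = candidate
                changed = True
                break
    return current


if __name__ == "__main__":
    big_failing_input = [-7, -3, -12, -5, -9]
    print(minimize(big_failing_input, fails))
```

With a one-element failing input, step 3 becomes easy to do by hand: `best` starts at 0, the single negative value is never greater than 0, so the function returns 0. The gap between "starts at the first element" (what we thought) and "starts at 0" (what the code does) is the bug.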

Goal: make the implementation match the specification

  • Specification: what we want our program to do
  • Implementation: what our program really does
  • Easier said than done!
  • During implementation, the incomplete implementation never matches the specification!

(Diagram)

  • Specification -> Programmer -> Implementation
    • (Green highlight entire spec, green highlight part of impl, but mostly red)
    • If I try to tackle the whole problem at once, my program is always wrong until I've finished coding the entire specification

Hard way: write whole program, check once done

"Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?" –Brian Kernighan, "The Elements of Programming Style", 2nd edition, chapter 2.

Sounds easy. Seems easy at first. Quickly snowballs.

The more you add to your code, the more combinations of behavior there are to consider. It's exponential!

Think about debugging code that has 3 if-then-else statements vs 10.
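
To put numbers on this comparison, here is a small sketch. It assumes the if-then-else statements run one after another and that each one independently takes one of two branches. The function name `count_paths` is made up for illustration.

```python
from itertools import product


def count_paths(num_ifs):
    """Count branch combinations by enumerating every true/false choice."""
    return sum(1 for _ in product([True, False], repeat=num_ifs))


if __name__ == "__main__":
    for n in (3, 10):
        print(n, "if-then-else statements:", count_paths(n), "paths")
```

Under that assumption, 3 statements give 2^3 = 8 possible paths through the code, while 10 statements give 2^10 = 1024.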

Lazy way: start with a narrower specification

  • Easier to get the implementation right
  • Gradually expand the specification
  • Keep the implementation correct at each step

I will write a "hello, world!" program as the first step.
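
As a sketch of expanding a specification gradually, consider a made-up task: evaluating sums such as "10 + 20 + 30". All names below (`evaluate_v1`, `evaluate_v2`, `check`, and the test lists) are hypothetical. The first step handles only a single integer. The second step widens the specification and reruns the saved step-1 tests to confirm nothing broke.

```python
def evaluate_v1(text):
    """Step 1 specification: the input is a single integer literal."""
    return int(text.strip())


def evaluate_v2(text):
    """Step 2 specification: integer literals joined by '+'."""
    # Reuses the step-1 code, which is already known to work.
    return sum(evaluate_v1(term) for term in text.split("+"))


# Saved tests: the step-1 tests are kept and rerun at every later step.
STEP1_TESTS = [("7", 7), (" 42 ", 42)]
STEP2_TESTS = [("1+2", 3), ("10 + 20 + 30", 60)]


def check(evaluate, tests):
    """Return True when `evaluate` gives the expected result for every test."""
    return all(evaluate(text) == expected for text, expected in tests)


if __name__ == "__main__":
    print(check(evaluate_v1, STEP1_TESTS))
    print(check(evaluate_v2, STEP1_TESTS + STEP2_TESTS))
```

At each step the implementation is complete and correct for the current, narrower specification, so any new failure points at the code that was just added.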

top-down vs bottom-up

Constructively lazy (one of the virtues!)

Divide-and-conquer: break the problem down into smaller parts

  • Can't keep all code in mind for all time
  • Make it easier for yourself
  • Delay gratification
    • Finishing code fast feels good
    • Debugging shoddy code feels horrible

(Diagram)

  • Specification -> Programmer -> Implementation
    • (Divide and green highlight part of spec, divide and green highlight part of impl; then take one more piece of spec, green highlight corresponding impl, with a small part being red)
    • But if I divide up the specification, I can focus on getting a simpler, smaller (sub)program done, which makes debugging easier and makes the code more likely to be correct once I move on to another part of the specification.
  • Debugging is made simpler, since there are fewer (likely) things that could be wrong

Use functions and separate compilation to (loosely) organize code into divisions of the problem.
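
A small sketch of organizing code by divisions of the problem, using a made-up task (finding the most frequent word in some text). The function names are hypothetical, and Python functions stand in here for what would be separate functions and source files in a compiled language. Each piece can be tested on its own before the pieces are combined.

```python
def tokenize(line):
    """Piece 1: split one line into lowercase words."""
    return line.lower().split()


def count_words(lines):
    """Piece 2: combine tokenize() results into a word -> count table."""
    counts = {}
    for line in lines:
        for word in tokenize(line):
            counts[word] = counts.get(word, 0) + 1
    return counts


def most_common(counts):
    """Piece 3: pick the most frequent word (ties broken alphabetically)."""
    return min(counts, key=lambda word: (-counts[word], word))


if __name__ == "__main__":
    lines = ["the cat", "The dog"]
    print(most_common(count_words(lines)))
```

If the final answer is wrong, each piece can be checked separately, which narrows down where the problem can be.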

Spend time now, save time debugging later

  • Breaking problems down takes time, experience, and making mistakes
  • Premature generalization can waste work
  • Don't be afraid to refactor
    • Easier to reorganize code after prototyping than to write from scratch (for large programs)

Instead of using cognitive energy to keep the whole program in your head and debug the whole program each time there is an issue, use cognitive energy to break the problem down into simpler parts and reason about how to combine them correctly

Within each phase, try to break the problem down further yourself, and make each piece work on its own, then work with other pieces gradually.

Biggest takeaways:

  • Real definition of what a bug is
  • Revisit your code
    • Refactor to match specifications (usually easier to refactor than to start from scratch)
    • Take time to make good interfaces
    • Less stressful if breaking down the problem first

Wirth’s Stepwise Refinement

Three virtues of a great programmer

https://web.archive.org/web/20211014194234/http://threevirtues.com/

"According to Larry Wall(1), the original author of the Perl programming language, there are three great virtues of a programmer; Laziness, Impatience and Hubris

  1. Laziness: The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful and document what you wrote so you don't have to answer so many questions about it.
  2. Impatience: The anger you feel when the computer is being lazy. This makes you write programs that don't just react to your needs, but actually anticipate them. Or at least pretend to.
  3. Hubris: The quality that makes you write (and maintain) programs that other people won't want to say bad things about."

Differences from prior courses

Fall 2024 -> Spring 2025

  • Tweaks to course that was redesigned last semester
    • Got better feedback than any previous offering
  • Consolidated content on files and processes into a single section each
    • Removes duplication, frees time for harder content
    • More time on systems programming, especially processes
    • Smaller set of class modules
  • Various improvements to course material and explanations
    • Based on last semester's timing and student questions
  • Now we get to the command-line and git faster
    • Gets students prepared earlier for required tools
  • Compiler project now in C/C++ (instead of Python)
    • Native support on eustis (no more pipenv)
    • Less uncertainty in having to learn Python
  • Separated "exercises" from "projects"
    • Projects are long-term implementation
    • Exercises are simple tasks to learn course tools (git and an ANTLR exercise)
  • Early git exercise (to get students better prepared to use it)
    • Critical to being able to submit projects
  • New systems software projects (files and processes)
  • Consolidated compiler projects (3 instead of 4)
  • Improved compiler project descriptions and grading
  • More in-class examples
  • Separated homework exercises from readings
    • Readings are listed separately
    • Homeworks include more exercises
  • In-class questions can be asked in edstem to reduce tangents in lectures
  • Attendance now a (small) part of the grade
  • UCF Here for attendance
    • Faster attendance taking for exams
  • Exams using Webcourses quizzes
    • Instant grading for some questions
    • Easier interface

Still getting comments about interruptive questions, tangents

Spring 2024 -> Fall 2024

  • Course redesign
  • Expanded content on core systems software (file systems, command-line, programming environment)
  • Broader range of topics
  • Easier programming projects, but more of them
  • Late project submission still allowed, but only after making a first submission on time

Changes from my last offering (2020 -> Spring 2024)

  • Competency questions for each lecture
    • Will have homework questions each week (graded for effort)
    • Final will have similar questions
  • Staying on topic, fewer tangents, leaving student discussions outside of lectures
    • 1 hr lectures, 15 min of questions and discussion per session
  • 2020 offering was quite positive
  • Two comments for improvements
    • I get easily distracted by questions
    • Specific questions at end of course to focus students on important takeaways

Changes from my last offering (2019 -> 2020)

  • More in-class coding and demos
  • Lexer/parser given to you
  • More breathing room to cover complex topics
  • Students liked in-class coding, so doing more of that
    • Toy compiler, code together
    • Submit your copied version
  • Students liked learning systems tools, particularly git
    • Need more instruction on using the command line (since I expect it)
    • Devoting part of the course material to it
  • Students did not like the lack of detail on theoretical concepts

Syllabus overview

Author: Paul Gazzillo

Created: 2025-01-10 Fri 15:59
