COP-5621, Spring 2024

What do you want to get out of this course?

  • Intros
  • Why are you taking this course?
  • What do you hope to learn?
  • about me, research, positions
    • favorite course
    • program analysis and transformation, tool engineers
    • use course project to gauge interest

What I hope you will get out of this course

  • Deeper understanding of compiler techniques
  • See how compiler techniques are used in research

Symbols vs. meaning

1 + 2 * 3

6 / 2(1 + 2)
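
One way to see that the grouping comes from the language definition, not from the symbols themselves, is to ask a real parser. The sketch below uses Python's standard `ast` module; the helper name `tree` is made up for illustration. Python has no implied multiplication, so the second expression has to be rewritten with an explicit `*`, and each possible reading is spelled out separately.

```python
import ast

def tree(expr):
    """Return the parse tree that Python's grammar assigns to an expression."""
    return ast.dump(ast.parse(expr, mode="eval").body)

# Python's grammar gives * higher precedence than +, so 2 * 3 is grouped first.
default_grouping = tree("1 + 2 * 3")
multiply_first = tree("1 + (2 * 3)")
add_first = tree("(1 + 2) * 3")
print(default_grouping == multiply_first)  # same tree
print(default_grouping == add_first)       # different tree

# "6 / 2(1 + 2)" has two common readings; Python needs the * written out.
# In Python, / and * have equal precedence and group left to right.
left_to_right = 6 / 2 * (1 + 2)         # (6 / 2) * 3
implied_mult_first = 6 / (2 * (1 + 2))  # 6 / (2 * 3)
print(left_to_right, implied_mult_first)
```

A different language could pick the other grouping for the same characters, which is the sense in which the language processor gives the symbols their meaning.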


What gives symbols meaning?

  • the compiler itself defines meaning
  • defines input language in terms of output language
  • (why does output language have meaning?)

ASCII vs. machine code

What is a compiler?

  • What do you think?

Classic phases of a compiler


Dragon book


  • Language-processing
    • Remember regexes and grammars from discrete math? (If you took it)
  • Type checking
    • Did I assign an array to a float?


  • Code generation
    • Equivalent machine code for any valid input program
  • Optimization
    • Choose equivalent machine code that's faster, smaller, etc.


  • Intermediate code
    • Like machine code
    • Without machine specifics
    • Lots of different designs
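
As a rough illustration of "like machine code, without machine specifics", the sketch below lowers an arithmetic expression into three-address code, one of the many possible intermediate designs. The function name `to_three_address` and the `tN` temporary naming are assumptions made for this example; it borrows Python's own parser as the front end.

```python
import ast
from itertools import count

# Only these binary operators are supported in this sketch.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def to_three_address(expr):
    """Lower an arithmetic expression to 'tN = a op b' instructions."""
    code = []
    temps = count(1)

    def lower(node):
        # Constants and variable names are already simple operands.
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left = lower(node.left)
            right = lower(node.right)
            temp = f"t{next(temps)}"  # unlimited temporaries, no real registers
            code.append(f"{temp} = {left} {OPS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    lower(ast.parse(expr, mode="eval").body)
    return code

for line in to_three_address("b * b - 2"):
    print(line)
```

Each instruction performs a single operation, as machine code would, but nothing says how many registers exist or which instructions the target supports.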

Why an intermediate representation?

Multiple front-ends and back-ends


What makes compiler-writing hard?

Compilers take an entire program as input

Thinking about three programs at once

  • Input program
  • Output program
  • Compiler itself

Interpreters vs. Compilers

Compilers translate

They do not give you the program's output


  • Input: program
  • Output: (equivalent) program

Interpreters execute

They do give you the program's output


  • Input: program
  • Output: value
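
The contrast can be sketched on a tiny expression language. Everything below is illustrative: `interpret`, `compile_to_stack`, and `run_stack` are hypothetical names, the stack machine is invented for the example, and only `+` and `*` on constants are handled.

```python
import ast

def interpret(node):
    """Interpreter: program in, value out."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        left, right = interpret(node.left), interpret(node.right)
        if isinstance(node.op, ast.Add):
            return left + right
        if isinstance(node.op, ast.Mult):
            return left * right
    raise ValueError("unsupported expression")

def compile_to_stack(node):
    """Compiler: program in, equivalent program out (for a made-up stack machine)."""
    if isinstance(node, ast.Constant):
        return [("push", node.value)]
    if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
        op = "add" if isinstance(node.op, ast.Add) else "mul"
        return compile_to_stack(node.left) + compile_to_stack(node.right) + [(op, None)]
    raise ValueError("unsupported expression")

def run_stack(program):
    """Execute the stack-machine program; this step is separate from compiling."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "add" else left * right)
    return stack.pop()

source = ast.parse("1 + 2 * 3", mode="eval").body
value = interpret(source)          # a value
target = compile_to_stack(source)  # another program
print(value)
print(target)
print(run_stack(target))
```

The compiler never computes the result itself; the value only appears once something runs the translated program.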

Single execution vs. all executions

Example program

x = input()
a = 5
b = a - 3
if x == "yes":
  b = b * b - 2

Example execution


  • input: yes
  • output: 2 (the final value of b: 2 * 2 - 2)
  • one input represents one execution trace (in deterministic programs)
    • trace: one sequence of state changes, steps through the program
  • are traces always unique?
  • is a trace always finite?
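
To make "trace" concrete, here is a hand-instrumented version of the example program that records the state after every step. The function name `run_with_trace` and the choice to store each step as a (statement, state) pair are assumptions for this sketch; the input is passed as an argument instead of being read with `input()`.

```python
def run_with_trace(x):
    """Run the example program on input x, recording the state after each step."""
    trace = []
    state = {"x": x}
    trace.append(("x = input()", dict(state)))

    state["a"] = 5
    trace.append(("a = 5", dict(state)))

    state["b"] = state["a"] - 3
    trace.append(("b = a - 3", dict(state)))

    # Evaluating the condition is a step even when the branch is not taken.
    trace.append(('if x == "yes"', dict(state)))
    if state["x"] == "yes":
        state["b"] = state["b"] * state["b"] - 2
        trace.append(("b = b * b - 2", dict(state)))
    return trace

yes_trace = run_with_trace("yes")
no_trace = run_with_trace("no")
for statement, state in yes_trace:
    print(statement, state)
```

Running it on "yes" and "no" gives traces of different lengths, and two different non-"yes" inputs follow the same statements while differing in the recorded value of `x`.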

All executions


  • set of all traces
  • can we compute them all?
  • is the set of all traces always finite?
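
A small sketch of why the set of traces need not be finite. The countdown loop here is a hypothetical program, not one from these notes: each input gives a finite trace, but there is one trace per input and no largest input, so the traces can only be enumerated lazily.

```python
from itertools import islice

def trace_countdown(n):
    """Trace of a made-up program: while n > 0: n = n - 1."""
    trace = [n]
    while n > 0:
        n -= 1
        trace.append(n)
    return trace

def all_traces():
    """Yield one trace per non-negative input; this generator never finishes."""
    n = 0
    while True:
        yield trace_countdown(n)
        n += 1

# We can look at any finite prefix of the set, but never the whole set.
first_four = list(islice(all_traces(), 4))
print(first_four)
```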

Compilers must consider all (possible) traces

  • Why?
  • What about impossible traces?
    print("hello, world!")
    while True:
      print("hello, world!")
    • what can we do with impossible traces?
    • can we always determine impossible traces?
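
One thing a compiler can do with traces it knows are impossible is drop the code that only those traces would reach. Deciding this exactly for every program is undecidable in general, so real tools settle for conservative checks. The sketch below is one such check, written with Python's standard `ast` module; the function name and the sample source are hypothetical. It only flags top-level statements that follow a `while True:` loop containing no `break` at all.

```python
import ast

def unreachable_after_infinite_loop(source):
    """Return line numbers of top-level statements that can never execute."""
    body = ast.parse(source).body
    for i, stmt in enumerate(body):
        is_while_true = (
            isinstance(stmt, ast.While)
            and isinstance(stmt.test, ast.Constant)
            and stmt.test.value is True
        )
        # Conservative: any break anywhere inside the loop makes us give up.
        has_break = any(isinstance(n, ast.Break) for n in ast.walk(stmt))
        if is_while_true and not has_break:
            return [later.lineno for later in body[i + 1:]]
    return []

SAMPLE = '''\
print("before")
while True:
    print("looping")
print("after")
'''

print(unreachable_after_infinite_loop(SAMPLE))
```

The check can miss unreachable code (for example, after a loop whose condition is always true but not literally `True`), yet it never reports reachable code as dead, which is the safe direction for a compiler.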

Example: translate to machine code

x = input()
a = 5
b = a - 3
if x == "yes":
  b = b * b - 2


  • pick any machine-code-like language (finite registers, single arithmetic operations, branches, etc.)
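
One possible answer, as a sketch only. The instruction set below (`in`, `li`, `subi`, `mul`, `bne`, `halt`) and the four registers are invented for this example, and comparing a register directly against the string "yes" is a simplification a real machine would not offer. A small simulator is included so the translation can be checked against the source program.

```python
# Hand translation of the example program for a made-up register machine.
# Each instruction does one thing; the branch target is an instruction index.
PROGRAM = [
    ("in",   "r0"),            # 0: x = input()
    ("li",   "r1", 5),         # 1: a = 5
    ("subi", "r2", "r1", 3),   # 2: b = a - 3
    ("bne",  "r0", "yes", 6),  # 3: if x != "yes", skip the if body
    ("mul",  "r3", "r2", "r2"),  # 4: t = b * b
    ("subi", "r2", "r3", 2),   # 5: b = t - 2
    ("halt",),                 # 6: done
]

def run(program, user_input):
    """Simulate the hypothetical machine and return the final registers."""
    regs = {}
    pc = 0
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "in":
            regs[args[0]] = user_input
        elif op == "li":
            regs[args[0]] = args[1]
        elif op == "subi":
            dst, src, imm = args
            regs[dst] = regs[src] - imm
        elif op == "mul":
            dst, lhs, rhs = args
            regs[dst] = regs[lhs] * regs[rhs]
        elif op == "bne":
            reg, value, target = args
            if regs[reg] != value:
                pc = target
        elif op == "halt":
            return regs
        else:
            raise ValueError(f"unknown instruction: {op}")

print(run(PROGRAM, "yes"))
print(run(PROGRAM, "no"))
```

Note that `b * b - 2` needed two instructions and an extra register, and that the `if` became a branch on the negated condition. The translation has to be correct for every input, not only for "yes".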


