(4) Compiler-Compilers, SimpleC
COP-3402, Spring 2024

Compiler architecture review

(Diagram: compiler.jpg)

Dragon book

Toy compiler

  • Lexer
  • Parser
  • Code generator
  • (Diagram)

lexer.c, ast.c, parser.c, codegen.c -> compiler.exe

The main thing to understand: our compiler writes out assembly code; it does not run the program.

Compiler-compilers: automating compiler generation

  • Specify tokens and grammar
  • Compiler-compiler automatically generates lexer and parser
  • (Diagram)

(lexer.l -> lexer.c), ast.c, (parser.y -> parser.c), codegen.c -> compiler.exe

  • We saw the details of the various parts of the compiler
  • We saw how to make the specification more specific, systematic, and rigorous
  • It turns out that we can use this specification to generate those parts of the compiler
  • lexer.l -> lexer.c, parser.y -> parser.c; lexer.c, parser.c, codegen.c -> compiler.exe

Lexing

  • Toy compiler's lexer.c
    • looped over each character
    • matched characters to tokens
    • created token data structure

The code handles file I/O, reads the input one character at a time, and groups characters into tokens (number, operator, identifier, etc.)
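
The character-by-character loop described above can be sketched as follows. This is a minimal illustration, not the toy compiler's actual lexer.c: the names token_kind, token, and next_token are hypothetical, and the sketch reads from an in-memory string rather than doing file I/O.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds; the toy compiler's real names may differ. */
enum token_kind { TOK_NUMBER, TOK_IDENT, TOK_OPERATOR, TOK_EOF, TOK_ERROR };

struct token {
  enum token_kind kind;
  const char *start; /* first character of the token in the input */
  size_t length;     /* number of characters in the token */
};

/* Read one token starting at *cursor and advance *cursor past it. */
struct token next_token(const char **cursor) {
  const char *p = *cursor;
  struct token tok;

  while (isspace((unsigned char)*p)) p++; /* skip whitespace */

  tok.start = p;
  if (*p == '\0') {
    tok.kind = TOK_EOF;
  } else if (isdigit((unsigned char)*p)) {
    tok.kind = TOK_NUMBER;
    while (isdigit((unsigned char)*p)) p++; /* consume all digits */
  } else if (isalpha((unsigned char)*p) || *p == '_') {
    tok.kind = TOK_IDENT;
    while (isalnum((unsigned char)*p) || *p == '_') p++;
  } else if (*p == '+' || *p == '-' || *p == '*' || *p == '/') {
    tok.kind = TOK_OPERATOR;
    p++;
  } else {
    tok.kind = TOK_ERROR; /* unknown character: report it and move on */
    p++;
  }
  tok.length = (size_t)(p - tok.start);
  *cursor = p;
  return tok;
}
```

Calling next_token repeatedly on "x1 + 42" would yield an identifier, an operator, a number, and then the end-of-input token. Every branch is hand-written, which is the work a lexer generator takes over.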

Automatically generating lexers

Have you ever seen regular expressions or wildcards?

The flex tool

  • Specify pattern for each token
  • Provide code snippet for action to take on pattern
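
To make the pattern-plus-action idea concrete, here is a rough sketch in plain C using POSIX regex.h. It is not how flex works internally (flex compiles all patterns into a single automaton ahead of time), and the rule table, token codes, and match_token are hypothetical names chosen for illustration. The sketch does mimic flex's matching policy: prefer the longest match and, on a tie, the rule listed first.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical token codes; 0 is reserved for "no rule matched". */
enum { T_WHILE = 1, T_NUMBER, T_IDENT, T_PLUS };

/* Hypothetical rule table: one pattern per token, in priority order. */
struct rule {
  const char *pattern; /* POSIX extended regex, anchored at the start */
  int token;           /* token code to return when the pattern wins */
};

static const struct rule rules[] = {
  { "^while",                  T_WHILE  },
  { "^[0-9]+",                 T_NUMBER },
  { "^[A-Za-z_][A-Za-z0-9_]*", T_IDENT  },
  { "^\\+",                    T_PLUS   },
};

/* Try every rule at the start of `input`. Keep the longest match; on a
   tie the earlier rule wins because only strictly longer matches replace
   the current best. Stores the match length in *length and returns the
   token code, or 0 if nothing matched. */
int match_token(const char *input, size_t *length) {
  int best_token = 0;
  size_t best_length = 0;
  for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
    regex_t re;
    regmatch_t m;
    /* Recompiling on every call is wasteful; it keeps the sketch short. */
    if (regcomp(&re, rules[i].pattern, REG_EXTENDED) != 0) continue;
    if (regexec(&re, input, 1, &m, 0) == 0 &&
        (size_t)m.rm_eo > best_length) {
      best_length = (size_t)m.rm_eo;
      best_token = rules[i].token;
    }
    regfree(&re);
  }
  *length = best_length;
  return best_token;
}
```

With this table, "while (i)" matches the keyword rule, whereas "whilex" matches the identifier rule because the identifier pattern consumes one more character.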

Toy compiler's lexer.l

Parsing

  • Toy compiler's parser.c
    • called lex() to get tokens one-at-a-time
    • matched syntax tree recursively
    • created AST nodes for each language construct
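
A hand-written parser of this kind has one function per grammar construct, and each function builds an AST node for what it recognized. The sketch below is an assumption-laden miniature, not the toy compiler's parser.c: the node struct and the parse_* functions are hypothetical, the grammar covers only numbers, +, *, and parentheses, and it reads characters directly instead of calling lex(). Error handling is omitted.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical AST node: a number leaf or a binary '+' / '*' node. */
struct node {
  char op;    /* '+', '*', or 'n' for a number leaf */
  int value;  /* used only when op == 'n' */
  struct node *left, *right;
};

static const char *input; /* remaining characters to parse */

static struct node *new_node(char op, int value,
                             struct node *l, struct node *r) {
  struct node *n = malloc(sizeof *n);
  if (n == NULL) abort();
  n->op = op; n->value = value; n->left = l; n->right = r;
  return n;
}

/* Skip whitespace and return the next character without consuming it. */
static char peek(void) {
  while (isspace((unsigned char)*input)) input++;
  return *input;
}

static struct node *parse_expr(void);

/* factor -> NUMBER | '(' expr ')' */
static struct node *parse_factor(void) {
  if (peek() == '(') {
    input++;                       /* consume '(' */
    struct node *inner = parse_expr();
    if (peek() == ')') input++;    /* consume ')' */
    return inner;
  }
  int value = 0;
  while (isdigit((unsigned char)*input))
    value = value * 10 + (*input++ - '0');
  return new_node('n', value, NULL, NULL);
}

/* term -> factor ('*' factor)* */
static struct node *parse_term(void) {
  struct node *left = parse_factor();
  while (peek() == '*') {
    input++;
    left = new_node('*', 0, left, parse_factor());
  }
  return left;
}

/* expr -> term ('+' term)* */
static struct node *parse_expr(void) {
  struct node *left = parse_term();
  while (peek() == '+') {
    input++;
    left = new_node('+', 0, left, parse_term());
  }
  return left;
}

/* Entry point: parse a whole expression from `source`. */
struct node *parse(const char *source) {
  input = source;
  return parse_expr();
}
```

The grammar comments above each function are the part a parser generator takes as input; the function bodies are what it would generate for us.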

Automatically generating parsers

The labs introduced grammars.

The bison tool

  • Specify grammar of each construct
  • Provide code snippet for action to take on each construct

Toy compiler's parser.y

bison specification in detail

Revisiting Makefiles

Example: Makefile from toy compiler 2

Generate lexer and parser first

  • Build dependency graph
  • (Diagram)

See the dependency tree defined by the Makefile

See how the files are generated in order to ensure dependencies are met

Toy compiler with flex and bison

  • Update the parser and lexer

Toy compiler with a generated lexer and parser

Source code

https://github.com/cop3402/toy2

Compare and contrast the lexer, parser, and code generator

Lexer and parser are generated from specification files (no need to program them in C)

The code generator works the same way in both versions: it follows the parse tree, which both versions of the compiler build identically.
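
As a rough sketch of what following the tree means, the function below walks an expression tree in post-order and appends assembly text to a buffer. Nothing is evaluated: the output is text for an assembler, which is the sense in which a compiler writes out code rather than running the program. The expr struct, gen_expr, and the stack-based x86-64 (AT&T syntax) instruction sequence are assumptions for illustration and may not match the toy compiler's codegen.c.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical expression tree: a number leaf or a binary '+' / '*'. */
struct expr {
  char op;    /* '+', '*', or 'n' for a number leaf */
  int value;  /* used only when op == 'n' */
  const struct expr *left, *right;
};

/* Append assembly for `e` to the NUL-terminated string in `out`, which
   has total capacity `cap`. Operands are generated first (post-order),
   then the instruction that combines them. */
void gen_expr(const struct expr *e, char *out, size_t cap) {
  size_t used = strlen(out);
  if (e->op == 'n') {
    /* A leaf just pushes its constant onto the machine stack. */
    snprintf(out + used, cap - used, "push $%d\n", e->value);
    return;
  }
  gen_expr(e->left, out, cap);
  gen_expr(e->right, out, cap);
  used = strlen(out);
  /* Pop both operands, combine them, and push the result. */
  snprintf(out + used, cap - used,
           "pop %%rbx\npop %%rax\n%s %%rbx, %%rax\npush %%rax\n",
           e->op == '+' ? "add" : "imul");
}
```

Because the generator only depends on the tree, it does not matter whether the tree came from a hand-written parser or a generated one.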

(Break)

SimpleC

  • Example program
  • Informal description
  • Formalizing the description

Example

int x;

f(a, b) : function(pointer<int>, char) -> char {
  return *a + (int) b;
}

main {
  int x;
  pointer<int> y;
  char c;
  y = &x;
  c = 't';
  return 'c';
}

Language constructs

  • Expressions
  • Statements
  • Declarations
  • Function definitions

We are going to assume you know C for this. The project will help you learn more about these constructs when we implement them.

Expressions

  • A subset of C
  • Arithmetic operators: + - * /
  • Pointer reference and dereference: & * []
    • Recall that array indexing in C is pointer arithmetic: a[i] means *(a + i)
  • Relational operators: < > <= >= !=
  • Boolean operators: && ||
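
The relationship between [] and * in C can be checked with a small example. The two functions below are illustrative names, not part of SimpleC or the project; they compute the same sum, once with explicit pointer arithmetic and once with index syntax.

```c
#include <stddef.h>

/* Sum `count` ints using explicit pointer arithmetic and dereference. */
int sum_with_pointers(const int *base, size_t count) {
  int total = 0;
  for (size_t i = 0; i < count; i++)
    total += *(base + i);
  return total;
}

/* Same computation with index syntax; base[i] means *(base + i). */
int sum_with_index(const int *base, size_t count) {
  int total = 0;
  for (size_t i = 0; i < count; i++)
    total += base[i];
  return total;
}
```

Passing an array to either function passes a pointer to its first element, which is why both accept the same argument.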

Statements

  • A subset of C
  • if
  • while
  • assignment

Assignment is actually an expression in C, e.g., a = b = c;
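
A short C illustration of that point: assignment yields the assigned value and associates right to left, so it can be chained or used inside a condition. The function names are made up for this example, and whether SimpleC allows assignment in expression position is not stated here; the list above treats it as a statement.

```c
/* a = b = c parses as a = (b = c): b receives c, then a receives
   the value of that assignment. */
int chain_assign(int c) {
  int a, b;
  a = b = c;
  return a + b;
}

/* Because assignment is an expression, its value can be tested
   directly in a loop condition. */
int count_down(int start) {
  int n, steps = 0;
  while ((n = start - steps) > 0)
    steps++;
  return steps;
}
```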

Declarations

  • Subset of C's types
  • Different syntax for types
  • Two kinds of types
    • Primitive types: int, char
    • Compound types: pointer, array, function

IMHO SimpleC makes type declarations more explicit, less esoteric

Declaration examples

int x;
pointer<int> y;
array<5, int> z;
pointer<pointer<char>> pointer_to_string;
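
One way a compiler might represent these types internally is a small recursive data structure, where each compound type points at the type it is built from. The sketch below is an assumption, not the project's actual representation: type_kind, type, and format_type are hypothetical names, and function types are left out to keep it short. format_type prints a type back in SimpleC's syntax.

```c
#include <stdio.h>
#include <string.h>

enum type_kind { TY_INT, TY_CHAR, TY_POINTER, TY_ARRAY };

struct type {
  enum type_kind kind;
  const struct type *base; /* pointee or element type; NULL for primitives */
  int size;                /* array length; used only by TY_ARRAY */
};

/* Append the SimpleC spelling of `t` to the NUL-terminated string in
   `out`, which has total capacity `cap`. */
void format_type(const struct type *t, char *out, size_t cap) {
  size_t used = strlen(out);
  switch (t->kind) {
  case TY_INT:
    snprintf(out + used, cap - used, "int");
    break;
  case TY_CHAR:
    snprintf(out + used, cap - used, "char");
    break;
  case TY_POINTER:
    snprintf(out + used, cap - used, "pointer<");
    format_type(t->base, out, cap);   /* recurse into the pointee type */
    used = strlen(out);
    snprintf(out + used, cap - used, ">");
    break;
  case TY_ARRAY:
    snprintf(out + used, cap - used, "array<%d, ", t->size);
    format_type(t->base, out, cap);   /* recurse into the element type */
    used = strlen(out);
    snprintf(out + used, cap - used, ">");
    break;
  }
}
```

The nesting in pointer<pointer<char>> maps directly onto two TY_POINTER nodes wrapping a TY_CHAR node, which is part of what makes this syntax easy to read and to parse.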

Function definitions

  • New syntax
  • Makes function type a little more explicit
  • Function types

    function(param_type1, param_type2) -> return_type
    
  • Must have return statement at the end
    • No void functions

Function definition examples

square(a) : function(int) -> int {
  return a * a;
}

multiply(a, b) : function(int, int) -> int {
  return a * b;
}

increment_array(array, size, amount) : function(pointer<int>, int, int) -> pointer<int> {
  int i;
  i = 0;
  while (i < size) {
    array[i] = array[i] + amount;
    i = i + 1;
  }
  return array;
}
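
For comparison, here is a plain C rendering of the same function, assuming the loop counter counts upward and that SimpleC's pointer<int> corresponds to C's int *. In C the parameter types sit next to the parameter names and the return type comes first, whereas SimpleC gathers the whole function type after the colon.

```c
/* C equivalent of the SimpleC increment_array example: add `amount`
   to each of the first `size` elements and return the same pointer. */
int *increment_array(int *array, int size, int amount) {
  int i;
  i = 0;
  while (i < size) {
    array[i] = array[i] + amount;
    i = i + 1;
  }
  return array;
}
```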

main function

  • SimpleC has a mandatory main function
    • Differs from C, no separate compilation/linking
  • Must be at end of source file
  • No type or parameter specification
    • These will be built-in

In C, main is called by the C runtime library. Syntactically, main is just another function definition; the runtime ensures that it is the first thing called after initialization. See lecture 02 on how source becomes an executable.

Example revisited

int x;

f(a, b) : function(pointer<int>, char) -> char {
  return *a + (int) b;
}

main {
  int x;
  pointer<int> y;
  char c;
  y = &x;
  c = 't';
  c = f(y, c);
  return 'c';
}

Nice talk on language design

Author: Paul Gazzillo

Created: 2024-02-02 Fri 15:51