Type checking
Lecture 10
Table of Contents
- Informal type specification
- Compile-time vs. run-time behavior
- Choose your own adventure
- Typing of expressions
- Formalizing the type specification
- Type checker preparation
- Implementing type collection
- Implementing type checking
- Putting it all together
- Type-checking is a special kind of evaluator
- Type-checking is a proof of type correctness
- (Ungraded homework) Prepare for next class
Informal type specification
Declarations
- Restrict symbols to a specific type for whole program
- Static typing
- No change to type at run-time
- Can determine at compile-time
f() { }
main() {
  int y;
  y = 2; // okay, because y and 2 are both int
  y = f; // not okay, because y is an int and f is a function
}
What is a type?
How can we remember the types of symbols?
Scoping
- Symbols valid within a user-defined range
- Corresponds to functions (or compound statements) in our grammar
- Called lexical or static scoping
- No change to scoping at run-time
- Can determine scopes at compile-time
f() {
  bool y;
  y = true; // okay, because y is a bool in f's scope
}
main() {
  int y;
  y = true; // not okay, because this is a different "y"
}
What other scoping choices could we make?
Can we have the same symbol with different types somewhere in the program?
What happens with nested scopes?
How can we remember what symbol belongs to what scope at compile-time?
The symbol table
- Maintains mapping from symbols to types
- Maintains scoping information
- Namespacing
- Can we have functions, variables, structs, etc. of same name?
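One way to picture a symbol table that tracks both types and scopes is a stack of maps, one map per open scope. The sketch below is only an illustration: the class and method names (`SymbolTable`, `declare`, `lookup`) are hypothetical and not taken from the course skeleton, and types are plain strings to keep it short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class SymbolTableDemo {
    public static void main(String[] args) {
        SymbolTable table = new SymbolTable();
        table.declare("f", "function");    // f() { ... } in the global scope
        table.declare("main", "function"); // main() { ... } in the global scope

        table.enterScope();                // body of f
        table.declare("y", "bool");
        System.out.println("y in f: " + table.lookup("y"));       // y in f: bool
        table.exitScope();

        table.enterScope();                // body of main
        table.declare("y", "int");
        System.out.println("y in main: " + table.lookup("y"));    // y in main: int
        System.out.println("f in main: " + table.lookup("f"));    // f in main: function
        table.exitScope();
    }
}

// Hypothetical symbol table: a stack of scopes, each mapping symbol names to types.
class SymbolTable {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    SymbolTable() {
        enterScope(); // the global scope is always present
    }

    void enterScope() {
        scopes.push(new HashMap<>());
    }

    void exitScope() {
        scopes.pop();
    }

    // Declares a symbol in the innermost scope; returns false on redeclaration.
    boolean declare(String name, String type) {
        return scopes.peek().putIfAbsent(name, type) == null;
    }

    // Searches from the innermost scope outward; returns null if undeclared.
    String lookup(String name) {
        for (Map<String, String> scope : scopes) { // iterates from the top of the stack
            String type = scope.get(name);
            if (type != null) {
                return type;
            }
        }
        return null;
    }
}
```

This version keeps functions and variables in one namespace, which is a design choice; separate maps per kind of symbol would allow a function and a variable to share a name.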
Example
- (In-class diagram)
Look at ASTs for the above examples
Maintain symbol table and scope stack
Cases
- One declaration and one usage (also show type error)
- One function in global scope and use in nested scope (also show type error)
- Variable shadowing in nested scopes (not used in SimpleC) (even if same type, shadowed variable retains value after nested scope exits)
Compile-time vs. run-time behavior
- Notice that these design choices are convenient for checking at compile-time
- Symbol never changes type
- Scope is based on grammatical structure (not control-flow)
- Can we have run-time-style typing in a compiled language?
- No type declarations?
- Change type in different executions of program?
- Access static scope after leaving function?
No type declarations: type inference
Change type in different program executions: polymorphism is one way, e.g., subtyping (subclasses) or generics
Accessing static scopes after control-flow exits scope: closures (functions bundled with static scope's run-time variable bindings)
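Java, a compiled and statically typed language, offers limited forms of two of these features, which makes for a small illustration. In the sketch below, `var` asks the compiler to infer a local variable's type, and the lambda returned by the hypothetical `makeCounter` keeps using a binding from `makeCounter`'s scope after that method has returned. Java lambdas may only capture effectively final variables, so the mutable count lives in a one-element array.

```java
import java.util.function.IntSupplier;

class ClosureDemo {
    // Returns a function bundled with a binding from this method's scope.
    static IntSupplier makeCounter() {
        int[] count = {0};       // local to makeCounter; the array reference is effectively final
        return () -> ++count[0]; // the lambda captures count and can still update it later
    }

    public static void main(String[] args) {
        var counter = makeCounter(); // type inferred at compile time as IntSupplier
        counter.getAsInt();          // makeCounter has already returned, yet count is still alive
        System.out.println(counter.getAsInt()); // prints 2
    }
}
```

Both features are still checked at compile time: `counter` has exactly one type, and the captured variable is found through the lexical structure of the code.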
Choose your own adventure
- Start implementing type-checking for declarations?
- Move on to the typing of expressions?
Typing of expressions
What is a type?
Set of values and operations on those values
Type rules for operations
- Only allow operations on symbols of the same type
- Why do this?
What about operations between types?
- Adding float and int?
10 + 3.5
- Why does this work in C, Python, etc.?
- Adding int and function?
- Is this ever useful?
- What about booleans?
- What are booleans at CPU level?
x86, for instance, has different opcodes for integer and float operations:
https://www.felixcloutier.com/x86/add
https://www.felixcloutier.com/x86/addps
With C on x86, for instance, short-circuiting boolean operations (&&, ||) are often compiled to conditional jumps (rather than bitwise arithmetic):
https://www.felixcloutier.com/x86/jcc
https://stackoverflow.com/questions/24542/using-bitwise-operators-for-booleans-in-c
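One possible answer to the mixed-type questions above is an implicit widening rule: an expression such as 10 + 3.5 is accepted because the int operand is promoted to float before the operation, while int + function is rejected outright. The sketch below encodes that rule under stated assumptions: the `Type` enum and `arithmeticResult` are hypothetical names, and SimpleC itself may not have a float type at all.

```java
class PromotionDemo {
    // Hypothetical set of types; SimpleC itself may not include FLOAT.
    enum Type { INT, FLOAT, BOOL, FUNCTION }

    // Returns the result type of an arithmetic operator, or null for a type error.
    static Type arithmeticResult(Type left, Type right) {
        boolean leftNumeric = left == Type.INT || left == Type.FLOAT;
        boolean rightNumeric = right == Type.INT || right == Type.FLOAT;
        if (!leftNumeric || !rightNumeric) {
            return null; // e.g. int + function is rejected
        }
        // Implicit widening: if either operand is a float, the result is a float.
        return (left == Type.FLOAT || right == Type.FLOAT) ? Type.FLOAT : Type.INT;
    }

    public static void main(String[] args) {
        System.out.println(arithmeticResult(Type.INT, Type.FLOAT));    // FLOAT, as in 10 + 3.5
        System.out.println(arithmeticResult(Type.INT, Type.INT));      // INT
        System.out.println(arithmeticResult(Type.INT, Type.FUNCTION)); // null
    }
}
```

A stricter language could drop the widening line and require both operands to have the same type, which is the simpler rule listed under "Type rules for operations".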
Formalizing the type specification
Type checker preparation
- Setting up our visitors
- What are the return types and why?
Our design uses separate statement and expression visitors, reflecting the different types of semantic rules in our language.
Implementing type collection
Symbol table
Scoping
Collecting declarations
Getting types of variable usages
Implementing type checking
Leaf nodes
- Which ones are they?
- What are their types?
Inner nodes
- What are they?
- What are their types?
Putting it all together
Example
f() { return 10; }
main() {
  int x;
  input x;
  return x * 2 + f();
}
Type-checking is a special kind of evaluator
- Just like the Calc example
- Tree traversal (with visitor in our case)
- Starts from leaf nodes' values
- Recursively accumulates values
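To make the evaluator analogy concrete, here is a minimal sketch of a visitor that walks an expression tree and returns a type at each node instead of a value. Everything in it is an assumption made for illustration: the AST classes (`NumberLit`, `Var`, `Call`, `Binary`) are hypothetical stand-ins for the course's parse tree, functions are assumed to return int, and arithmetic is assumed to require int operands.

```java
import java.util.Map;

class TypeEvalDemo {
    enum Type { INT, BOOL }

    // Minimal expression AST; hypothetical, not the course's parse-tree classes.
    interface Expr { <T> T accept(Visitor<T> v); }

    interface Visitor<T> {
        T visitNumber(NumberLit n);
        T visitVar(Var v);
        T visitCall(Call c);
        T visitBinary(Binary b);
    }

    static class NumberLit implements Expr {
        final int value;
        NumberLit(int value) { this.value = value; }
        public <T> T accept(Visitor<T> v) { return v.visitNumber(this); }
    }

    static class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
        public <T> T accept(Visitor<T> v) { return v.visitVar(this); }
    }

    static class Call implements Expr {
        final String callee;
        Call(String callee) { this.callee = callee; }
        public <T> T accept(Visitor<T> v) { return v.visitCall(this); }
    }

    static class Binary implements Expr {
        final String op;
        final Expr left, right;
        Binary(String op, Expr left, Expr right) { this.op = op; this.left = left; this.right = right; }
        public <T> T accept(Visitor<T> v) { return v.visitBinary(this); }
    }

    // "Evaluates" an expression to its type rather than to its value.
    static class TypeOf implements Visitor<Type> {
        final Map<String, Type> variables;   // types of declared variables
        final Map<String, Type> returnTypes; // assumed return type of each function

        TypeOf(Map<String, Type> variables, Map<String, Type> returnTypes) {
            this.variables = variables;
            this.returnTypes = returnTypes;
        }

        public Type visitNumber(NumberLit n) { return Type.INT; } // leaf: type known up front

        public Type visitVar(Var v) { return require(variables.get(v.name), "undeclared variable " + v.name); }

        public Type visitCall(Call c) { return require(returnTypes.get(c.callee), "undeclared function " + c.callee); }

        public Type visitBinary(Binary b) {
            Type left = b.left.accept(this);   // recurse into the children first
            Type right = b.right.accept(this);
            if (left != Type.INT || right != Type.INT) {
                throw new IllegalStateException("operands of " + b.op + " must be int");
            }
            return Type.INT;                   // combine the children's results
        }

        private static Type require(Type t, String message) {
            if (t == null) throw new IllegalStateException(message);
            return t;
        }
    }

    static Type typeOf(Expr e, Map<String, Type> variables, Map<String, Type> returnTypes) {
        return e.accept(new TypeOf(variables, returnTypes));
    }

    public static void main(String[] args) {
        // x * 2 + f(), with x declared as int and f assumed to return int
        Expr expr = new Binary("+", new Binary("*", new Var("x"), new NumberLit(2)), new Call("f"));
        System.out.println(typeOf(expr, Map.of("x", Type.INT), Map.of("f", Type.INT))); // INT
    }
}
```

Swapping `TypeOf` for a visitor that returns integers would turn the same traversal into a Calc-style evaluator, which is the point of the comparison.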
Type-checking is a proof of type correctness
- Leaves are axioms
- Defined ahead of time, not concluded
- Inner nodes (functions or operations) are implications
- Given the child nodes' truth, we can conclude the current node's truth
Functions act like implication
- Example: integer multiplication, (int, int) -> int
- If operands are int, we can conclude the result is an int
- Otherwise, we can't conclude result is int (and there could be a run-time issue)
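The implication reading can be written down directly as data: a signature holds the premises (parameter types) and the conclusion (result type). The sketch below is a hypothetical illustration, with types as plain strings and a made-up `Signature` class; `apply` yields the conclusion only when every premise is met and otherwise concludes nothing.

```java
import java.util.List;
import java.util.Optional;

class SignatureDemo {
    // A function type such as (int, int) -> int: premises plus a conclusion.
    static class Signature {
        final List<String> params;
        final String result;

        Signature(List<String> params, String result) {
            this.params = params;
            this.result = result;
        }

        // If the argument types match the premises, the conclusion holds;
        // otherwise nothing can be concluded about the result type.
        Optional<String> apply(List<String> argTypes) {
            return params.equals(argTypes) ? Optional.of(result) : Optional.empty();
        }
    }

    public static void main(String[] args) {
        Signature multiply = new Signature(List.of("int", "int"), "int");
        System.out.println(multiply.apply(List.of("int", "int")));      // Optional[int]
        System.out.println(multiply.apply(List.of("int", "function"))); // Optional.empty
    }
}
```

Returning an empty result rather than a wrong type mirrors the logic: a failed premise does not prove the result is not an int, it only means the rule cannot be used.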
Does type correctness always imply a correct program?
Does a lack of type correctness always imply a bug or an incorrect program?
(Ungraded homework) Prepare for next class
You may use the skeleton visitor code as a starting point for your type checker.
To start working on the type checker, be sure to uncomment the type-checking phase from Compiler.java
// Phase 2: Type checking.
TypeChecker typechecker = new TypeChecker();
typechecker.visit(tree);
and from the Makefile
TypeChecker.java \
(The trailing backslash is a line continuation; it is meant to be there because the file list is part of a multi-line variable assignment.)