Type-Checking
Lecture 8

Overview
Why use types?
Typed vs. untyped languages
Type-safe vs. unsafe
When do we check types?
Execution errors and well-behaved programs
Demo: C vs. Python
Static type checking
Function types
Safety guarantees
Proving type soundness
Implementation
Symbol table
Demo: statically checking a tree
Project

Overview

What are types?
Why have them?
How to implement them?

Why use types?

One use: to prevent errors during runtime

How do you feel about types? Do you like having the protection? Do compiler errors bother you?

Typed vs. untyped languages

A type is
- a set of values
- and operations on those values

Examples

int
- the set of integers and the arithmetic operations
bool
- { true, false } and the logic operators (and, or, not)

Typed languages

Restrict a variable's range of possible values (Python, C, Java, etc.)

Untyped languages

Do not restrict variable values (Lisp, assembly)

Type-safe vs. unsafe

Runtime errors are either trapped or untrapped
Trapped
- Machine catches and terminates program, e.g., NULL-pointer error, divide-by-zero
Untrapped
- Program continues, e.g., writing past array bounds, integer arithmetic on floating point number

Untrapped errors are nefarious because you may not notice them until the program shows wrong behavior on some input.

A safe language

Prevents untrapped errors (and some trapped errors).
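
A small sketch of what "trapped" looks like in a safe language, using Python. The helper names `safe_read` and `run` are made up for this illustration. Both the out-of-bounds read and the division by zero are caught by the runtime and reported as exceptions rather than silently continuing.

```python
def safe_read(values, index):
    """Return values[index]; Python checks the bounds on every access."""
    return values[index]


def run(action):
    """Run a zero-argument callable and name the trapped error, if any."""
    try:
        action()
    except Exception as err:
        return type(err).__name__
    return "ok"


data = [10, 20, 30]
print(run(lambda: safe_read(data, 1)))  # ok
print(run(lambda: safe_read(data, 7)))  # IndexError
print(run(lambda: 1 // 0))              # ZeroDivisionError
```

In C, the equivalent out-of-bounds read is not guaranteed to be trapped, which is what makes it an untrapped error.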

When do we check types?

Compile-time (static): C, Java
Run-time (dynamic): Python, Java(?)

Statically-checked vs dynamically checked?

Java employs static and dynamic checks, e.g.,

checking whether a symbol is an array: static check
checking whether an array access is out-of-bounds: runtime check

Java reflection: check type at run-time (instanceof)
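
To make the static/dynamic distinction concrete, here is a minimal Python sketch (the function `describe` is hypothetical). A dynamic check only fires on the path that actually executes, so the type error stays hidden until the bad branch runs. A static checker would flag the same mistake before the program runs at all.

```python
def describe(value, verbose):
    """Concatenate a label with value; only valid when value is a str."""
    if verbose:
        # The type check happens here, at run time, and only if we get here.
        return "value: " + value
    return "quiet"


print(describe(42, False))  # quiet (the ill-typed branch never runs)

try:
    describe(42, True)
except TypeError as err:
    print("trapped at run time:", type(err).__name__)
```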

Why not always do static checks?

What else can we define in the type system?

Memory-safety: pointer is never dereferenced when NULL
Information flow security: does a secret value ever get printed out?

Execution errors and well-behaved programs

Forbidden errors: all untrapped errors (and some trapped)
Good behavior: a program has no forbidden behaviors

Weak vs. strong checking

Strongly-checked: all legal programs have good behavior
Weakly-checked: some programs violate safety

Why would we want weak typing?

What is the trade-off between strong/weak typing and decisions about what to check?

Source: http://lucacardelli.name/Papers/TypeSystems.pdf

Cardelli considers Lisp untyped, because it does not restrict variables to a range of values. Untyped languages "do not have types or, equivalently, have a single universal type that contains all values. In these languages, operations may be applied to inappropriate arguments: the result may be a fixed arbitrary value, a fault, an exception, or an unspecified effect."

Demo: C vs. Python

Static type checking

Record types of identifiers in symbol table
Post-order tree traversal
Check identifiers used in
- Arithmetic operators, function calls, assignments
Look up type in symbol table
Constants have a fixed type
- 3 is an int
- 5.2 is a float
- True is bool (though C had no built-in bool before C99's _Bool)

Function types

Scalar values have primitive type
- int, char, long, etc.
If symbol "x" has type "int", we can write
```
x : int
```
Function types describe parameters, return values
- E.g., f takes two integers and returns a bool
```
f : (int, int) -> bool
```
What is the type of an arithmetic multiplication (*)?
```
* : (int, int) -> int
```
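
One possible way to represent these function types inside a checker, sketched in Python. The class `FuncType` and the helper `check_call` are assumptions made for illustration, not part of any particular compiler. A call is well-typed when the argument types match the parameter types, and the type of the call is then the result type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuncType:
    params: tuple  # parameter type names, e.g. ("int", "int")
    result: str    # result type name, e.g. "bool"

    def __str__(self):
        return "(" + ", ".join(self.params) + ") -> " + self.result


def check_call(func_type, arg_types):
    """Return the result type if the arguments match, else raise TypeError."""
    if tuple(arg_types) != func_type.params:
        raise TypeError(f"expected {func_type.params}, got {tuple(arg_types)}")
    return func_type.result


f = FuncType(("int", "int"), "bool")
times = FuncType(("int", "int"), "int")

print(f)                                  # (int, int) -> bool
print(check_call(times, ["int", "int"]))  # int
```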

Safety guarantees

If the type checker accepts a program, is it actually safe?
Type soundness: if the checker says a program is safe, then the program is safe
Example: array out of bounds access
- unsound: C type checker accepts the program
- sound: Java rejects the out-of-bounds access (with a runtime check)

Proving type soundness

Goal: well-typed programs are safe programs
Need to define semantics first
Define type rules that "run" over the semantics

Formal soundness: each provable sentence (well-typed program) is valid with respect to semantics (safe program)

Implementation

Symbol table

Mapping variables to types and memory locations

Example symbol table

int main(void) {
  char x;
  int res;
  x = 'a';
  res = 5;
  return 0;
}

symbol	type
x	char
res	int
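
A minimal sketch of building such a table, in Python. It assumes the declarations have already been parsed into (type, name) pairs. The function name `build_symbol_table` and the `slot` field (a simple stand-in for a memory location) are hypothetical.

```python
def build_symbol_table(declarations):
    """Map each declared name to its type and a slot number.

    declarations: iterable of (type_name, symbol) pairs in source order.
    """
    table = {}
    for type_name, symbol in declarations:
        if symbol in table:
            raise NameError(f"redeclaration of {symbol}")
        # The slot is an illustrative placeholder for a memory location.
        table[symbol] = {"type": type_name, "slot": len(table)}
    return table


table = build_symbol_table([("char", "x"), ("int", "res")])
print(table["x"]["type"])    # char
print(table["res"]["type"])  # int
```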

Demo: statically checking a tree

int x;
int y;
return x+y;

int x;
char c;
return x + c;

int x;
int y;
int z;
return x + y * z;

Traverse the tree
Apply different rule for each type of node
Declaration nodes: add to symbol table
- Developer annotations prime the compiler with the type operations to expect
Expression and assignment nodes: check against types in the symbol table
Constants have predefined types
Operators have predefined (function) types
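
The traversal can be sketched in Python as follows. This is an illustration under stated assumptions, not the course's reference implementation: expression trees are encoded as nested tuples, the symbol table is a plain dict from name to type, and the operator rules are deliberately strict, with no implicit conversions. Real C would promote `char` to `int` and accept `x + c`. This sketch rejects it to show a failing check.

```python
# Operators have predefined function types: (parameter types, result type).
OPERATOR_TYPES = {
    "+": (("int", "int"), "int"),
    "*": (("int", "int"), "int"),
}


def type_of(node, symbols):
    """Post-order check: type the children first, then the node itself."""
    kind = node[0]
    if kind == "const":
        value = node[1]
        if isinstance(value, bool):  # test bool first: bool is a subclass of int
            return "bool"
        if isinstance(value, int):
            return "int"
        if isinstance(value, float):
            return "float"
        raise TypeError(f"unsupported constant {value!r}")
    if kind == "var":
        name = node[1]
        if name not in symbols:
            raise NameError(f"undeclared identifier {name}")
        return symbols[name]
    if kind not in OPERATOR_TYPES:
        raise TypeError(f"unknown node kind {kind!r}")
    operand_types = tuple(type_of(child, symbols) for child in node[1:])
    params, result = OPERATOR_TYPES[kind]
    if operand_types != params:
        raise TypeError(f"{kind} expects {params}, got {operand_types}")
    return result


# return x + y * z;  with x, y, z declared int
ints = {"x": "int", "y": "int", "z": "int"}
tree = ("+", ("var", "x"), ("*", ("var", "y"), ("var", "z")))
print(type_of(tree, ints))  # int

# return x + c;  with c declared char is rejected under these strict rules
try:
    type_of(("+", ("var", "x"), ("var", "c")), {"x": "int", "c": "char"})
except TypeError as err:
    print("rejected:", err)
```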
