The type checker for Scheme is invoked by loading the file
`type-check-and-eval.scm' into Scheme from a library.
This is usually done automatically by using a special command such as
scm342typed
or scm54typed
, which automatically loads the
appropriate file.
When you run the type checker using a command like
scm342typed
or scm54typed
, the automatically-themloaded
file `type-check-and-eval.scm' starts a new read-eval-print
loop. This loop, reads a Scheme expression from the user's input, type
checks it, and if it does type check, proceeds to evaluate it using the
standard scheme evaluator. If the expression seems to contain a type
error, the user is prompted as to whether he or she wants to proceed
with execution.
Scheme files loaded from the read-eval-print loop, and from other files that are being loaded, are type checked as well. Scheme files may themselves contain type declarations, but such declarations may also be present in a separate file. Of this file is associated with the Scheme file by the use of a different suffix, `.def'. For example, if the Scheme file is named `displayln.scm', the associated `.def' file is named `displayln.def'. This `.def' file is read to supplement the type information in the Scheme file. This system permits the Scheme file to have no type annotations, if desired, which makes the Scheme file more easily portable to another systems that are not using our type checker.
A Scheme file with our type annotations can, however, be used in any
Scheme system. To do this, one must first load the file
`type-check-ignore-types-at-runtime.scm'. This file contains macros
that ignore our type annotation syntax at run-time.
This file is also loaded automatically by
commands such as scm342untyped
or scm54untyped
.
By loading this file, these commands tell Scheme enough about
the type notation to ignore it. Otherwise they act just like Scheme.
The following sections describe the notation for type expressions used in the checker, our additions to Scheme for declaring various kinds of type information, some hints on what type errors may mean, and various configuration parameters and commands recognized by the type checker.
In this section we discuss the notations used in type expressions and their meaning.
A type can be thought of as a set of values with certain associated
procedures. For example we use number
as the name of the type of
all numbers (both exact and inexact, real and integer) in Scheme.
The procedures that work on numbers include + and *.
The type checker uses the following grammar for writing expressions that denote various types.
<type-exp> ::= <basic-type> | <applied-type> | <procedure-type> | <type-variable> | <and-type>
The rest of this section explains each of these pieces of the syntax.
The syntax of a <basic-type> is as follows. See section 2.4 Type Variables for the syntax of <type-variable>. The nonterminal, <symbol>, denotes a symbol as in Scheme [R5RS].
<basic-type> ::= any <symbol> that is not a <type-variable>
A basic type is either a built-in type, such as number
,
or a user-defined type.
Some of the built-in types are the types used for atomic data items in Scheme. These are displayed in the following table.
Built-in Basic Types For Atomic Scheme Data Name Example expressions that have that type ---------------------------------------------------- boolean #t, #f char #\a, #\Z number 0, 342.54, -3 string "a string", "" symbol 'a-symbol, (quote this) port current-input-port
Symbols that are not built-in to the type checker
and that are not a type variables (see section 2.4 Type Variables)
can stand for user-defined types. For example, person
could be
a user-defined type.
The type checker also knows about several other special basic types:
datum
, poof
, and void
.
These are described below.
The built-in basic type datum
is the type of all the values in Scheme.
Every Scheme object has this type; that is, datum
is the supertype
of all types.
One use of datum
is in type checking heterogeneous lists.
In Scheme the elements of a list do not all have to be of the same type;
such a list is heterogeneous.
A list whose elements are all of the same type is homogeneous.
The type checker uses types of the form (list T)
for
lists whose elements all have some type T
(see section 2.4 Type Variables).
For example, a list of booleans such as (#t #f)
has the type (list boolean)
,
and a list of numbers such as (3 4 2)
has the type (list number)
.
For a list such as (#t 3 #f)
,
the type checker uses the type (list datum)
.
Although datum
is the supertype of all types,
the type checker does not infer that an expression has type datum
unless forced to do so [Jenkins-Leavens96].
The reason for this is to give better error messages.
The built-in basic type void
, as in C, C++, or Java,
is used to represent the return type of a procedure
that returns but doesn't return a useful result.
For example, Scheme's procedure display
has the following type.
(-> (datum) void)
The built-in basic type poof
is used for the return types of
procedures that
do not return to their caller. The standard example is the
Scheme procedure error
, whose type is the following.
(-> (datum ...) poof)
The type checker considers poof
to be a subtype of all types
[Jenkins-Leavens96].
The reason for this is to allow code such as the following to type
check.
(if (not (zero? denom)) (/ num denom) (error "zero denominator!"))
The syntax of an <applied-type> is as follows. The notation {<type-exp>}+ means one or more repetitions of the nonterminal <type-exp>.
<applied-type> ::=(
<symbol> {<type-exp>}+)
Applied types denote the result of applying some type constructor,
such as list
or vector
to one or more type
parameters.
For example, we write (list number)
for the type of lists of
numbers. The lists (1 2 3)
and (3.14 2.73)
both have this type.
In the type (list number)
, list
is the type constructor,
and number
is the only type parameter.
The application of a type constructor to one or more type parameters
produces an instance of that type constructor.
Some of the built-in types are the types used for atomic data items in Scheme. These are displayed in the following table.
Example Instances of Built-in Type Constructors Instance Example expressions that have that type ----------------------------------------------------------- (list number) (list 3 4 2) (list boolean) (cons #t (cons #f '())) (vector number) (vector 5 4) (pair number number) (cons 3 4)
Type constructors are symbols, which are never considered to be type variables.
However, they can be user-defined. For example, the symbol stack
could be a user-defined type constructor, and applied types that use
it, such as (stack number)
would be instances.
The syntax of procedure types is given below.
The notation
{<type-exp>}*
means zero or more repetitions of the nonterminal <type-exp>.
The symbol ...
used in the second alternative is a terminal symbol in
the grammar;
it is used in the types of variable argument procedures.
(See section 2.3.1 Variable Argument Procedure Types below for more on these types.
<procedure-type> ::=(
->
(
{<type-exp>}*)
<type-exp>)
|(
->
(
{<type-exp>}* <type-exp>...
)
<type-exp>)
A procedure type denotes the type of a Scheme procedure.
In the notation, the types of the arguments are enclosed in
parentheses,
and these are followed by type that the procedure returns.
For example, the Scheme procedure substring
has the following
type.
(-> (string number number) string)
This means that substring
is a procedure with 3 arguments,
the first of type string
, the second of type number
,
and the third of type number
, which returns a string
.
You can derive such a type from a lambda expression systematically.
Take the lambda expression, replace lambda
by ->
,
replace each formal parameter name by its type,
and replace the body by its type. For example, the type of
the lambda expression
(lambda (n m) (+ 2 (+ n m)))
is the following.
(-> (number number) number)
As another example, the type of sum
in the following definition
(define sum (lambda (ls) (if (null? ls) 0 (+ (car ls) (sum (cdr ls))))))
is the following.
(-> ((list number)) number)
The type checker also understands the types of Scheme's "unrestricted lambda" which can be used to make procedures with a variable number of arguments.
In a procedure type, the last thing in the list of arguments may
be a type expression followed by ...
;
this means that the procedure may take 0 or more actual arguments
of the type immediately preceding ...
.
For example, the type of writeln
the following.
(-> (datum ...) void)
This means that writeln
can take 0 or more arguments (of type datum).
Syntactically, a <type-variable> is either a single-letter symbol,
such as, T
, S
, U
,
V
, and W
, or a symbol that starts with a question mark,
such as ?Elem
.
<type-variable> ::= a <symbol> consisting of a single letter
| a <symbol> starting with ?
followed by Scheme id characters
Type variables are useful for describing the types of
higher-order functions. For such functions,
one often needs to describe procedure types
that are not as specific as the ones used in the previous subsection,
but which are also not so general as datum
.
For this it is convenient to use type variables.
Type variables are consistently replaced in a type when that type is used. For example, Scheme's built in procedure map has the following type.
(-> ((-> (S) T) (list S)) (list T))
Uses of such a type, with type variables consistently replaced by type
expressions, are called instances
of that type.
For example, map
's type above has the following instances.
(-> ((-> (number) boolean) (list number)) (list boolean)) (-> ((-> (number) number) (list number)) (list number))
The first of the above examples is what one would use when checking the types of the following application
(map positive? '(3 -2 1))
because the type of positive?
is (-> (number) boolean)
,
and the type of '(3 -2 1)
is (list number)
,
which makes the argument types match this instance of map
's type.
The second of the above examples would be used in checking the types of the following application.
(map (lambda (n) (+ n 1)) '(3 -2 1))
Can you see why?
Consistency in replacement of the type variables is a necessity. For example, the type
(-> ((-> (number) boolean) (list string)) (list boolean))
is not an instance of the type of map
,
because the type variables S
is not consistently replaced
by a single type expression.
A type containing type variables is said to be polymorphic, because it can generate many instances.
Since datum
is a type expression,
it can be used in an instance to replace type variables in a polymorphic
type expression.
For example, one type for Scheme's cons
procedure is as follows.
(-> (T (list T)) (list T))
This has the type (-> (datum (list datum)) (list datum))
as an
instance.
Besides procedure types, other kinds of type expressions
can also be polymorphic.
An example of a polymorphic applied type is (list T)
,
which is the type of the empty list.
The syntax of an <and-type> is as follows. The notation {<type-exp>}+ means one or more repetitions of the nonterminal <type-exp>.
<and-type> ::=(
and
{<type-exp>}+)
If one thinks of types as sets of values,
then an and type can be thought of as the intersection of several
such sets.
Indeed, "and types" are also called "intersection types" for this reason.
Simply put, if x has type (and S T)
,
then x has both type S
and type T
.
One use of such intersection types is for procedures that can be used
in several ways.
For example, Scheme's cons
procedure
can make homogeneous lists, heterogeneous lists,
and pairs. Thus a better type for cons
than the one given above
is the following.
(and (-> (T (list T)) (list T)) (-> (S (list U)) (list datum)) (-> (V W) (pair V W)))
The type checker considers the order of types listed in an
intersection type when doing type inference [Jenkins-Leavens96].
It will first try to use the first type listed, then the second, and so on.
For example, in the expression (cons 2 '(3))
,
the arguments to cons
are of type
number
and (list number)
,
so the type checker uses the first type in the above
intersection type for cons
, (-> (T (list T)) (list T))
,
and so the type of the expression is inferred to be (list number)
.
However, the type of the expression (cons 2 '(#t))
is inferred to be (list datum)
since there is no instance of the
first type in the intersection type
that matches the argument types.
(This is because T
cannot
be consistently replaced by both number
and boolean
.
See section 2.4 Type Variables for more about consistent replacement.).
Finally, the type of the expression (cons 2 3)
is inferred to be (pair number number)
, since the neither of the
first two types in the intersection type for cons
matches the
arguments.
The type checker never infers that an expression has an intersection
type unless it is forced to do so.
However, one can use deftype
(see section 3.2.1 Deftype) to declare such a
type.
The types of built-in procedures such as cons
make heavy use
of carefully constructed intersection types.
The type checker adds the following syntax to Scheme <expression>s. See the formal syntax in the Revised(5) Report on Scheme [R5RS] for the syntax of Scheme's standard expression, i.e., for other alternative productions for <expression>.
<expression> ::= <has-type-exp> | <has-type-trusted-expr>
The details of this syntax are explained in the following subsections.
The syntax of a <has-type-exp> is as follows.
<has-type-exp> ::=(
has-type
<type-exp> <expression>)
A <has-type-exp> is used to write a type into an expression.
The type checker checks that the expression in the body
of has-type
has the type given, and the run-time value of the
whole expression is the value of the body.
For example, (has-type number 3)
type checks because 3
has
type number
; the value of (has-type number 3)
is 3.
A <has-type-exp> is especially useful with Scheme's let
and letrec
expressions.
An example is the following.
(deftype fact (-> (number) number)) (define fact (lambda (n) (letrec ((fact-iter (has-type (-> (number number) number) (lambda (n acc) (if (zero? n) acc (fact-iter (- n 1) (* n acc))))))) (fact-iter n 1))))
The syntax of a <has-type-exp> is as follows.
<has-type-trusted-expr> ::=(
has-type-trusted
<type-exp> <expression>)
The meaning of a <has-type-exp> is just like a has-type
expression (see section 3.1.1 Has-Type Expressions), but it is type checked differently.
The type checker does not
perform type checking on the expression, but simply accepts the given
type. This allows one to suppress type problems with the type checker,
but should only be used by someone who knows what they are doing.
The type checker adds the following kinds of definitions to Scheme. These are explained in the subsections below. See the formal syntax in the Revised(5) Report on Scheme [R5RS] for the syntax of Scheme's standard kinds of <definition>.
<definition> ::= <deftype> | <define-record> | <define-record-type>
The syntax of a <deftype> is given below. See section 2. Notation for Type Expressions for the syntax of <type-exp>.
<deftype> ::=(
deftype
<name> <type-exp>)
<name> ::= <symbol>
A <deftype> is used to define the type of a name. It can be used anywhere a Scheme <definition> can be used. The type checker will issue an error message if the type inferred for the name does not match the type given.
For example,
one might write the following to define the type of the procedure
remove-first
.
(deftype remove-first (-> (T (list T)) (list T)))
One can also use a deftype to define the type of a variable that is not
a procedure.
For example, the following declares that e
has type
number
.
(deftype e number)
The syntax for a <define-record>, taken from [Friedman-Wand-Haynes92], is given below. The notation {<field-name>}* means zero or more repetitions of the nonterminal <field-name>.
<define-record> ::=(
define-record
<record-name>(
<field-name>}*)
)
<record-name> ::= <symbol> <field-name> ::= <symbol>
In the current version of the type checker, such definitions have no
effect on the type system. To declare the types of a record, and in
addition, the also produce the associated define-record
definitions, one should instead use a define-record-type
definition (see section 3.2.3 Define Record Type Definitions).
The syntax of a <define-record-type> is as follows. See section 2. Notation for Type Expressions for the syntax of <type-exp>.
<define-record-type> ::=(
define-record-type
<record-name> <record-type>(
<field-type-binding>}*)
)
<record-name> ::= <symbol> <record-type> ::= <type-exp> <field-type-binding> ::=(
<field-name> <field-type>)
<field-name> ::= <symbol> <field-type> ::= <type-exp>
A <define-record-type> allows one to declare the types of all the
procedures generated by a define-record
definition at once.
In addition, this definition also produces the associated
define-record
definition (see section 3.2.2 Define Record Definitions,
making it unnecessary to actually use
define-record
directly.
For example, the following definition
(define-record-type student student ((name string) (id number)))
is equivalent to the following set of deftype
and define-record
definitions.
(deftype student? (-> (datum) boolean)) (deftype make-student (-> (string number) student)) (deftype student->name (-> (student) string)) (deftype student->id (-> (student) number)) (define-record student (name id))
(Technically, the type checker puts a begin
around these
definitions, but the same effect is achieved.)
The <record-type> part of a define-record-type
definition
allows one to declare
the types for polymorphic records.
To do this, one uses a polymorphic applied type instead
of a basic type in the <record-type> part (see section 2.1 Basic Types and
see section 2.2 Applied Types).
For example, the following definition
(define-record-type product (product S T) ((left S) (right T)))
is equivalent to the following set of definitions.
(deftype product? (-> (datum) boolean)) (deftype make-product (-> (S T) (product S T))) (deftype product->left (-> ((product S T)) S)) (deftype product->right (-> ((product S T) T)) (define-record product (left right))
A type variable in the <record-type> can also be used as part of more a more complex type expression in a <field-type-binding>. For example, the following definition
(define-record-type converter (converter S T) ((in (-> (S) T)) (out (-> (T) S))))
is equivalent to the following set of definitions.
(deftype converter? (-> (datum) boolean)) (deftype make-converter (-> ((-> (S) T) (-> (T) S)) (converter S T))) (deftype converter->in (-> ((converter S T)) (-> (S) T))) (deftype converter->out (-> ((converter S T) (-> (T) S))) (define-record converter (in out))
Another use of define-record-type
s is in declaring the types for
variant records. A variant record is a union type that is a union of
record types. In this case the <record-type> for each record
in the variant record is set to the name of the union type.
For example, to declare the types for a record-based implementation of binary trees one could write:
(define-record-type interior tree ((key symbol) (left tree) (right tree))) (define-record-type leaf tree ((key symbol) (value number)))
The type checker adds the following top-level forms to Scheme. These are explained in the subsections below. See the formal syntax in the Revised(5) Report on Scheme [R5RS] for the syntax of Scheme's standard top-level forms, i.e., for other alternative productions for <command or definition>.
<command or definition> ::= <defrep> | <public>
The syntax of a <defrep> is as follows.
<defrep> ::=(
defrep
<abstract-type-exp> <representation-type-exp>)
<abstract-type-exp> ::= <type-exp> <representation-type-exp> ::= <type-exp>
A <defrep> form is used to declare the correspondence between an
abstract data type (ADT) and its concrete representation type
[Jenkins-Leavens96].
Because a <defrep> form is syntactically a <command or definition>,
i.e., a top-level form,
it can only appear in certain places inside a program.
These are the top-level of a Scheme program
and within a top-level begin
form
(see the formal syntax appendix of [R5RS]).
When using the <defrep> form, one must also provide
suitable deftype
declarations (see section 3.2.1 Deftype) for the
operations of the ADT must be provided, or the defrep
form will have no effect.
The type checker uses a defrep
form to
allow one to declare, using deftype
,
types for the client-visible operations of an ADT that use
an abstract type, and to check that the implementations of these
operations match, even though they are programmed using some concrete
representation type.
The abstract type in a defrep
form
should be a basic type or an applied type.
The name of the basic type or the type constructor will be
user-defined, not a type variable.
For example, the defrep
form in the following example,
which would be found in a file `ratl.scm',
says that the ADT ratl
is represented by
the type (list number)
.
The deftype
declarations present the client's view of the ADT's
operations.
;;; file ratl.scm (defrep ratl (list number)) (deftype make-ratl (-> (number number) ratl)) (deftype numr (-> (ratl) number)) (deftype denr (-> (ratl) number)) ;;; the following is program 3.21 from p. 91 of [Springer-Friedman89] (define make-ratl (lambda (int1 int2) (if (zero? int2) (error "make-ratl: The denominator cannot be zero.") (list int1 int2)))) (define numr (lambda (rtl) (car rtl))) (define denr (lambda (rtl) (cadr rtl)))
Without the defrep
form, the type checker would infer that
the type of the procedure make-ratl
is as follows,
(-> (number number) (list number))
which would not match the type declared with the deftype
in the above example.
(-> (number number) ratl)
However, with the defrep
form,
the type checker matches the inferred type of the procedure
against a version of the declared type of the procedure
with the representation type substituted for each occurrence of the
abstract type.
Since the only occurrence of the abstract type, ratl
,
in the type of make-ratl
is in the return type,
that is changed to (list number)
when checking against the type
inferred for the implementation.
<public> ::=(
public
{<symbol>}*)
A <public> form allows one to declare which names declared in a
file are visible (exported) to clients.
Because a <public> form is syntactically a <command or definition>,
i.e., a top-level form,
it can only at the top-level of a Scheme program
and within a top-level begin
form
(see the formal syntax appendix of [R5RS]).
If a Scheme file (and its associated `.def' file)
contains no public
forms,
then all names defined in that file are exported
to clients, and thus none are hidden.
For example, in the file `ratl.scm' above (see section 3.3.1 Defrep),
there is no public
form, so all the names defined there
are exported to clients.
For that example, it would be equivalent to write the following.
(public make-ratl numr denr)
If a Scheme file (and its associated `.def' file)
have some public
forms, then only the names listed
are exported to clients, and all other definitions in the file are
hidden.
This would be used, for example, to hide helping procedures that clients
should not need to know about.
A Scheme program file may optionally start with trustme!
.
(See the syntax appendix of the
Revised(5) Report on Scheme [R5RS] for the syntax of
<command or definition>.)
<program> ::= [ trustme!
] <command or definition>*
The trustme!
directive
can only be used at the top-level of a Scheme file
or in a `.def' file
(see section 1. Overview of the Type System).
That is, it cannot be typed directly into the interpreter.
Furthermore, if it appears at all in the file, it must be the first
thing in the file that is not a Scheme comment.
The appearance of trustme!
as the first thing in a file turns off
all type checking for that file. Any deftype
and grouped
deftype
declarations (see section 3.2 Definitions Added To Scheme)
are accepted without comparing them to the
types of their implementations.
This is used to get around limitations of the type checker,
and to make the type checker run faster.
It should not be used by normal users.
Unless the speed advantages of trustme!
are desired,
a file was only a few problems for type checking should instead
use the has-type-trusted
expression
described above (see section 3.1.2 Trusted Has-Type Expressions).
This has the advantage of retaining some type checking by limiting
the sections of the program where type checking is turned off.
(Not written yet.)
The following commands are helpful to regular users of the type checker.
command effect ------------------------------------------------------------------ (type-check-exit) leave the typed interpreter loop (type-check-and-eval-loop) starts (another) typed Scheme interpreter loop (type-check-reset-env!) forget all names defined so far (type-help) displays helpful information (like this)
If you have exited the type checker's read-eval-print loop, you can use
(type-check-and-eval-loop)
to restart it from Scheme.
The following commands are mainly useful for developing the type checker itself.
command effect ------------------------------------------------------------------ (type-tracing-set!) set tracing detail level to given value (type-tracing-off!) set tracing detail level to 0 (type-tracing-level) returns the numeric level of tracing detail
(Not written yet.)
Thanks to Steven Jenkins for work on an earlier version of the type checker and for discussions about these ideas. A more extensive (but dated) discussion of the type checker is found in [Jenkins-Leavens96].
The work of Leavens was supported in part by by the US NSF under grant CCR 9803843. Some of the work on this document was done while Leavens was a visiting professor at the University of Iowa. Thanks to the students at the University of Iowa in 22C:54, section 2, during Fall 2000. Thanks to the students at Iowa State University in Com S 342 in several semesters, most recently Fall 2000.
The following is a summary of the context-free grammar for the type checker. In the grammar, the notation {<type-exp>}* means zero or more repetitions of the nonterminal <type-exp>. The notation {<type-exp>}+ means one or more repetitions of the nonterminal <type-exp>.
<type-exp> ::= <basic-type> | <applied-type> | <procedure-type> | <type-variable> | <and-type> <basic-type> ::= any <symbol> that is not a <type-variable> <applied-type> ::=(
<symbol> {<type-exp>}+)
<procedure-type> ::=(
->
(
{<type-exp>}*)
<type-exp>)
|(
->
(
{<type-exp>}* <type-exp>...
)
<type-exp>)
<type-variable> ::= a <symbol> consisting of a single letter | a <symbol> starting with?
followed by Scheme id characters <and-type> ::=(
and
{<type-exp>}+)
<expression> ::= <has-type-exp> | <has-type-trusted-expr> <has-type-exp> ::=(
has-type
<type-exp> <expression>)
<has-type-trusted-expr> ::=(
has-type-trusted
<type-exp> <expression>)
<definition> ::= <deftype> | <define-record> | <define-record-type> <deftype> ::=(
deftype
<name> <type-exp>)
<name> ::= <symbol> <define-record> ::=(
define-record
<record-name>(
<field-name>}*)
)
<record-name> ::= <symbol> <field-name> ::= <symbol> <define-record-type> ::=(
define-record-type
<record-name> <record-type>(
<field-type-binding>}*)
)
<record-name> ::= <symbol> <record-type> ::= <type-exp> <field-type-binding> ::=(
<field-name> <field-type>)
<field-name> ::= <symbol> <field-type> ::= <type-exp> <command or definition> ::= <defrep> | <public> <defrep> ::=(
defrep
<abstract-type-exp> <representation-type-exp>)
<abstract-type-exp> ::= <type-exp> <representation-type-exp> ::= <type-exp> <public> ::=(
public
{<symbol>}*)
<program> ::= [trustme!
] <command or definition>*
This document was generated on 1 January 2001 using texi2html 1.56k.