A base table implementation using Scheme association lists is available
as the value of the identifier alist-table
after doing:
Association list base tables are suitable for small databases and support all Scheme types when temporary and readable/writeable Scheme types when saved. I hope support for other base table implementations will be added in the future.
This rest of this section documents the interface for a base table implementation from which the section Relational Database package constructs a Relational system. It will be of interest primarily to those wishing to port or write new base-table implementations.
All of these functions are accessed through a single procedure by
calling that procedure with the symbol name of the operation. A
procedure will be returned if that operation is supported and #f
otherwise. For example:
(require 'alist-table) (define open-base (alist-table 'make-base)) make-base => *a procedure* (define foo (alist-table 'foo)) foo => #f
#f
is returned.
Calling the close-base
method on this database and possibly other
operations will cause filename to be written to. If
filename is #f
a temporary, non-disk based database will be
created if such can be supported by the base table implelentation.
#t
, this database will have methods capable of
effecting change to the database. If mutable? is #f
, only
methods for inquiring the database will be available. If the database
cannot be opened as specified #f
is returned.
Calling the close-base
(and possibly other) method on a
mutable? database will cause filename to be written to.
close-database
(and possibly other) method on lldb may
cause filename to be written to. If filename is #f
this database will be changed to a temporary, non-disk based database if
such can be supported by the underlying base table implelentation. If
the operations completed successfully, #t
is returned.
Otherwise, #f
is returned.
#f
, no action is taken and #f
is returned. If this
operation completes successfully, #t
is returned. Otherwise,
#f
is returned.
#t
is returned. Otherwise, #f
is returned.
#f
. The base table can then be opened using (open-table
lldb base-id)
. The positive integer key-dimension is
the number of keys composed to make a primary-key for this table.
The list of symbols column-types describes the types of each
column.
open-table
. catalog-id will be used as the base table for
the system catalog.
#f
.
As with make-table
, the positive integer key-dimension is
the number of keys composed to make a primary-key for this table.
The list of symbols column-types describes the types of each
column.
#t
if the base table associated with base-id was
removed from the low level database lldb, and #f
otherwise.
Any 2 arguments of the supported type passed to the returned function
which are not equal?
must result in returned values which are not
equal?
.
Any 2 lists of supported types (which must at least include symbols and
non-negative integers) passed to the returned function which are not
equal?
must result in returned values which are not
equal?
.
(make-list-keyifier key-dimension types)
.
This procedure returns a key which is equal?
to the
column-numberth element of the list which was passed to create
combined-key. The list types must have at least
key-dimension elements.
(make-list-keyifier key-dimension types)
.
This procedure returns a list of keys which are elementwise
equal?
to the list which was passed to create combined-key.
In the following functions, the key argument can always be assumed to be the value returned by a call to a keyify routine.
In contrast, a match-key argument is a list of length equal to the number of primary keys. The match-key restricts the actions of the table command to those records whose primary keys all satisfy the corresponding element of the match-key list. The elements and their actions are:
#f
- The false value matches any key in the corresponding position.
- an object of type procedure
- This procedure must take a single argument, the key in the corresponding position. Any key for which the procedure returns a non-false value is a match; Any key for which the procedure returns a
#f
is not.- other values
- Any other value matches only those keys
equal?
to it.
#f
value if there is a row associated with
key in the table opened in handle and #f
otherwise.
#f
otherwise.
#t
if symbol names a type allowed as a column value
by the implementation, and #f
otherwise. At a minimum, an
implementation must support the types integer
, symbol
,
string
, boolean
, and base-id
.
#t
if symbol names a type allowed as a key value by
the implementation, and #f
otherwise. At a minimum, an
implementation must support the types integer
, and symbol
.
integer
symbol
boolean
#t
or #f
.
base-id
open-table
. The value of catalog-id must be an acceptable
base-id
.
(require 'relational-database)
This package implements a database system inspired by the Relational Model (E. F. Codd, A Relational Model of Data for Large Shared Data Banks). An SLIB relational database implementation can be created from any section Base Table implementation.
Most nontrivial programs contain databases: Makefiles, configure scripts, file backup, calendars, editors, source revision control, CAD systems, display managers, menu GUIs, games, parsers, debuggers, profilers, and even error reporting are all rife with databases. Coding databases is such a common activity in programming that many may not be aware of how often they do it.
A database often starts as a dispatch in a program. The author, perhaps because of the need to make the dispatch configurable, the need for correlating dispatch in other routines, or because of changes or growth, devises a data structure to contain the information, a routine for interpreting that data structure, and perhaps routines for augmenting and modifying the stored data. The dispatch must be converted into this form and tested.
The programmer may need to devise an interactive program for enabling easy examination and modification of the information contained in this database. Often, in an attempt to foster modularity and avoid delays in release, intermediate file formats for the database information are devised. It often turns out that users prefer modifying these intermediate files with a text editor to using the interactive program in order to do operations (such as global changes) not forseen by the program's author.
In order to address this need, the conscientious software engineer may even provide a scripting language to allow users to make repetitive database changes. Users will grumble that they need to read a large manual and learn yet another programming language (even if it almost has language "xyz" syntax) in order to do simple configuration.
All of these facilities need to be designed, coded, debugged, documented, and supported; often causing what was very simple in concept to become a major developement project.
This view of databases just outlined is somewhat the reverse of the view of the originators of the Relational Model of database abstraction. The relational model was devised to unify and allow interoperation of large multi-user databases running on diverse platforms. A fairly general purpose "Comprehensive Language" for database manipulations is mandated (but not specified) as part of the relational model for databases.
One aspect of the Relational Model of some importance is that the "Comprehensive Language" must be expressible in some form which can be stored in the database. This frees the programmer from having to make programs data-driven in order to use a database.
This package includes as one of its basic supported types Scheme
expressions. This type allows expressions as defined by the
Scheme standards to be stored in the database. Using slib:eval
retrieved expressions can be evaluated (in the top-level environment).
Scheme's lambda
facilitates closure of environments, modularity,
etc. so that procedures (which could not be stored directly most
databases) can still be effectively retrieved. Since slib:eval
evaluates expressions in the top-level environment, built-in and user
defined procedures can be easily accessed by name.
This package's purpose is to standardize (through a common interface) database creation and usage in Scheme programs. The relational model's provision for inclusion of language expressions as data as well as the description (in tables, of course) of all of its tables assures that relational databases are powerful enough to assume the roles currently played by thousands of ad-hoc routines and data formats.
Such standardization to a relational-like model brings many benefits:
Returns a procedure implementing a relational database using the base-table-implementation.
All of the operations of a base table implementation are accessed
through a procedure defined by require
ing that implementation.
Similarly, all of the operations of the relational database
implementation are accessed through the procedure returned by
make-relational-system
. For instance, a new relational database
could be created from the procedure returned by
make-relational-system
by:
(require 'alist-table) (define relational-alist-system (make-relational-system alist-table)) (define create-alist-database (relational-alist-system 'create-database)) (define my-database (create-alist-database "mydata.db"))
What follows are the descriptions of the methods available from
relational system returned by a call to make-relational-system
.
Returns an open, nearly empty relational database associated with
filename. The only tables defined are the system catalog and
domain table. Calling the close-database
method on this database
and possibly other operations will cause filename to be written
to. If filename is #f
a temporary, non-disk based database
will be created if such can be supported by the underlying base table
implelentation. If the database cannot be created as specified
#f
is returned. For the fields and layout of descriptor tables,
See section Catalog Representation
Returns an open relational database associated with filename. If
mutable? is #t
, this database will have methods capable of
effecting change to the database. If mutable? is #f
, only
methods for inquiring the database will be available. Calling the
close-database
(and possibly other) method on a mutable?
database will cause filename to be written to. If the database
cannot be opened as specified #f
is returned.
These are the descriptions of the methods available from an open relational database. A method is retrieved from a database by calling the database with the symbol name of the operation. For example:
(define my-database (create-alist-database "mydata.db")) (define telephone-table-desc ((my-database 'create-table) 'telephone-table-desc))
#t
is returned. Otherwise, #f
is returned.
close-database
(and
possibly other) method on this database will cause filename to be
written to. If filename is #f
this database will be
changed to a temporary, non-disk based database if such can be supported
by the underlying base table implelentation. If the operations
completed successfully, #t
is returned. Otherwise, #f
is
returned.
#t
if table-name exists in the system catalog,
otherwise returns #f
.
#f
.
These methods will be present only in databases which are mutable?.
#f
otherwise.
#f
. For the fields and layout of descriptor tables,
See section Catalog Representation.
#f
.
These are the descriptions of the methods available from an open relational table. A method is retrieved from a table by calling the table with the symbol name of the operation. For example:
(define telephone-table-desc ((my-database 'create-table) 'telephone-table-desc)) (require 'common-list-functions) (define ndrp (telephone-table-desc 'row:insert)) (ndrp '(1 #t name #f string)) (ndrp '(2 #f telephone (lambda (d) (and (string? d) (> (string-length d) 2) (every (lambda (c) (memv c '(#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9 #\+ #\( #\ #\) #\-))) (string->list d)))) string))
Some operations described below require primary key arguments. Primary keys arguments are denoted key1 key2 .... It is an error to call an operation for a table which takes primary key arguments with the wrong number of primary keys for that table.
The term row used below refers to a Scheme list of values (one for
each column) in the order specified in the descriptor (table) for this
table. Missing values appear as #f
. Primary keys must not
be missing.
#f
otherwise.
((plat 'get 'processor) 'djgpp) => i386 ((plat 'get 'processor) 'be-os) => #f
((plat 'get* 'processor)) => (i386 8086 i386 8086 i386 i386 8086 m68000 m68000 m68000 m68000 m68000 powerpc) ((plat 'get* 'processor) #f) => (i386 8086 i386 8086 i386 i386 8086 m68000 m68000 m68000 m68000 m68000 powerpc) (define (a-key? key) (char=? #\a (string-ref (symbol->string key) 0))) ((plat 'get* 'processor) a-key?) => (m68000 m68000 m68000 m68000 m68000 powerpc) ((plat 'get* 'name) a-key?) => (atari-st-turbo-c atari-st-gcc amiga-sas/c-5.10 amiga-aztec amiga-dice-c aix)
#f
otherwise.
((plat 'row:retrieve) 'linux) => (linux i386 linux gcc) ((plat 'row:retrieve) 'multics) => #f
((plat 'row:retrieve*) a-key?) => ((atari-st-turbo-c m68000 atari turbo-c) (atari-st-gcc m68000 atari gcc) (amiga-sas/c-5.10 m68000 amiga sas/c) (amiga-aztec m68000 amiga aztec) (amiga-dice-c m68000 amiga dice-c) (aix powerpc aix -))
#f
otherwise.
Real relational programmers would use some least-upper-bound join for every row to get them in order; But we don't have joins yet.
The (optional) match-key1 ... arguments are used to restrict
actions of a whole-table operation to a subset of that table. Those
procedures (returned by methods) which accept match-key arguments will
accept any number of match-key arguments between zero and the number of
primary keys in the table. Any unspecified match-key arguments
default to #f
.
The match-key1 ... restrict the actions of the table command to those records whose primary keys each satisfy the corresponding match-key argument. The arguments and their actions are:
#f
- The false value matches any key in the corresponding position.
- an object of type procedure
- This procedure must take a single argument, the key in the corresponding position. Any key for which the procedure returns a non-false value is a match; Any key for which the procedure returns a
#f
is not.- other values
- Any other value matches only those keys
equal?
to it.
Each database (in an implementation) has a system catalog which describes all the user accessible tables in that database (including itself).
The system catalog base table has the following fields. PRI
indicates a primary key for that table.
PRI table-name column-limit the highest column number coltab-name descriptor table name bastab-id data base table identifier user-integrity-rule view-procedure A scheme thunk which, when called, produces a handle for the view. coltab and bastab are specified if and only if view-procedure is not.
Descriptors for base tables (not views) are tables (pointed to by system catalog). Descriptor (base) tables have the fields:
PRI column-number sequential integers from 1 primary-key? boolean TRUE for primary key components column-name column-integrity-rule domain-name
A primary key is any column marked as primary-key?
in the
corresponding descriptor table. All the primary-key?
columns
must have lower column numbers than any non-primary-key?
columns.
Every table must have at least one primary key. Primary keys must be
sufficient to distinguish all rows from each other in the table. All of
the system defined tables have a single primary key.
This package currently supports tables having from 1 to 4 primary keys if there are non-primary columns, and any (natural) number if all columns are primary keys. If you need more than 4 primary keys, I would like to hear what you are doing!
A domain is a category describing the allowable values to occur in a column. It is described by a (base) table with the fields:
PRI domain-name foreign-table domain-integrity-rule type-id type-param
The type-id field value is a symbol. This symbol may be used by the underlying base table implementation in storing that field.
If the foreign-table
field is non-#f
then that field names
a table from the catalog. The values for that domain must match a
primary key of the table referenced by the type-param (or
#f
, if allowed). This package currently does not support
composite foreign-keys.
The types for which support is planned are:
atom symbol string [<length>] number [<base>] money <currency> date-time boolean foreign-key <table-name> expression virtual <expression>
Although `rdms.scm' is not large, I found it very difficult to write (six rewrites). I am not aware of any other examples of a generalized relational system (although there is little new in CS). I left out several aspects of the Relational model in order to simplify the job. The major features lacking (which might be addressed portably) are views, transaction boundaries, and protection.
Protection needs a model for specifying priveledges. Given how operations are accessed from handles it should not be difficult to restrict table accesses to those allowed for that user.
The system catalog has a field called view-procedure
. This
should allow a purely functional implementation of views. This will
work but is unsatisfying for views resulting from a selection
(subset of rows); for whole table operations it will not be possible to
reduce the number of keys scanned over when the selection is specified
only by an opaque procedure.
Transaction boundaries present the most intriguing area. Transaction boundaries are actually a feature of the "Comprehensive Language" of the Relational database and not of the database. Scheme would seem to provide the opportunity for an extremely clean semantics for transaction boundaries since the builtin procedures with side effects are small in number and easily identified.
These side-effect builtin procedures might all be portably redefined to versions which properly handled transactions. Compiled library routines would need to be recompiled as well. Many system extensions (delete-file, system, etc.) would also need to be redefined.
There are 2 scope issues that must be resolved for multiprocess transaction boundaries:
dynamic-wind
would
provide a workable hook into process switching for many implementations.
This enhancement wraps a utility layer on relational-database
which provides:
*commands*
table in database.
Also included are utilities which provide:
for any SLIB relational database.
*commands*
table)
relational database (with base-table type base-table-type)
associated with filename.
open-database
will attempt to deduce the correct
base-table-type. If the database can not be opened or if it lacks the
*commands*
table, #f
is returned.
The table *commands*
in an enhanced relational-database has
the fields (with domains):
PRI name symbol parameters parameter-list procedure expression documentation string
The parameters
field is a foreign key (domain
parameter-list
) of the *catalog-data*
table and should
have the value of a table described by *parameter-columns*
. This
parameter-list
table describes the arguments suitable for passing
to the associated command. The intent of this table is to be of a form
such that different user-interfaces (for instance, pull-down menus or
plain-text queries) can operate from the same table. A
parameter-list
table has the following fields:
PRI index uint name symbol arity parameter-arity domain domain defaulter expression expander expression documentation string
The arity
field can take the values:
single
optional
boolean
#f
is substituted) is acceptable.
nary
nary1
The domain
field specifies the domain which a parameter or
parameters in the index
th field must satisfy.
The defaulter
field is an expression whose value is either
#f
or a procedure of one argument (the parameter-list) which
returns a list of the default value or values as appropriate.
Note that since the defaulter
procedure is called every time a
default parameter is needed for this column, sticky defaults can
be implemented using shared state with the domain-integrity-rule.
When an enhanced relational-database is called with a symbol which
matches a name in the *commands*
table, the associated
procedure expression is evaluated and applied to the enhanced
relational-database. A procedure should then be returned which the user
can invoke on (optional) arguments.
The command *initialize*
is special. If present in the
*commands*
table, open-database
or open-database!
will return the value of the *initialize*
command. Notice that
arbitrary code can be run when the *initialize*
procedure is
automatically applied to the enhanced relational-database.
Note also that if you wish to shadow or hide from the user
relational-database methods described in section Relational Database Operations, this can be done by a dispatch in the closure returned by
the *initialize*
expression rather than by entries in the
*commands*
table if it is desired that the underlying methods
remain accessible to code in the *commands*
table.
name
field of the command's parameter-table.
index
field of the command's parameter-table.
arity
field of the command's parameter-table. For a
description of arity
see table above.
type-id
field of the contents of the domain
of the
command's parameter-table.
defaulters
field of the command's parameter-table.
nary
arity
parameters.
(alias parameter-name)
. There can be
more than one alias per parameter-name.
For information about parameters, See section Parameter lists. Here is an
example of setting up a command with arguments and parsing those
arguments from a getopt
style argument list (see section Getopt).
(require 'database-utilities) (require 'fluid-let) (require 'parameters) (require 'getopt) (define my-rdb (create-database #f 'alist-table)) (define-tables my-rdb '(foo-params *parameter-columns* *parameter-columns* ((1 single-string single string (lambda (pl) '("str")) #f "single string") (2 nary-symbols nary symbol (lambda (pl) '()) #f "zero or more symbols") (3 nary1-symbols nary1 symbol (lambda (pl) '(symb)) #f "one or more symbols") (4 optional-number optional uint (lambda (pl) '()) #f "zero or one number") (5 flag boolean boolean (lambda (pl) '(#f)) #f "a boolean flag"))) '(foo-pnames ((name string)) ((parameter-index uint)) (("s" 1) ("single-string" 1) ("n" 2) ("nary-symbols" 2) ("N" 3) ("nary1-symbols" 3) ("o" 4) ("optional-number" 4) ("f" 5) ("flag" 5))) '(my-commands ((name symbol)) ((parameters parameter-list) (parameter-names parameter-name-translation) (procedure expression) (documentation string)) ((foo foo-params foo-pnames (lambda (rdb) (lambda args (print args))) "test command arguments")))) (define (dbutil:serve-command-line rdb command-table command argc argv) (set! argv (if (vector? argv) (vector->list argv) argv)) ((make-command-server rdb command-table) command (lambda (comname comval options positions arities types defaulters dirs aliases) (apply comval (getopt->arglist argc argv options positions arities types defaulters dirs aliases))))) (define (cmd . opts) (fluid-let ((*optind* 1)) (printf "%-34s => " (call-with-output-string (lambda (pt) (write (cons 'cmd opts) pt)))) (set! opts (cons "cmd" opts)) (force-output) (dbutil:serve-command-line my-rdb 'my-commands 'foo (length opts) opts))) (cmd) => ("str" () (symb) () #f) (cmd "-f") => ("str" () (symb) () #t) (cmd "--flag") => ("str" () (symb) () #t) (cmd "-o177") => ("str" () (symb) (177) #f) (cmd "-o" "177") => ("str" () (symb) (177) #f) (cmd "--optional" "621") => ("str" () (symb) (621) #f) (cmd "--optional=621") => ("str" () (symb) (621) #f) (cmd "-s" "speciality") => ("speciality" () (symb) () #f) (cmd "-sspeciality") => ("speciality" () (symb) () #f) (cmd "--single" "serendipity") => ("serendipity" () (symb) () #f) (cmd "--single=serendipity") => ("serendipity" () (symb) () #f) (cmd "-n" "gravity" "piety") => ("str" () (piety gravity) () #f) (cmd "-ngravity" "piety") => ("str" () (piety gravity) () #f) (cmd "--nary" "chastity") => ("str" () (chastity) () #f) (cmd "--nary=chastity" "") => ("str" () ( chastity) () #f) (cmd "-N" "calamity") => ("str" () (calamity) () #f) (cmd "-Ncalamity") => ("str" () (calamity) () #f) (cmd "--nary1" "surety") => ("str" () (surety) () #f) (cmd "--nary1=surety") => ("str" () (surety) () #f) (cmd "-N" "levity" "fealty") => ("str" () (fealty levity) () #f) (cmd "-Nlevity" "fealty") => ("str" () (fealty levity) () #f) (cmd "--nary1" "surety" "brevity") => ("str" () (brevity surety) () #f) (cmd "--nary1=surety" "brevity") => ("str" () (brevity surety) () #f) (cmd "-?") -| Usage: cmd [OPTION ARGUMENT ...] ... -f, --flag -o, --optional[=]<number> -n, --nary[=]<symbols> ... -N, --nary1[=]<symbols> ... -s, --single[=]<string> ERROR: getopt->parameter-list "unrecognized option" "-?"
Some commands are defined in all extended relational-databases. The are called just like section Relational Database Operations.
(car domain-row)
and
returns #t
. Otherwise returns #f
.
For the fields and layout of the domain table, See section Catalog Representation. Currently, these fields are
The following example adds 3 domains to the `build' database.
`Optstring' is either a string or #f
. filename
is a
string and build-whats
is a symbol.
(for-each (build 'add-domain) '((optstring #f (lambda (x) (or (not x) (string? x))) string #f) (filename #f #f string #f) (build-whats #f #f symbol #f)))
(<name> <descriptor-name> <descriptor-name> <rows>)
or
(<name> <primary-key-fields> <other-fields> <rows>)
where <name> is the table name, <descriptor-name> is the symbol name of a descriptor table, <primary-key-fields> and <other-fields> describe the primary keys and other fields respectively, and <rows> is a list of data rows to be added to the table.
<primary-key-fields> and <other-fields> are lists of field descriptors of the form:
(<column-name> <domain>)
or
(<column-name> <domain> <column-integrity-rule>)
where <column-name> is the column name, <domain> is the domain
of the column, and <column-integrity-rule> is an expression whose
value is a procedure of one argument (which returns #f
to signal
an error).
If <domain> is not a defined domain name and it matches the name of this table or an already defined (in one of spec-0 ...) single key field table, a foriegn-key domain will be created for it.
The following example shows a new database with the name of `foo.db' being created with tables describing processor families and processor/os/compiler combinations.
The database command define-tables
is defined to call
define-tables
with its arguments. The database is also
configured to print `Welcome' when the database is opened. The
database is then closed and reopened.
(require 'database-utilities) (define my-rdb (create-database "foo.db" 'alist-table)) (define-tables my-rdb '(*commands* ((name symbol)) ((parameters parameter-list) (procedure expression) (documentation string)) ((define-tables no-parameters no-parameter-names (lambda (rdb) (lambda specs (apply define-tables rdb specs))) "Create or Augment tables from list of specs") (*initialize* no-parameters no-parameter-names (lambda (rdb) (display "Welcome") (newline) rdb) "Print Welcome")))) ((my-rdb 'define-tables) '(processor-family ((family atom)) ((also-ran processor-family)) ((m68000 #f) (m68030 m68000) (i386 8086) (8086 #f) (powerpc #f))) '(platform ((name symbol)) ((processor processor-family) (os symbol) (compiler symbol)) ((aix powerpc aix -) (amiga-dice-c m68000 amiga dice-c) (amiga-aztec m68000 amiga aztec) (amiga-sas/c-5.10 m68000 amiga sas/c) (atari-st-gcc m68000 atari gcc) (atari-st-turbo-c m68000 atari turbo-c) (borland-c-3.1 8086 ms-dos borland-c) (djgpp i386 ms-dos gcc) (linux i386 linux gcc) (microsoft-c 8086 ms-dos microsoft-c) (os/2-emx i386 os/2 gcc) (turbo-c-2 8086 ms-dos turbo-c) (watcom-9.0 i386 ms-dos watcom)))) ((my-rdb 'close-database)) (set! my-rdb (open-database "foo.db" 'alist-table)) -| Welcome
Code for generating database reports is in `report.scm'. After
writing it using format
, I discovered that Common-Lisp
format
is not useable for this application because there is no
mechanismm for truncating fields. `report.scm' needs to be
rewritten using printf
.
*reports*
in the relational database rdb.
destination is a port, string, or symbol. If destination is
a:
Each row in the table *reports* has the fields:
format
string. At the beginning and end of each page
respectively, format
is called with this string and the (list of)
column-names of this table.
format
string. For each row in the table, format
is
called with this string and the row.
0
if a row's lines should not be broken over page
boundaries.
Each row in the table *printers* has the fields:
The report is prepared as follows:
Format
(see section Format (version 3.0)) is called with the header
field
and the (list of) column-names
of the table.
Format
is called with the reporter
field and (on
successive calls) each record in the natural order for the table. A
count is kept of the number of newlines output by format. When the
number of newlines to be output exceeds the number of lines per page,
the set of lines will be broken if there are more than
minimum-break
left on this page and the number of lines for this
row is larger or equal to twice minimum-break
.
Format
is called with the footer
field and the (list of)
column-names
of the table. The footer field should not output a
newline.
(require 'database-browse)
Prints the names of all the tables in database and sets browse's default to database.
Prints the names of all the tables in the default database.
For each record of the table named by the symbol table-name, prints a line composed of all the field values.
Opens the database named by the string pathname, prints the names of all its tables, and sets browse's default to the database.
Sets browse's default to database and prints the records of the table named by the symbol table-name.
Opens the database named by the string pathname and sets browse's
default to it; browse
prints the records of the table named by
the symbol table-name.
Balanced binary trees are a useful data structure for maintaining large sets of ordered objects or sets of associations whose keys are ordered. MIT Scheme has an comprehensive implementation of weight-balanced binary trees which has several advantages over the other data structures for large aggregates:
(+ 1 x)
modifies neither the constant 1 nor the value bound to x
. The
trees are referentially transparent thus the programmer need not worry
about copying the trees. Referential transparency allows space
efficiency to be achieved by sharing subtrees.
These features make weight-balanced trees suitable for a wide range of applications, especially those that require large numbers of sets or discrete maps. Applications that have a few global databases and/or concentrate on element-level operations like insertion and lookup are probably better off using hash-tables or red-black trees.
The size of a tree is the number of associations that it contains. Weight balanced binary trees are balanced to keep the sizes of the subtrees of each node within a constant factor of each other. This ensures logarithmic times for single-path operations (like lookup and insertion). A weight balanced tree takes space that is proportional to the number of associations in the tree. For the current implementation, the constant of proportionality is six words per association.
Weight balanced trees can be used as an implementation for either
discrete sets or discrete maps (associations). Sets are implemented by
ignoring the datum that is associated with the key. Under this scheme
if an associations exists in the tree this indicates that the key of the
association is a member of the set. Typically a value such as
()
, #t
or #f
is associated with the key.
Many operations can be viewed as computing a result that, depending on
whether the tree arguments are thought of as sets or maps, is known by
two different names. An example is wt-tree/member?
, which, when
regarding the tree argument as a set, computes the set membership
operation, but, when regarding the tree as a discrete map,
wt-tree/member?
is the predicate testing if the map is defined at
an element in its domain. Most names in this package have been chosen
based on interpreting the trees as sets, hence the name
wt-tree/member?
rather than wt-tree/defined-at?
.
The weight balanced tree implementation is a run-time-loadable option. To use weight balanced trees, execute
(load-option 'wt-tree)
once before calling any of the procedures defined here.
Binary trees require there to be a total order on the keys used to arrange the elements in the tree. Weight balanced trees are organized by types, where the type is an object encapsulating the ordering relation. Creating a tree is a two-stage process. First a tree type must be created from the predicate which gives the ordering. The tree type is then used for making trees, either empty or singleton trees or trees from other aggregate structures like association lists. Once created, a tree `knows' its type and the type is used to test compatibility between trees in operations taking two trees. Usually a small number of tree types are created at the beginning of a program and used many times throughout the program's execution.
a
, b
and c
:
(key<? a a) => #f (and (key<? a b) (key<? b a)) => #f (if (and (key<? a b) (key<? b c)) (key<? a c) #t) => #t
Two key values are assumed to be equal if neither is less than the other by key<?.
Each call to make-wt-tree-type
returns a distinct value, and
trees are only compatible if their tree types are eq?
. A
consequence is that trees that are intended to be used in binary tree
operations must all be created with a tree type originating from the
same call to make-wt-tree-type
.
Number-wt-type
could have been defined by
(define number-wt-type (make-wt-tree-type <))
String-wt-type
could have been defined by
(define string-wt-type (make-wt-tree-type string<?))
make-wt-tree-type
; the returned tree has this type.
make-wt-tree-type
; the returned tree has this type.
(lambda (type alist) (let ((tree (make-wt-tree type))) (for-each (lambda (association) (wt-tree/add! tree (car association) (cdr association))) alist) tree))
This section describes the basic tree operations on weight balanced trees. These operations are the usual tree operations for insertion, deletion and lookup, some predicates and a procedure for determining the number of associations in a tree.
#t
if object is a weight-balanced tree, otherwise
returns #f
.
#t
if wt-tree contains no associations, otherwise
returns #f
.
#t
if wt-tree contains an association for
key, otherwise returns #f
. The average and worst-case
times required by this operation are proportional to the logarithm of
the number of associations in wt-tree.
In the following the size of a tree is the number of associations that the tree contains, and a smaller tree contains fewer associations.
wt-tree/union
computes the map override of wt-tree-1 by
wt-tree-2. If the trees are viewed as sets the result is the set
union of the arguments.
The worst-case time required by this operation
is proportional to the sum of the sizes of both trees.
If the minimum key of one tree is greater than the maximum key of
the other tree then the time required is at worst proportional to
the logarithm of the size of the larger tree.
wt-tree/intersection
computes the domain restriction of
wt-tree-1 to (the domain of) wt-tree-2.
The time required by this operation is never worse that proportional to
the sum of the sizes of the trees.
#t
iff the key of each association in wt-tree-1 is
the key of some association in wt-tree-2, otherwise returns #f
.
Viewed as a set operation, wt-tree/subset?
is the improper subset
predicate.
A proper subset predicate can be constructed:
(define (proper-subset? s1 s2) (and (wt-tree/subset? s1 s2) (< (wt-tree/size s1) (wt-tree/size s2))))
As a discrete map operation, wt-tree/subset?
is the subset
test on the domain(s) of the map(s). In the worst-case the time
required by this operation is proportional to the size of
wt-tree-1.
#t
iff for every association in wt-tree-1 there is
an association in wt-tree-2 that has the same key, and vice
versa.
Viewing the arguments as sets wt-tree/set-equal?
is the set
equality predicate. As a map operation it determines if two maps are
defined on the same domain.
This procedure is equivalent to
(lambda (wt-tree-1 wt-tree-2) (and (wt-tree/subset? wt-tree-1 wt-tree-2 (wt-tree/subset? wt-tree-2 wt-tree-1)))
In the worst-case the time required by this operation is proportional to the size of the smaller tree.
wt-tree/fold
takes time
proportional to the size of wt-tree.
A sorted association list can be derived simply:
(wt-tree/fold (lambda (key datum list) (cons (cons key datum) list)) '() wt-tree))
The data in the associations can be summed like this:
(wt-tree/fold (lambda (key datum sum) (+ sum datum)) 0 wt-tree)
wt-tree/for-each
takes time proportional to in the size of
wt-tree.
The example prints the tree:
(wt-tree/for-each (lambda (key value) (display (list key value))) wt-tree))
Weight balanced trees support operations that view the tree as sorted sequence of associations. Elements of the sequence can be accessed by position, and the position of an element in the sequence can be determined, both in logarthmic time.
wt-tree/index
returns the indexth key,
wt-tree/index-datum
returns the datum associated with the
indexth key and wt-tree/index-pair
returns a new pair
(key . datum)
which is the cons
of the
indexth key and its datum. The average and worst-case times
required by this operation are proportional to the logarithm of the
number of associations in the tree.
These operations signal an error if the tree is empty, if
index<0
, or if index is greater than or equal to the
number of associations in the tree.
Indexing can be used to find the median and maximum keys in the tree as follows:
median: (wt-tree/index wt-tree (quotient (wt-tree/size wt-tree) 2)) maximum: (wt-tree/index wt-tree (-1+ (wt-tree/size wt-tree)))
#f
if the tree
has no association with for key. This procedure returns either an
exact non-negative integer or #f
. The average and worst-case
times required by this operation are proportional to the logarithm of
the number of associations in the tree.
wt-tree/min
returns the least key,
wt-tree/min-datum
returns the datum associated with the least key
and wt-tree/min-pair
returns a new pair (key . datum)
which is the cons
of the minimum key and its datum. The average
and worst-case times required by this operation are proportional to the
logarithm of the number of associations in the tree.
These operations signal an error if the tree is empty. They could be written
(define (wt-tree/min tree) (wt-tree/index tree 0)) (define (wt-tree/min-datum tree) (wt-tree/index-datum tree 0)) (define (wt-tree/min-pair tree) (wt-tree/index-pair tree 0))
(wt-tree/delete wt-tree (wt-tree/min wt-tree))
(wt-tree/delete! wt-tree (wt-tree/min wt-tree))
Go to the first, previous, next, last section, table of contents.