CYPRESS DOCUMENTATION
CYPRESS DOCUMENTATION
CYPRESS DOCUMENTATION
Cypress Database System Documentation
R. G. G. Cattell
Release as    [Indigo]<CedarŽ>Documentation>CypressDoc.Tioga
Written by    R. G. G. Cattell, June 13, 1983
Last edited    by Beach, February 9, 1984 5:40:26 pm PST
    by Jim Donahue, February 10, 1984 12:16:55 pm PST
Abstract: This document describes the Cypress system, and is aimed at potential writers of database applications in the Cedar 
programming environment.  It should be accurate as of the date above, and is recommended as better documentation than CSL Report 
83-4, from which it was derived.  Suggestions are welcomed on both the design of Cypress and its exposition in this document.  We 
will assume little knowledge of database systems, and little knowledge of Cedar.  We will not explain the motivation for this 
particular design of Cypress; see the CSL Report for that.  You should also consult the documentation for database tools not 
described here; see a database wizard for details.
XEROX            Xerox Corporation
                Palo Alto Research Center
                3333 Coyote Hill Road
                Palo Alto, California 94304

DRAFT  For Internal Xerox Use Only  DRAFT
TABLE OF CONTENTS
1.   Introduction   1
2.   Cypress data model concepts   3
2.1   Data independence
2.2   Basic primitives
2.3   Names and keys
2.4   Basic operations
2.5   Aggregate operations
2.6   Convenience operations
2.7   Normalization
2.8   Segments
3.   Model level interface   19
3.1   Types
3.2   Transactions and segments
3.3   Data schema definition
3.4   Basic operations
3.5   Query operations
3.6   System domains and relations
3.7   Errors
4.  Application example   39
4.1   A database application
4.2   Schema design
4.3   Example program
1.  Introduction
INTRODUCTION
INTRODUCTION
Cypress was motivated by the need for database management facilities in Cedar.  Its goals are (1) convenience, making it easy to 
build new database applications in Cedar, and (2) reasonable performance for a wide class of applications.  If you are unsure 
whether an application you have in mind is appropriate, talk to someone who has built one.  You should also consult others to see 
what auxiliary documentation and folklore is available.
The conceptual data structure provided by a database system, analogous to the Mesa type system, is called a data model.  The data 
model specifies how data may be structured and accessed; thus the data model is what you need to know in order to use the system.  
The Cypress data model described herein includes features that are accepted in some form in a number of models in the literature, 
particularly those that seemed most useful for database applications we envision in CSL.
The data model is described conceptually in Section 2.  In Section 3 we describe the Cedar interface to Cypress:  DB.mesa.  
Finally, Section 4 provides a simple example of Cypress use. We cover only the abstract model and the DB interface, not its 
implementation or rationale.  The reader interested in these should consult CSL Report 83-4.  A set of formal axioms for the model 
can also be found there.
Clients interested in tools for examining or modifying databases, either as an end-user or for debugging an application program, 
should consult the database tools documentation: [Indigo]<CedarŽ>Documentation>SquirrelDoc.tioga and .press, where Ž refers to the 
current cedar release version number such as 5.1.
2. Cypress data model concepts
CYPRESS DATA MODEL CONCEPTS
CYPRESS DATA MODEL CONCEPTS
In this section, we give an informal description of the Cypress data model.  We describe the particulars of the Cedar interface in 
Section 3. 
2.1 Data independence
We deal here with the conceptual data model: the logical primitives for data access and data type definition.  This should be 
carefully distinguished from the physical data storage and access mechansisms.  The physical representation of data is hidden as 
much as possible from the database client to facilitate data independence, the guarantee that a user's program will continue to 
work (perhaps with a change in efficiency) even though the physical data representation is redesigned. 
For any particular database using our conceptual data model, the actual specification of the types of data in the database, using 
the primitives the model provides, is termed the data schema.  Note that a mapping must be provided between the conceptual data 
model and the physical representation, either automatically or with further instruction from the client; we will do some of both.  
The logical to physical mapping is intimately associated with the performance of the database system as viewed by the user 
performing operations at the conceptual level.
2.2 Basic primitives
Three basic primitives are defined in the model:  an entity, datum, and relationship.
An entity represents an abstract or concrete object in the world:  a person, an organization, a document, a product, an event.  In 
programming languages and knowledge representation entities have variously been referred to as atoms, symbols, and nodes.  A datum, 
unlike an entity, represents literal information such as times, weights, part names, or phone numbers.  Character strings and 
integers are possible datum types.
It is a policy decision whether something is represented as an entity or merely a datum:  e.g., an employee's spouse may be 
represented in a database system as a datum (the spouse's name), or the spouse may be an entity in itself.  The database system 
provides a higher level of logical integrity checking for entities than for datum values, as we will see later:  unique entity 
identifiers, checks on entity types, and removal of dependent data upon entity deletion.  We shall discuss the entity/datum choice 
further in Section 4.2.
We will use the term value to refer to something that can be either a datum or an entity.  In many programming languages, there is 
no reason to distinguish entity values from datum values.  Indeed, most of the Cypress operations deal with any kind of value, and 
some make it transparent to the caller whether an entity or datum value is involved.  The transparent case makes relational 
operations possible in our model, as we will see in Section 2.5.
A relationship is a tuple whose elements are [entity or datum] values.  We refer to the elements (fields) of relationships by name 
instead of position.  These names for positions are called attributes.
Note that we have separated the representatives of unique objects (entities) from the representation of information about objects 
(relationships), unlike some object-oriented programming languages and data models.  Therefore an entity is not an "object" (or 
"record") in the programming language sense, although entities are representatives of real-world objects. 
We also define entity types, datum types, and relationship types.  These are called domains, datatypes, and relations, 
respectively.  We make use of these three types through one fundamental type constraint:   every relationship in a relation has the 
same attributes, and the values associated with each attribute must be from a pre-specified domain or datatype.  One might think of 
a relation as a "record type" in a programming language, although relations permit more powerful operations than record types.
As an example, consider a member relation that specifies that a given person is a member of a given organization with a given job 
title, as in the following figure.  The person and organization might be entities, while the title might be a string datum.  We 
relax the fundamental type constraint somewhat in allowing a lattice of types of domains:  a particular value may then belong to 
the pre-specified domain or one of its sub-domains.  For example, one could be a member of a University, a Company, or any other 
type of Organization one chooses to define.  Other relations, e.g. an "offers-course" relation, might apply only to a University.
CypressFig1.press  leftMargin: 1 in,  topMargin: 1 in,  width: 2.125 in,  height: .75 in
Relationships represent facts about the world.   A relation (a set of relationships) represents a kind of relationship in which 
entities and values can participate.  We will schematically represent that John Smith is a member of Acme company with title 
"manager" by drawing lines for the relationship (labelled with the relation and its attributes), a circle for each entity, and a 
box for the datum:
CypressFig2.press  leftMargin: 1 in,  topMargin: 1 in,  width: 2.125 in,  height: .75 in
One should normally think of relations as types, not sets.  A relation can be defined as the set of all relationships of its type, 
however, and thus can be treated as a relation in the normal mathematical sense.  Note that we differ from mathematical convention 
in another minor point:  the use of attributes to refer to tuple elements by name instead of position.  Reference by position is 
therefore not necessary and is in fact not permitted.  We can often omit attribute names without ambiguity since the types of 
participating entities imply their role in a relationship.  However, they are necessary in the general case; e.g., a boss relation 
between a person and a person requires the directionality to define its semantics.
We can summarize the six basic primitives of the data model in tabular form.  Familiarity with these six terms is essential to 
understanding the remainder of this document:
Type    Instance        Instance Represents 
Domain    Entity        physical or abstract object
Datatype    Datum        numerical measurement or symbolic tag
Relation    Relationship        logical correspondence among objects and values
Our terminology might be more consistent if we called a domain an "entity type," and a relation a "relationship type."  Instead we 
have compromised on the terms most widely used in the literature for all six of the basic concepts.  The reader will find the 
remainder of this document much more understandable if these six terms are commited to memory before proceeding.
2.3 Names and keys
The data model also provides primitives to specify uniqueness of relationships and entities.  For relationships, this is done with 
keys, and for entities, with names.  A key for a relation is an attribute or set of attributes whose value uniquely identifies a 
relationship in the relation.  No two relationships may have the same values for a key attribute or attribute set (a relation may 
have more than one key, in which case relationships must have unique values for all keys).  A name acts similarly for a domain.  
The name of an entity must uniquely specify it in its domain.
Consider a spouse relation, a binary relation between a person and a person.  Both of its attributes are keys, because each person 
may have only one spouse (if we choose to define spouse this way!).  For an author relation, neither the person nor the document 
are keys, since a person may author more than one document and a document may have more than one author.  The person and document 
attributes together would comprise a key, however:  there should be only one tuple for a particular person-document pair.
CypressFig3.press  leftMargin: 1 in,  topMargin: 1 in,  width: 2.125 in,  height: .75 in
<==<MLDFig2.3.press<
We have labelled entities with names in the figures.  Names are most useful when they are human-sensible tags for the entities, 
e.g. the title for a document, or the name of a person.  However, their primary function is as a unique entity identifier, so 
non-unique human-sensible names must be represented as relations.  If entities of a domain have more than one unique identifier, 
e.g. social security numbers and employee numbers, then one identifier must be chosen as the domain's entity names and the other 
represented as a relation connecting the entities with the unique alternate identifier (a key of that relation).
We require that every entity have a unique name, although the name may automatically be generated by the database system.  Thus 
every entity may be uniquely identified by the pair:
[domain name, entity name]. 
Some authors in the database literature use the term entity to refer to a real-world object rather than its representation in the 
database system.  When we use the term entity, we refer to the internal entity, the entity "handle" returned by database operations 
and stored in entity-valued variables, not the external entity that the internal entity represents or the entity identifier 
[domain, name] that may be used to uniquely identify an internal entity.  The three are interchangeable, however, since they must 
always be in one-to-one correspondence.  
The reader may find it simple to think of entity-valued attributes of relationships as pointers to the entities, in fact 
bi-directional pointers, since the operations we provide allow access in either direction.  This is a useful analogy.  However, 
there is no constraint that the model be implemented with pointers, and the relationships of a relation could equally well be 
conceptualized as rows of a table whose columns are the attributes of the relation and whose entries for entity-valued attributes 
are the string names of the entities.  For example, the author relationships in the previous figure could also be displayed in the 
form:
Author:
Person    Book
George    Backgammon for Beginners
John    Backgammon for Beginners
John    Advanced Backgammon
Tom    Advanced Backgammon
Thus our introduction of entities to the Relational model does not entail a different representation than, say, a Network model 
might imply, but simply additional integrity checks on entity names and types, and new operations upon entities.  This 
compatibility with the Relational data model is important, as it allows the application of the powerful Relational calculus or 
algebra as a query language.  We return to query languages in Section 2.5.
Note that the only information about an entity associated directly with the entity is the name; this contrasts with most other data 
models.  A person's age or spouse, for example, would be represented in the Cypress data model via age or spouse relations.  Thus 
the relationships in a database are the information-carrying elements:  entities do not have attributes.  However the model provide 
an abbreviation, properties, to access information such as age or spouse in a single operation.  We will discuss properties later.  
In addition, the physical data modelling algorithms can store these data directly with the person entity as a result of the 
relation key information (since a person can have only one age, a field can be allocated for that field in the stored object 
representing a person.)
2.4 Basic operations
The data model provides the capability to define and examine the data schema, and perform operations on entities, relationships, 
and aggregates of entities and relationships.  In this section we discuss the basic operations on entities and relationships.  In 
Section 2.5, we discuss the operations on aggregate types, i.e. domains and relations.  We defer to Section 2.6 the discussion of 
"convenience" operations built upon the basic and aggregate operations.
Four basic operations are defined for entities:
1.   DeclareEntity[domain, name]:  Returns a new or existing entity in a domain.  An entity name must be specified.
2.   DestroyEntity[entity]:  Destroys an entity; this also destroys all relationships that refer to it.
3.   DomainOf[entity]:  Returns the domain of an entity (its type).
4.   NameOf[entity]:  Returns the string name of an entity.
Five basic operations are defined for relationships:
1.   DeclareRelationship[relation, list of attribute values]:  Returns a relationship with the given attribute values in the given 
relation.
2.   DestroyRelationship[relationship]:  Destroys a relationship.
3.   RelationOf[relationship]:  Returns a relationship's relation.
4.   GetF[relationship, attribute]:  Returns the value associated with the given attribute of the given relationship.
5.   SetF[relationship, attribute, value]:  Sets the value of the given relationship attribute.
The operations upon relationships recognize a specially-distinguished undefined value for an attribute.  Unassigned attributes of a 
newly-created relationship have this value.  A client of the data model may retrieve a value with GetF and test whether it equals 
the distinguished undefined value, and may set a previously defined value to be the distinguished undefined value with SetF.
Other "convenience" operations are built on top of the basic operations on entities and relationships:  properties and translucent 
attributes.  They are described in Section 2.6.  Although these operations are not essential to the basis of the Cypress model, 
they do furnish a fundamentally different perspective on the model.  They provide a mechanism to associate information directly 
with entities (instead of through relationships) and to write programs largely independent of attribute types.
The reader will also note that we have ignored issues of concurrent access and protection in the basic operations.  We will see 
later that an underlying transaction, file, and protection is associated with the relation and domain handles used in the basic 
operations.  This convenience allows us to treat concurrency, protection, and data location orthogonally.
2.5 Aggregate operations
There are two kinds of operations upon domains and relations, the aggregate types in our model:  the definition of domains and 
relations, and queries on domains and relations.  We first discuss their definitions.
Schema definition
As in other database models and a few programming languages, the Cypress model is self-representing:  the data schema is stored and 
accessible as data.  Thus application-independent tools can be written without coded-in knowledge of the types of data and their 
relationships.
Client-defined domains, relations, and attributes are represented by entities.  These entities belong to special built-in domains, 
called the system domains:
the Domain domain, with one element (entity) per domain
the Attribute domain, with one element per attribute
the Relation domain, with one element per relation
There is also a predefined Datatype domain, with pre-defined elements RopeType, BoolType, and IntType, called built-in types.  We 
do not allow client-defined datum types at present.
Information about domains, relations, and attributes are represented by system relations in which the system entities participate.  
The pre-defined SubType relation is a binary relation between domains and their subdomains.  There are also predefined binary 
relations that map attributes to information about the attributes:
aRelation:  maps an attribute entity to its relation entity.
aType:  maps an attribute to its type entity (a domain or a built-in type)
aUniqueness:  maps an attribute entity to {TRUE, FALSE}, depending whether it is part of a key of its relation.  We are assuming 
only one key per relation, here; our implementation relaxes this assumption in the case of single-attribute keys.  
The following diagram graphically illustrates a segment of a data schema describing the member relation and several domains.  The 
left side of the figure shows two subdomains of Organization, (Company and University), and the right shows the types and 
uniqueness properties of the member relation's attributes memberOf, memberIs, and memberAs.
CypressFig4.press  leftMargin: 1 in,  topMargin: 1 in,  width: 2.125 in,  height: .75 in
New domains, relations, and attributes are defined by creating entities and relationships in these pre-defined system domains and 
relations.  However, our implementation provides special operations to define the data schema, to simplify error checking.  These 
operations are:
DeclareDomain[name], and for each subtype relationship:
DeclareSubType[superdomain, subdomain]
DeclareRelation[name], and for each attribute:
DeclareAttribute[name, relation, type, uniqueness].
Queries
The operation RelationSubset[relation, attribute value list] enumerates relationships in a relation satisfying specified equality 
constraints on entity-valued attributes and/or range constraints on datum-valued attributes.  For example, RelationSubset might 
enumerate all of the relationships that reference a particular entity in one attribute and have an integer in the range 23 to 52 in 
another.
The operation DomainSubset[domain, name range] enumerates entities in a domain.  The enumeration may optionally be sorted by entity 
name, or restricted to a subset of the entities with a name in a given range.
More complex queries can be implemented in terms of DomainSubset and RelationSubset.  A future implementation will provide a 
MultiRelationSubset operation to efficiently enumerate single queries spanning more than one relation.  MultiRelationSubset 
operates upon a parsed representation of the query language, and produces the same kind of enumeration as RelationSubset.  See 
CSL-83-4 for more details. 
2.6 Convenience operations
Some more convenient specialized operations are built upon the basic operations described in the previous two sections.  They 
implement what we call properties and translucent attributes.  Although theoretically speaking these operations add no power to the 
model, they permit a significantly different perspective on the data access and so should be thought of as part of the model.
Properties
Properties allow the client to treat entities as if they, like relationships, had "attributes."  They provide the convenience of 
treating attributes of relationships that reference an entity as if they were attributes (or properties) of the entity itself.  The 
property operations are: 
1.   GetPList[entity, attribute1, attribute2]:  Attribute1 and attribute2 must be from the same relation.  Returns the values of 
attribute1 for all relationships in the relation that reference the entity via attribute2.  Attribute2 may be omitted, in which 
case it is assumed to be the only other entity-valued attribute of the relation.
2.   GetP[entity, attribute1, attribute2]: this is identical to GetPList except exactly one relationship must reference the entity 
via attribute2; otherwise an error is generated.  GetP always returns one value.
3.   SetPList[entity, attribute1, value list, attribute2]:  Attribute1 and attribute2 must be from the same relation.  Destroys any 
existing relationships whose attribute2 equals the entity, and creates new ones for each value in the list, with attribute1 equal 
to the value, and attribute2 equal to the entity.  Attribute2 can be defaulted as in GetPList.
4.   SetP[entity, attribute1, value, attribute2]:  this is identical to SetPList except it simply adds a new relationship 
referencing the entity instead of destroying any existing ones (unless attribute1 is a key of its relation, in which case the 
existing one must be replaced).
Thus the property operations allow information specified through relationships to be treated as properties of the entity itself, in 
single operations.  The property operations and the operations defined in earlier sections may be used interchangeably, as there is 
only one underlying representation of information:  the relationships.  As an example of the use of properties, consider the 
following database:
CypressFig5.press  leftMargin: 1 in,  topMargin: 1 in,  width: 2.125 in,  height: .75 in
The figure shows the entity John Smith, and three relationships in which he participates:  an age relationship and two member 
relationships,  The member relationships are ternary, the age relationship binary.  On this database, the property operations work 
as follows:
GetPList[John Smith, memberOf] would return the set {Acme Company, State University}.
GetP[John Smith, ageIs] (or GetPList) would return 34. 
SetP[John Smith, memberOf, Foo Family] would create a new member relationship specifying John to be a member of the Foo Family.  
SetPList would do the same, but would destroy the two existing member relationships referencing John.  In either case, the memberAs 
attribute would be left undefined in the new relationship. 
SetP[John Smith, ageIs, 35], where ageOf is a key of the age relation, would delete the relationship specifying John's age to be 
34, and insert a new one specifying John's age to be 35.  Note that SetP acted differently than on the member relation because 
memberIs is not a key. 
Again, the property operations are simply a convenience, although they provide a different perspective on the data model by 
allowing an entity-based view of a database. 
Translucent attributes
Some database application programs may not wish to be concerned with whether an attribute is entity-valued, string-valued, or 
integer-valued.  They might prefer to have all values mapped to some common denominator, e.g. a string.  An example would be a 
program that is simply displaying tuples on the screen.
Another class of applications would like to be independent of whether a particular attribute is represented as an entity or datum 
value.  Consider the member relation in the previous figure.  If we choose to define an Organization domain, then the memberOf 
attribute is entity-valued; but instead we might choose to make the memberOf attribute be string-valued, merely giving the name of 
the organization without defining organizations as entities.  This might be appropriate, for example, if we did not wish to invoke 
the type checking on uniqueness of names and the correctness of entity types.  We would like to write programs that are independent 
of whether an attribute is string-valued or entity-valued (as in the Relational data model).
We introduce translucent attributes to avoid dependence on attribute types.  Any attribute may be treated as a translucent 
attribute, by using the GetFS and SetFS operations to retrieve or assign its value.
GetFS[relationship, attribute] is identical to the GetF operation, except it returns a string regardless of the attribute's type.  
If the attribute is datum-valued, e.g. an integer or boolean, it is converted to a string equivalent.  If the attribute is 
entity-valued, the name of the entity is returned.
SetFS[relationship, attribute, value] performs the inverse mapping.  If the attribute is datum-valued, e.g. an integer or boolean, 
a string equivalent is accepted.  If the attribute is entity-valued, the name of the entity is passed to SetFS.  If an entity with 
the given name does not exist in the domain that is the attribute's type, then one is automatically created.
Changing entity names
Another convenience operation is provided on entities to change an entity's name:  ChangeName[entity, new name].  This operation is 
semantically equivalent to destroying the given entity and creating a new one with the new name, participating in the same 
relationships that the old one did.  See the description of ChangeName in Section 3.4 for precise semantics in our implementation, 
however. 
2.7 Normalization
A few comments on relational normalization are included here for users who must do their own data schema design.  Others may skip 
to the next section.
A relation is normalized by breaking it into two or more relations of lower order (fewer attributes) to eliminate undesirable 
dependencies between the attributes.  For example, one could define a "publication" relation with three attributes:
Publication:
Person    Book    Date
George    Backgammon for Beginners    1978
John    Backgammon for Beginners    1978
Mary    How to Play Chess    1981
Mary    How to Cheat at Chess    1982
This relation represents the fact that John and George wrote a book together entitled "Backgammon for Beginners," published in 
1978, and Mary wrote two books on the subject of chess, in 1981 and 1982.  Alternatively, we could encode the same information in 
two relations, an author relation and a publication-date relation:
Author:
Person    Book
George    Backgammon for Beginners
John    Backgammon for Beginners
Mary    How to Play Chess
Mary    How to Cheat at Chess
Publication-date:
Book            Date
Backgammon for Beginners    1978
How to Play Chess        1981
How to Cheat at Chess    1982
Although the second two relations may seem more verbose than the first one, they are actually representationally better in some 
sense, because the publication dates of books are not represented redundantly.  If one wants to change the publication date of 
"Backgrammon for Beginners" to 1979, for example, it need only be changed in one place in the publication-date relation but in two 
places in the publication relation.  If the date were changed in only one place in the publication relation, the database would 
become inconsistent.  This kind of behavior is called an update anomaly.  The second two relations are said to be a normalized form 
(as it happens, third normal form) of the first relation, and thereby avoid this particular kind of update anomaly. 
Relational normalization is not strictly part of the Cypress data model.  However the model's operations (and the tools we will 
develop in the implementation) encourage what we will call functionally irreducible form, in which relations are of the smallest 
order that is naturally meaningful. 
A relation is in irreducible form if it is of the smallest order possible without introducing new artificial domain(s) not 
otherwise desired (all relations can be reduced to binary by introducing artificial domains).  We will allow a slight weakening of 
irreducible form, functionally irreducible form, which permits combining two or more irreducible relations only when their 
semantics are mutually dependent (and therefore all present or absent in our world representation).  For example, a birthday 
relation between a person, month, day, and year can be combined instead of using three relations.  Another example would be an 
address relation between a person, street, city, and zip code.  Combining an age and phone relation would not result in 
functionally irreducible form, however, as their semantics are not mutually dependent. 
The functionally irreducible relations seen by the user are independent of the physical representation chosen by the system for 
efficiency, so we are concerned only with the logical data access.  Note that in addition to avoiding update anomalies, 
functionally irreducible form provides a one-to-one correspondence between the relationships in the database and the atomic facts 
they represent, a canonical form that is in some sense more natural than any other form. 
2.8 Segments
We would like a mechanism to divide up large databases, to provide different perspectives or subsets of the data to different users 
or application programs.  In this section we discuss a mechanism to provide this separation:  segments.  A segment is a set of 
entities and relationships that a database client chooses to treat as one logical and physical part of a database.
In introducing segments, we will slightly change the definition of an entity, previously defined to be uniquely determined by its 
domain and name.  We will treat entities with the same name and domain in different segments as different entities, although they 
may represent the same external entity.  The unique identifier of an internal entity is now the triple
[segment, domain, name]. 
A consequence of this redefinition of entities is that relations and domains do not span segments, either.  Application programs 
must maintain any desired correspondence between entities, domains, or relations with the same name in different segments.  We will 
return to this later.  In the next section, we will discuss a more powerful but more complex and expensive mechanism, augments, in 
which the database system itself maintains the correspondence.
We introduce three new operations to the data model in conjunction with segments:
DeclareSegment[segment, file]:  opens a segment with the given name, whose data is stored in the given file.
GetSegments[] returns a list of all the segments which have been opened.
SegmentOf[entity or relationship] returns the segment in which a given entity or relationship exists.  It may also be applied to 
relations or domains, since they are entities.
With the addition of segments to the data model, we redefine the semantics of the basic access operations as follows:
1.   DeclareDomain and DeclareRelation take an additional argument, namely the segment in which the defined domain or relation will 
reside.  The entity representing a domain or relation now represents data in a particular segment.
2.   DeclareEntity and DeclareRelationship are unaffected:  they implicitly refer to the segment in which the respective domain or 
relation was defined.  By associating a segment (and therefore a transaction and underlying file) with each relation or domain 
entity returned to the database client, we conveniently obviate the need for additional arguments to every invocation of the basic 
operations in the data model.
3.   DestroyEntity, DestroyRelationship, GetF, SetF, DomainOf, RelationOf, and Eq are similarly unaffected:  they deal with 
entities and relationships in whatever segment they are defined.  Note that by our definition, entities in different segments are 
never Eq.  Also note that nothing in our definition makes a SetF across a segment boundary illegal (i.e.  SetF[relationship, 
attribute, entity] where the relationship and entity are in different segments).  Our current implementation requires that special 
procedures GetFR and SetFR be used on attributes that can cross segment boundaries, see Section 3.
4.   DomainSubset and RelationSubset are unchanged when applied to client-defined domains or relations, i.e., they enumerate only 
in the segment in which the relation or domain was declared.  However an optional argument may be used when applied to one of the 
system domains or relations (e.g. the Domain domain), allowing enumeration over a specific segment or all segments.  
RelationSubset's attribute-value-list arguments implictly indicate the appropriate segment even for system relations, so a segment 
is not normally needed unless the entire relation is enumerated.
Note that the data in a segment is stored in an underlying file physically independent from other segments, perhaps on another 
machine.  Introducing a file system into the conceptual data model may seem like an odd transgression at this point.  From a 
practical point of view, however, we believe it better to view certain problems at the level of file systems.  This point of view 
allows segments to be used for the following purposes:
1.   Physical independence:  Different database applications typically define their data in separate segments.  As a result one 
application can continue to operate although the data for another has been logically or physically damaged.  One application can 
entirely rebuild its database without affecting another, or an application can continue to operate in a degraded mode missing data 
in an unavailable segment.
2.   Logical independence:  Different database applications may have information which pertains to the same external entity, e.g. a 
person with a particular social security number.  When one application performs a DestroyEntity operation, however, we would like 
the entity to disappear only from that application's point of view.  Information maintained by other applications should remain 
unchanged. 
3.   Protection:  Clients can trust the protection provided by a file system more easily than a complex logical protection 
mechanism provided by the database system.  An even higher assurance of protection can be achieved by physical isolation of the 
segment at a particular computer site.  A more complex logical protection mechanism would be desirable for some purposes, but was 
deemed beyond the scope of Cypress.
4.   Performance:  Data may be distributed to sites where they are most frequently used.  For example, personal data may reside on 
a client's machine while publicly accessed data reside on a file server.  If the file system provides replication, it can be used 
to improve performance for commonly accessed data.
Concurrency control is handled by the file system. 
As noted earlier, information about an external entity may be distributed over multiple segments.  One or more database 
applications may cooperate in maintaining the illusion that entities, domains, and relations span segment boundaries.  This 
illusion may be used in at least two ways:
1.   Private additions may be added to a public segment by adding entities or relationships in a private segment.  The new 
relationships may reference entities in the public segment by creating representative entities with the same name in the private 
segment.  An example would be personal phone numbers and addresses added to a public database of phone numbers and addresses:  an 
application program would make the two segments appear to the user as one database.
2.   If two applications use separate segments A and B, they may safely reference each other's data yet remain physically 
independent.  One of the applications may destroy and reconstruct its segment if it uses the same unique names for its entities.  
If both applications have relationships referencing an entity e, and application A does a DestroyEntity operation on e, the entity 
and relationships referencing it disappear from application A's point of view, but application B's representative entity and 
relationships remain.
3.  Model Level Interface
MODEL LEVEL INTERFACE
MODEL LEVEL INTERFACE
We now describe the Cedar interface to the implementation of the Cypress data model.  We assume that the reader is familiar with 
the basic conceptual data model, i.e., has read the previous section.  Our presentation is therefore slightly different in this 
section:  we describe the procedures in the database interface in roughly the order that a client will want to use them in a 
program.  We present types and initialization, schema definition, the basic operations, and then queries. 
It should be emphasized that the interface we are about to describe is only one possible implementation of the abstract data model 
described in Section 2.  For example, we have chosen to implement a procedural interface called by Cedar programs, and to do type 
checking at run-time.
3.1 Types
In this subsection we describe the most important types in the interface.  Less pervasive types are treated at the point where they 
are first used.
Entity: TYPE;
Relship: TYPE;
An Entity or Relship is not the actual database entity or relationship; they are handles for the actual database objects.  All 
accesses to database objects are performed by calling interface procedures with the handles as parameters.  Even comparisons of two 
entities for equality must be done in this way.  The Entity and Relship handles are allocated from storage and automatically freed 
by the garbage collector when no longer needed.
Value: TYPE = REF ANY;
ValueType: TYPE;
Datatype: TYPE;
RopeType, IntType, BoolType, AnyDomainType: DataType;
Storing Cedar data values in tuples presents several problems.  First, since we would like to define a single operation to store a 
new value into a specified attribute of a Relship (for instance), there must be a single type for all values that pass through this 
"store-value" procedure.  This is the type Value above, represented as untyped REFs in Cedar.  The DataTypes will be discussed in 
the next section.  Entities, strings, integers, and booleans are the types of values the system currently recognizes and allows as 
attribute values.  More precisely, these four types are Entity, ROPE, REF INT, and REF BOOL.  In the case of an entity-valued 
attribute, an attribute's type may be AnyDomainType or a specific domain may be specified.  The latter is highly preferred, as 
AnyDomainType is a loophole in the type mechanism and limits the kinds of operations that can be performed automatically by the 
database system or associated tools.  We currently provide no mechanism to store compound Cedar data structures such as arrays, 
lists, or records in a database; the database system's data structuring mechanisms should be used instead.  Cypress query 
operations such as RelationSubset cannot be composed upon data that appears as uninterpreted bits in the database.
Note that a Value may be either an Entity or a Datum.  Some operations accept any Value, e.g. SetF; others require an Entity, e.g. 
NameOf.  Others may require an Entity from a particular client-defined domain, e.g. a Person.  We might think of the hierarchy of 
built-in and client defined types and instances of values like this:
Value type hierarchy    Database representative of type
Value (REF ANY)            ValueType
Datum               DatumType
ROPE                RopeType
INT                IntType
BOOL                BoolType
Entity                AnyDomainType
person Entity            Person domain
employee Entity           Employee domain
... other client-defined entities ...       ... other client-defined domains ...
As Cedar doesn't have a good mechanism for defining type hierarchies or new types for client-defined domains, most Cypress 
operations simply take a REF ANY or an Entity as argument, performing further type checking at run-time.
3.2 Transactions and segments
In this section we describe the basic operations to start up a database application's interaction with Cypress.  The client 
application's data is stored in one or more segments, accessed under transactions.  The Cypress system currently runs on the same 
machine as the client program, however transactions are implemented by the underlying file system which may reside on another 
machine.  Data in remote segments may therefore be concurrently accessed by other instances of Cypress on other client machines.
A transaction is a sequence of read and write commands.  The system supports the property that the entire sequence of commands 
executes atomically with respect to all other data retrieval and updates, that is, the transaction executes as if no other 
transactions were in progress at the same time.  Because there may in fact be other transactions accessing the same data at the 
same time, it is possible that two transactions may deadlock, in which case one of them must be aborted.   So the price paid for 
concurrent access is that programs be prepared to retry aborted transactions.
The database system provides the capability of accessing a database stored on the same machine as the database client, using the 
Pilot file system or on Alpine file servers.  We currently permit only one transaction per segment per instance of the database 
software on a client machine.  That is, data in remote segments may concurrently be updated by application programs under separate 
transactions, but on the same machine transactions are used simply to make application transactions on their respective segments 
independent.  This transaction-per-segment scheme is a major simplification of the Cypress package.  In addition, as we shall see 
presently, nearly all Cypress procedures can automatically infer the appropriate segment and transaction from the procedure 
arguments, avoiding the need to pass the transaction or segment for every database operation. 
Calls to Initialize, DeclareSegment, and OpenTransaction start the database session.  A transaction is either passed in by the 
client, or created by the database package (the latter is just a convenience feature).  The operation MarkTransaction below forms 
the end of a database transaction and the start of a new one.  The operation AbortTransaction may be used to abort a transaction.  
Data in a database segment may not be read or updated until the segment and transaction have been opened.  Clients must decide when 
to tell the system that a transaction is complete (with CloseTransaction), and must be prepared to deal with unsolicited 
notification that the current transaction has been aborted because of system failure or lock conflict.
The client's interaction with the database system begins with a call to Initialize:
Initialize: PROC[
 nCachePages: CARDINAL_ 256,
 nFreeTuples: CARDINAL_ 32,
 cacheFileName: ROPE_ NIL ];
Initialize initializes the database system and sets various system parameters:  nCachePages tells the system how many pages of 
database to keep in virtual memory on the client's machine, nFreeTuples specifies the size to use for the internal free list of 
Entity and Relship handles, and cacheFileName is the name of the disk file used for the cache backing store.  Any or all of these 
may be omitted in the call; they will be given default values.  Initialize should be called before any other operation; the schema 
declaration operations generate the error DatabaseNotInitialized if this is violated.
Before database operations may be invoked, the client must open the segment(s) in which the data are stored.  The location of the 
segment is specified by using the full path name of the file, e.g. "[MachineName]<Directory>SubDirectory>SegmentName.segment".  
Each segment has a unique name, the name of a Cedar ATOM which is used to refer to it in Cypress operation.  The name of the Cedar 
ATOM is normally, though not necessarily, the same as that of the file in which it is stored, except the extension ".segment" and 
the prefix specifying the location of the file is omitted in the ATOM.  If the file is on the local file system, its name is 
preceded by "[Local]".  For example,  "[Local]Foo" refers to a segment file on the local disk named Foo.database; 
"[Alpine]<CedarDB>Baz" refers to a segment named Baz.segment on the <CedarDB> directory on the Alpine server.  It is generally a 
bad idea to access database segments other than through the database interface.  However, because segments are physically 
independent and contain no references to other files by file identifier or explicit addresses within files, the segment files may 
be moved from machine to machine or renamed without effect on their contents.  If a segment file in a set of segments comprising a 
client database is deleted, the others may still be opened to produce a database missing only that segment's entities and 
relationships.  A segment is defined by the operation DeclareSegment:
DeclareSegment: PROC[ 
  filePath: ROPE, segment: Segment, number: INT_ 0, 
  readOnly: BOOL_ FALSE, version: Version_ OldOnly,
  nBytesInitial, nBytesPerExtent: LONG CARDINAL_ 32768]
  RETURNS [Segment];
Segment: TYPE = ATOM;
Version: TYPE = {NewOnly, OldOnly, NewOrOld};
The version parameter to DeclareSegment defaults to OldOnly to open an existing file.  The signal IllegalFileName is generated if 
the directory or machine name is missing from fileName, and  FileNotFound is generated at the time a transaction is opened on the 
segment if the file does not exist.  If version NewOnly is passed, a new segment file will be created, erasing any existing one.  
In this case, a number assigned to the segment by the database administrator must also be passed.  This number hack is necessitated 
by our current implementation of segments (it specifies the section of the database address space in which to map this segment).  
Please bear with us.  Finally, the client program can pass version=NewOrOld to open a new or existing segment file; in this case 
the segment number must also be passed, of course.
The other parameters to DeclareSegment specify properties of the segment.  If readOnly=TRUE, then writes are not permitted on the 
segment; any attempt to invoke a procedure which modifies data will generate the error ProtectionViolation.  nBytesInitial is the 
initial size to assign to the segment, and nBytesPerExtent is the incremental increase in segment size used when more space is 
required for data in the file.
For convenience, a call is available to return the list of segments that have been declared in the current Cypress session:  
GetSegments: PROC RETURNS[LIST OF Segment ];
A transaction is associated with a segment by using OpenTransaction:  
OpenTransaction: PROC[
  segment: Segment,
  userName, password: ROPE_ NIL,
  useTrans: Transaction_ NIL ];
If useTrans is NIL then OpenTransaction establishes a new connection and transaction with the corresponding (local or remote) file 
system.  Otherwise it uses the supplied transaction.  The same transaction may be associated with more than one segment by calling 
OpenTransaction with the same useTrans argument for each.  The given user name and password, or by default the logged in user, will 
be used if a new connection must be established.
Any database operations upon data in a segment before a transaction is opened or after a transaction abort will invoke the Aborted 
signal.  The client should catch this signal on a transaction abort, block any further database operations and wait for completion 
of any existing ones.  Then the client may re-open the aborted transaction by calling OpenTransaction.  When the remote transaction 
is successfully re-opened, the client's database operations may resume.
Note that operations on data in segments under different transactions are independent.  Normally there will be one transaction (and 
one or more segments) per database application program.  A client may find what transaction has been associated with a particular 
segment by calling
TransactionOf: PROC [segment: Segment] RETURNS [Transaction];
Transactions may be manipulated by the following procedures:
MarkTransaction: PROC[trans: Transaction];
AbortTransaction: PROC [trans: Transaction];
CloseTransaction: PROC [trans: Transaction];
MarkTransaction commits the current database transaction, and immediately starts a new one.  User variables which reference 
database entities or relationships are still valid.
AbortTransaction aborts the current database transaction.  The effect on the data in segments associated with the segment is as if 
the transactions had never been started, the state is as it was just after the OpenTransaction call or the most recent 
MarkTransaction call.  Any attempts to use variables referencing data fetched under the transaction will invoke the 
NullifiedArgument error.  A call to OpenTransaction is necessary to do more database operations, and all user variables referencing 
database items created or retrieved under the corresponding transaction must be re-initialized (they may reference entities or 
relationships that no longer exist, and in any case they are marked invalid by the database system).
A simple client program using the database system might have the form, then:
Initialize[];
DeclareSegment["[Local]Test", $Test];
OpenTransaction[$Test];
...
... database operations, including zero or more MarkTransaction calls ...
...
CloseTransaction[TransactionOf[$Test]];
3.3 Data schema definition
The definition of the client's data schema is done through calls to procedures defined in this section.  The data schema is 
represented in a database as entities and relationships, and although updates to the schema must go through these procedures to 
check for illegal or inconsistent definitions, the schema can be read via the normal data operations described in the next section. 
 Each domain, relation, etc., has an entity representative that is used in data operations which refer to that schema item.  For 
example, we pass the domain entity when creating a new entity in the domain.  The types of schema items are:
Domain, Relation, Attribute, Datatype, Index, IndexFactor: TYPE = Entity;
Of course, since the schema items are entities, they must also belong to domains; there are pre-defined domains, which we call 
system domains, in the interface for each type of schema entity:
DomainDomain, RelationDomain, AttributeDomain, DatatypeDomain, IndexDomain: Domain;
There are also pre-defined system relations, which contain information about sub-domains, attributes, and indices.  Since these are 
not required by the typical (application-specific) database client, we defer the description of the system relations to Section 
3.6. 
In general, any of the data schema may be extended or changed at any time; i.e., data operations and data schema definition may be 
intermixed.  However, there are a few specific ordering constraints on schema definition we will note shortly.  Also, the database 
system optimizes for better performance if the entire schema is defined before any data are entered.  The interactive schema 
editing tool described in the database tools documentation allows the schema to be changed regardless of ordering constraints and 
existing data, by recreating schema items and copying data invisibly to the user when necessary.
All the data schema definition operations take a Version parameter which specifies whether the schema element is a new or existing 
one.  The version defaults to allowing either (NewOrOld):  i.e., the existing entity is returned if it exists, otherwise it is 
created.  This feature avoids separate application code for creating the database schema the first time the application program is 
run.   
DeclareDomain: PROC [name: ROPE, segment: Segment,
  version: Version_ NewOrOld, estRelations: INT_ 5] RETURNS [d: Domain];
DeclareSubType: PROC[sub, super: Domain];
DeclareDomain defines a domain with the given name in the given segment and returns its representative entity.  If the domain 
already exists and version=NewOnly, the signal AlreadyExists is generated.  If the domain does not already exist and 
version=OldOnly, then NIL is returned.  The parameter estRelations is used to estimate the largest number of relations in which 
entities of this domain are expected to participate.
The client may define one domain to be a subtype of another by calling DeclareSubType.  This permits entities of the subdomain to 
participate in any relations in which entities of the superdomains may participate.  All client DeclareSubType calls should be done 
before declaring relations on the superdomains (to allow some optimizations).  The error MismatchedSegment is generated if the 
sub-domain and super-domain are not in the same segment.
DeclareRelation: PROC [
  name: ROPE, segment: Segment, version: Version_ NewOrOld] RETURNS [r: Relation];
DeclareAttribute: PROC [
  r: Relation, name: ROPE, type: ValueType_ NIL,
  uniqueness: Uniqueness _ None,  length: INT_ 0,
  link: {Linked, Unlinked, Colocated, Remote}_ yes, version: Version_ NewOrOld]
  RETURNS[a: Attribute];
Uniqueness: TYPE = {NonKey, Key, KeyPart, OptionalKey};
DeclareRelation defines a new or existing relation with the given name in the given segment and returns its representative entity.  
If the relation already exists and version=NewOnly, the signal AlreadyExists is generated.  If the relation does not already exist 
and version=OldOnly, then NIL is returned.
DeclareAttribute is called once for each attribute of the relation, to define their names, types, and uniqueness.  If 
version=NewOrOld and the attribute already exists, Cypress checks that the new type, uniqueness, etc. match the existing attribute. 
 The error MismatchedExistingAttribute is generated if there is a discrepancy.  The attribute name need only be unique in the 
context of its relation, not over all attributes.  Note this is the only exception to the data model's rule that names be unique in 
a domain.  Also note that we could dispense with DeclareAttribute altogether by passing a list into the DeclareRelation operation; 
we define a separate procedure for programming convenience.
The attribute type should be a ValueType, i.e. it may be one of the pre-defined types (IntType, RopeType, BoolType, AnyDomainType) 
or the entity representative for a domain.  For pre-defined types, the actual values assigned to attributes of the relationship 
instances of the relation must have the corresponding type:  REF INT, ROPE, REF BOOL, or Entity.  If the attribute has a domain as 
type, the attribute values in relationships must be entities of that domain or some sub-domain thereof.  The type is permitted to 
be one of the pre-defined system domains such as the DomainDomain, thereby allowing client-defined extensions to the data schema 
(for example, a comment for each domain describing its purpose). 
The attribute uniqueness indicates whether the attribute is a key of the relation.   If its uniqueness is NonKey, then the 
attribute is not a key of the relation.  If its uniqueness is OptionalKey, then the system will ensure that no two relationships in 
r have the same value for this attribute (if a value has been assigned).  The error NonUniqueKeyValue is generated if a non-unique 
key value results from a call to the SetP, SetF, SetFS, or CreateRelship procedures we define later.  Key acts the same as 
OptionalKey, except that in addition to requiring that no two relationships in r have the same value for the attribute, it requires 
that every entity in the domain referenced by this attribute must be referenced by a relationship in the relation:  the 
relationships in the relation and the entities in the domain are in one-to-one correspondence.  Finally, if an attribute's 
uniqueness is KeyPart, then the system will ensure that no two relationships in r have the same value for all key attributes of r, 
though two may have the same values for some subset of them.
The length and link arguments to DeclareAttribute have no functional effect on the attribute, but are hints to the database system 
implementation.  For RopeType fields, length characters will be allocated for the string within the space allocated for a 
relationship in the database.  There is no upper limit on the size of a string-valued attribute; if it is longer than length, it 
will be stored separately from the relationship with no visible effect except for the performance of database applications.  The 
link field is used only for entity-valued fields; it suggests whether the database system should link together relationships which 
reference an entity in this attribute.  In addition, it can suggest that the relationships referencing an entity in this attribute 
be physically co-located as well as linked.  Again, its logical effect is only upon performance, not upon the legal operations.
DestroyRelation: PROC[r: Relation];
DestroyDomain: PROC[d: Domain];
DestroySubType: PROC[sub, super: Domain];
Relations, domains, and subdomain relationships may be destroyed by calls to the above procedures.  Destroying a relation destroys 
all of it relationships.  Destroying a domain destroys all of its entities and also any relationships which reference those 
entities.  Destroying a sub-domain relationship has no effect on existing domains or their entities; it simply makes entities of 
domain sub no longer eligible to participate in the relations in which entities of domain super can participate.  Existing 
relationships violating the new type structure are allowed to remain.  Existing relations and domains may only be modified by 
destroying them with the procedures above, with one exception:  the operation ChangeName (described in Section 3.4) may be used to 
change the name of a relation or domain.
DeclareIndex: PROC [
  relation: Relation, indexedAttributes: AttributeList, version: Version];
DeclareIndex has no logical effect on the database; it is a performance hint, telling the database system to create a B-Tree index 
on the given relation for the given indexedAttributes.  The index will be used to process queries more efficiently.  Each index key 
consists of the concatenated values of the indexedAttributes in the relationship the index key references.  For entity-valued 
attributes, the value used in the key is the string name of the entity.  The version parameter may be used as in other schema 
definition procedures, to indicate a new or existing index.  If any of the attributes are not attributes of the given relation then 
the signal IllegalIndex is generated.
The optimal use of indices, links, and colocation, as defined by DeclareIndex and DeclareAttribute, is complex.  It may be 
necessary to do some space and time analysis of a database application to choose the best trade-off, and a better trade-off may 
later be found as a result of unanticipated access patterns.  Note, however, that a database may be rebuilt with different links, 
colocation, or indices, and thanks to the data independence our interface provides, existing programs will continue to work without 
change.
If a relation is expected to be very small (less than 100 relationships), then it might reasonably be defined with neither links 
nor indices on its attributes.  In the typical case of a larger relation, one should examine the typical access paths:  links are 
most appropriate if relationships that pertain to particular entities are involved, indices are more useful if sorting or range 
queries are desired.
B-tree indices are always maintained for domains; that is, an index contains entries for all of the entities in a domain, keyed by 
their name, so that sorting or lookup by entity name is quick.  String comparisons are performed in the usual lexicographic fashion.
DeclareProperty: PROC [
  relationName: ROPE, of: Domain, type: ValueType,
  uniqueness: Uniqueness_ None, version: Version_ NewOrOld]
  RETURNS [property: Attribute];
DeclareProperty provides a shorthand for definition of a binary relation between entities of the domain "of" and values of the 
specified type.  The definitions of type and uniqueness are the same as for DeclareAttribute.  A new relation relationName is 
created, and its attributes are given the names "of" and "is".  The "is" attribute is returned, so that it can be used to represent 
the property in GetP and SetP defined in the next section.
3.4 Basic operations on entities and relationships
In this section, we describe the basic operations on entities and relationships; we defer the operations on domains and relations 
to the next section.
A number of error conditions are common to all of the procedures in this section.  Since values are represented as REF ANYs, all 
type checking must currently be done at run-time.  The procedures in this section indicate illegal arguments by generating the 
errors IllegalAttribute, IllegalDomain, IllegalRelation, IllegalValue, IllegalEntity, and IllegalRelship, according to the type of 
argument expected.  The error NILArgument is generated if NIL is passed to any procedure that cannot accept NIL for that argument. 
The error NullifiedArgument is generated if an entity or relationship is passed in after it has been deleted or rendered invalid by 
transaction abort or close. 
DeclareEntity:  PROC[
  d: Domain, name: ROPE_ NIL, version: Version_ NewOrOld]
  RETURNS [e: Entity];
DeclareEntity finds or creates an entity in domain d with the given name.  The name may be omitted if desired, in which case an 
entity with a unique name is automatically created.  If version is OldOnly and an entity with the given name does not exist, NIL is 
returned.  If version is NewOnly and an entity with the given name already exists, the signal NonUniqueEntityName is generated.
DeclareRelship: PROC [
  r: Relation, avl: AttributeValueList_ NIL, version: Version_ NewOrOld] 
  RETURNS [Relship];
DeclareRelship finds or creates a relship in r with the given attribute values.  If version is NewOnly, a new relship with the 
given attribute values is generated.  If version is OldOnly, the relship in r with the given attribute values is returned if it 
exists, otherwise NIL is returned.  If version is NewOrOld, the relship with the given attribute values is returned if it exists, 
otherwise one is created.  If the creation of a new relship violates the key constraints specified by DeclareAttribute, the signal 
NonUniqueAttributeValue is generated.
DestroyEntity: PROC[e: Entity];
DestroyEntity removes e from its domain, destroys all relationships referencing it, and destroys the entity representative itself.  
Any client variables that reference the entity automatically take on the null value (Null[e] returns TRUE), and cause error 
NullifiedArgument if passed to database system procedures.  After an entity is destroyed, its old name may be re-used in creating a 
new one.
DestroyRelship: PROC[t: Relship];
DestroyRelship removes t from its relation, and destroys it.  Any client variables that reference the relationship automatically 
take on the null value, and will cause error NullifiedArgument if subsequently passed to database system procedures.
SetF: PROC[t: Relship, a: Attribute, v: Value];
SetF assigns the value v to attribute a of relationship t.  If the value is not of the same type as the attribute (or a subtype 
thereof if the attribute is entity-valued), then the error MismatchedAttributeValueType is generated.  If a is not an attribute of 
t's relation, IllegalAttribute is generated.
GetF: PROC[t: Relship, a: Attribute] RETURNS [Value];
GetF retrieves the value of attribute a of relationship t.  If a is not an attribute of t's relation, error IllegalAttribute is 
generated.  The client should use the V2x routines described in the next section to coerce the value into the expected type.
SetFS: PROC [t: Relship, a: Attribute, v: ROPE];
GetFS: PROC[t: Relship, a: Attribute] RETURNS [ROPE];
GetFS and SetFS provide a convenient veneer on top of GetF and SetF that provide the illusion that all relation attributes are 
string-valued.  The effect is something like the Relational data model, and is useful for applications such as a relation displayer 
and editor that deal only with strings.  The semantics of GetFS and SetFS depend upon the actual type of the value v of attribute a:
type        GetFS returns    SetFS assigns attribute to be
RopeType            the string v        the string v
IntType              v as a decimal string    the string as a decimal integer
BoolType            "TRUE" or "FALSE"    true if "TRUE", false if "FALSE"
a domain D          the entity name        the entity with name v
            ("" if v = NIL)         (or NIL if v is null string)
AnyDomainType   <domain-name>:<entity-name> the entity with given domain, name
                (or NIL if v is null string)
The same signals generated by GetF and SetF, such as IllegalAttribute, can also be generated by these procedures.  The string NIL 
represents the undefined value.  The signal NotFound is generated in the last case above if no entity with the given name is found.
NameOf: PROC [e: Entity] RETURNS [s: ROPE];
ChangeName: PROC [e: Entity, s: ROPE];
NameOf and ChangeName retrieve or change the name of an entity, respectively.  They generate the signal IllegalEntity if e is not 
an entity.
ChangeName should be used with caution.  It is not quite equivalent to destroying and re-creating an entity with the new name but 
the same existing relationships referencing it.  ChangeName is considerably faster than that, and furthermore entity-valued 
variables which reference the entity are not nullified by ChangeName, though they would be by DestroyEntity.  These features should 
be a help, not a hindrance.  However, changing an entity name may invalidate references to the entity from outside the segment, 
e.g. in another segment or in some application-maintained file such as a log of updates.   
DomainOf: PROC[e: Entity] RETURNS [Domain];
RelationOf: PROC[t: Relship] RETURNS [Relation];
DomainOf and RelationOf can be used to find the entity representative of an entity's domain or a relationship's relation, 
respectively.  The signal IllegalEntity is generated if e is not an entity.  The signal IllegalRelship is generated if r is not a 
relationship.
SegmentOf: PROC[e: Entity] RETURNS [Segment];
SegmentOf returns the segment in which an entity is stored.  It can be applied to domain, relation, or attribute entities.
Eq: PROC [e1: Entity, e2: Entity] RETURNS [BOOL];
Eq returns TRUE iff the same database entity is referenced by e1 and e2.  This is not equivalent to the Cedar expression "e1 = e2", 
which computes Cedar REF equality.  If e1 and e2 are in different segments, Eq returns true iff they have the same name and their 
domains have the same name.
Null: PROC [x: EntityOrRelship] RETURNS [BOOL];
Null returns TRUE iff its argument has been destroyed, is NIL, or has been invalidated by abortion of the transaction under which 
it was created.
GetP: PROC [e: Entity, aIs: Attribute, aOf: Attribute_ NIL] RETURNS [Value];
SetP: PROC [e: Entity, aIs: Attribute, v: Value, aOf: Attribute_ NIL] RETURNS[Relship];
GetP and SetP are convenience routines for a common use of relationships, to represent "properties" of entities.  Properties allow 
the client to think of values stored in relationships referencing an entity as if they are directly accessible fields (or 
"properties") of the entity itself.  See the figure on page 15 illustrating properties.  GetP finds a relationship whose from 
attribute is equal to e, and returns that relationship's to attribute.  The from attribute may be defaulted if the relation is 
binary, it is assumed to be the other attribute of to's relation.  If it is not binary, the current implementation will find the 
"first" other attribute, where "first" is defined by the order of the original calls calls to DeclareAttribute.  SetP defaults the 
from attribute similarly to GetP, but operates differently depending on whether from is a key of the relation.  Whether it is a key 
or not, any previous relationship that referenced e in the from attribute is automatically deleted.  In either case, a new 
relationship is created whose from attribute is e and whose to attribute is v.  SetP returns the relationship it creates for the 
convenience of the client.  GetP and SetP can generate the same errors as SetF and GetF, e.g. if e is not an entity or to is not an 
attribute.  In addition, GetP and SetP can generate the error IllegalProperty if to and from are from different relations.  GetP 
generates the error MismatchedPropertyCardinality if more than one relationship references e in the from attribute; if no such 
relationships exist, it returns a null value of the type of the to attribute.  SetP allows any number of existing relationships 
referencing e; it simply adds another one (when from is a key, of course, there will always be one relationship).  
GetPList: PROC [e: Entity, to: Attribute, from: Attribute_ NIL] RETURNS [LIST OF Value];
SetPList: PROC [e: Entity, to: Attribute, vl: LIST OF Value, from: Attribute_ NIL];
GetPList and SetPList are similar to GetP and SetP, but they assume that any number of relationships may reference the entity e 
with their from attribute.  They generate the signal MismatchedPropertyCardinality if this is not true, i.e. the from attribute is 
a key.  GetPList returns the list of values of the to attributes of the relationships that reference e with their from attribute.  
Cedar has LISP-like list manipulation facilities.  SetPList destroys any existing relationships that reference e with their from 
attribute, and creates Length[vl] new ones, whose from attributes reference e and whose to attributes are the elements of vl.  
GetPList and SetPList may generate any of the errors that GetF and SetF may generate, and the error IllegalProperty if to and from 
are from different relations.
Note that the semantics of SetPList are not quite consistent with the semantics of SetP.  SetPList replaces the current values 
associated with a "property" with the new values (i.e., destroys and re-creates relationships); SetP adds a new property value, 
unless the aOf attribute is a key, in which case it replaces the current value.  The semantics are defined in this way because this 
has proven the most convenient in our application programs.
Examples of the use of the property procedures for data access can be found in Section 4.3.  Properties are also useful for 
obtaining information about the data schema.  For example, GetP[a, aRelationIs] will return the attribute a's relation, and 
GetPList[d, aTypeOf] will return all the attributes that can reference domain d.
E2V: PROC[e: Entity] RETURNS[v: Value];
B2V: PROC[b: BOOLEAN] RETURNS[v: Value];
I2V: PROC[i: LONG INTEGER] RETURNS[v: Value];
S2V: PROC[s: ROPE] RETURNS[v: Value];
The x2V routines convert the various Cedar types to Values.  The conversion is not normally required for ropes and entities since 
the compiler will widen these into the REF ANY type Value.
V2E: PROC[v: Value] RETURNS[Entity];
V2B: PROC[v: Value] RETURNS[BOOLEAN];
V2I: PROC [v: Value] RETURNS[LONG INTEGER];
V2S: PROC [v: Value] RETURNS[ROPE];
The V2x routines convert Values to the various Cedar types.  The MismatchedValueType error is raised if the value is of the wrong 
type.  It is recommended that these routines be used rather than user-written NARROWs, as the representation of Values may change.  
Also, NARROWs of opaque types don't yet work in the Cedar compiler.
3.5 Query operations on domains and relations
In this section we describe queries upon domains and relations:  operations that enumerate entities or relationships satisfying 
some constraint.
RelationSubset: PROC[
  r: Relation, constraint: AttributeValueList_ NIL] 
  RETURNS [RelshipSet];
NextRelship: PROC[rs: RelshipSet] RETURNS [Relship];
PrevRelship: PROC[rs: RelshipSet] RETURNS [Relship];
ReleaseRelshipSet: PROC [rs: RelshipSet];
AttributeValueList: TYPE = LIST OF AttributeValue;
AttributeValue: TYPE = RECORD [
   attribute: Attribute,
   low: Value,
   high: Value_ NIL -- omitted where same as low or not applicable --];
The basic query operation is RelationSubset.  It returns a generator of all the relationships in relation r which satisfy a 
constraint list of attribute values.  The relationships are enumerated by calling NextRelship repeatedly; it returns NIL when there 
are no more relationships.  PrevRelship may similarly be called repeatedly to back the enumeration up, returning the previous 
relationship; it returns NIL if the enumeration is at the beginning.  ReleaseRelshipSet should be called when the client is 
finished with the query.
The constraint list may be NIL, in which case all of the relationships in r will be enumerated.  Otherwise, relationships which 
satisfy the concatenation of constraints on attributes in the list will be enumerated.  If an index exists on some subset of the 
attributes, the relationships will be enumerated sorted on the concatenated values of those attributes.  For a RopeType, IntType, 
or TimeType attribute a of r, the contraint list may contain a record of the form [a, b, c] where the attribute value must be 
greater or equal to b and less than or equal to c to satisfy the constraint.  For any type of attribute, the list may contain a 
record of the form [a, b] where the value of the attribute must exactly equal b.  The Cedar ROPE literals "" and "\377" may be used 
in queries as an infinitely large and infinitely small string, respectively.  The signal MismatchedAttributeValueType is generated 
by RelationSubset if one of the low or high values in the list is of a different type than its corresponding attribute.
DomainSubset: PROC[
  d: Domain,
  lowName, highName: ROPE_ NIL, 
  searchSubDomains: BOOL_ TRUE,
  searchSegment: Segment_ NIL]
  RETURNS [EntitySet];
NextEntity: PROC[EntitySet] RETURNS [Entity];
PrevEntity: PROC[EntitySet] RETURNS [Entity];
ReleaseEntitySet: PROC[EntitySet];
DomainSubset enumerates all the entities in a domain.  If lowName and highName are NIL, the entire domain is enumerated, in no 
particular order.  Otherwise, only those entities whose names are lexicographically greater than lowName and less than highName are 
enumerated, in lexicographic order.  If searchSubDomains is TRUE, subdomains of d are also enumerated.  Each subdomain is sorted 
separately.  The searchSegment argument is currently only used if d is one of the system domains, e.g. the Domain domain.  It is 
used to specify which segment to search.
Analogously to relation enumeration, NextEntity and PrevEntity may be used to enumerate the entities returned by DomainSubset, and 
ReleaseEntitySet should be called upon completion.
GetDomainRefAttributes: PROC [d: Domain] RETURNS [AttributeList];
This procedure returns a list of all attributes, of any relation defined in d's segment, which reference domain d or one of its 
superdomains.  The list does not include AnyDomainType attributes, which can reference any domain.  GetDomainRefAttributes is 
implemented via queries on the data schema.  GetDomainRefAttributes is useful for application-independent tools; most specific 
applications can code-in the relevant attributes.
GetEntityRefAttributes: PROC [e: Entity] RETURNS [AttributeList];
This procedure returns a list of all attributes in which some existing relationship actually references e, including AnyDomainType 
attributes.
3.6 System domains and relations
In this section we describe what one might call the schema schema, the pre-defined system domains and relations which constitute 
the data schema for client-defined domains and relations.  The typical database application writer may skip this section, since the 
schema declaration operations defined in Section 3.3 are adequate when the data schema is completely defined and known at the time 
a program is written.  The system domains and relations we describe in this section are most useful for general-purpose tools (e.g. 
for displaying, querying, or dumping any database), where the tools must examine the data schema "on the fly".
As noted earlier, the permanent repository for data describing user-defined data in a database is the database's data schema, 
represented by schema entities and relationships.  Schema entities are members of one of the pre-defined system domains:  
DomainDomain, RelationDomain, DatatypeDomain, and so on.  Every client-defined domain, relation, or attribute contains a 
representative entity in these domains.  Client-defined datatypes are not currently permitted, so the only entities in the DataType 
domain are the pre-defined IntType, RopeType, and BoolType.
The information about the client-defined domains and attributes are encoded by relationships in the database.  Domains participate 
in the system relation dSubType, which encodes a domain type hierarchy:
dSubType: Relation;
  dSubTypeOf: Attribute; -- the domain in this attribute is a super-type of
  dSubTypeIs: Attribute; -- the domain in this attribute
The dSubType has one element per direct domain-subdomain relationship, it does not contain the transitive closure of that relation. 
 However, it is guaranteed to contain no cycles.  That is, the database system checks that there is no set of domains d1, d2, ... 
dN, N>1, such that d1 is a subtype of d2, d2 is a subtype of d3, and so on to dN, and d1=dN.  The dSubType may define a lattice as 
opposed to a tree, i.e. the sSubType attribute is not a key of the relation.
The information about attributes is encoded as binary relations, one relation for each argument to the DeclareAttribute procedure 
defining properties of the attribute.  The names are easy to remember; for each argument, e.g. Foo, we define the aFoo relation, 
with attributes aFooOf and aFooIs.  The aFooIs attribute is the value of that argument, and the aFooOf attribute is [the entity 
representative of] the attribute it pertains to.  Thus we have the following relations:
  aRelation: PUBLIC READONLY Relation; -- Specifies attribute - relation correspondence:
    -- [aRelationOf: KEY Attribute, aRelationIs: Relation]
    aRelationOf: PUBLIC READONLY Attribute; -- attribute whose relation we are specifying
    aRelationIs: PUBLIC READONLY Attribute; -- the relation of that attribute

  aType: PUBLIC READONLY Relation; -- Specifies types of relation attributes:
    -- [aTypeOf: KEY Attribute, aTypeIs: ValueType]
    aTypeOf: PUBLIC READONLY Attribute; -- the attribute
    aTypeIs: PUBLIC READONLY Attribute; -- domain or datatype of the attribute

  aUniqueness: PUBLIC READONLY Relation; -- Specifies attribute value uniqueness:
   -- [aUniquenessOf: KEY Attribute, aUniquenessIs: INT LOOPHOLE[Uniqueness]]
    aUniquenessOf: PUBLIC READONLY Attribute; -- the attribute
    aUniquenessIs: PUBLIC READONLY Attribute; -- INT for Uniqueness: 0=None, 1=Key, etc.

  aLength: PUBLIC READONLY Relation; -- Specifies length of attributes:
    -- [aLengthOf: KEY Attribute, aLengthIs: INT]
    aLengthOf: PUBLIC READONLY Attribute; -- the attribute
    aLengthIs: PUBLIC READONLY Attribute;  -- INT corresponding to attribute's length

  aLink: PUBLIC READONLY Relation; -- Specifies whether attribute is linked:
    -- [aLinkOf: KEY Attribute, aLinkIs: INT]
    aLinkOf: PUBLIC READONLY Attribute; -- the attribute
    aLinkIs: PUBLIC READONLY Attribute;  -- 0=unlinked, 1=linked, 2 =colocated 
The final set of system relations pertain to index factors.  Each index on a relation is defined to include one or more attributes 
of a relation.  For each attribute in the index, there is an index factor entity.  For each index, there is an index entity.  Each 
index factor is associated with exactly one index and exactly one attribute.  Indices may have many index factors, however, and an 
attribute may be associated with more than one index factor, since attributes may participate in multiple indices.  The two 
relations pertaining to indices map indices on to their index factors, and index factors to the attributes they index:
  ifIndex: PUBLIC READONLY Relation; -- Specifies the index factors for each index
    -- [ifIndexOf: KEY IndexFactor, ifIndexIs: Index]
    ifIndexOf: PUBLIC READONLY Attribute;  -- the index factor
    ifIndexIs: PUBLIC READONLY Attribute;  -- index of the factor

  ifAttribute: PUBLIC READONLY Relation; -- Specifies attribute index factor corresponds to
    -- [ifAttributeOf: KEY IndexFactor, ifAttributeIs: Attribute]
    ifAttributeOf: PUBLIC READONLY Attribute; -- the index factor
    ifAttributeIs: PUBLIC READONLY Attribute; -- the attribute this factor represents
The relations on attributes, index factors, and domains can be queried with the RelationSubset or GetPList operations.  For 
example, GetP[a, aRelationIs] returns the attribute a's relation.  GetPList[r, aRelationOf] returns the relation r's attributes.  
RelationSubset[dSubType, LIST[[dSubTypeIs, d]]] will enumerate all the dSubType relationships in which d is the subtype.
As noted earlier, the data schema (attributes, relations, domains, indices, index factors, and relations pertaining to these) may 
only be read, not written by the database client.  In order to ensure the consistency of the schema, it must be written indirectly 
through the schema definition procedures:  DeclareDomain, DeclareRelation, DeclareAttribute, and DeclareSubType.  Attempts to 
perform updates through operations such as SetP result in the error ImplicitSchemaUpdate.
3.7 Errors
When a database system operation invokes an error, the SIGNAL Error is generated, with an error code indicating the type of error 
that occured.  The error code is a Cedar enumerated type:
Error: SIGNAL [code: ErrorCode];
ErrorCode: TYPE = {
AlreadyExists, -- Entity already exists and client said version=NewOnly
BadUserPassword, -- On an OpenTransaction
DatabaseNotInitialized, -- Attempt to do operation without calling Initialize
FileNotFound, -- No existing segment found with given name
IllegalAttribute, -- Attribute not of the given relship's Relation or not an attribute
IllegalValueType, -- Type passed DeclareAttribute is not datatype or domain
IllegalDomain, -- Argument is not actually a domain
IllegalFileName, -- No directory or machine given for segment
IllegalEntity, -- Argument to GetP, or etc., is not an Entity
IllegalRelship, -- Argument to GetF, or etc., is not a Relship
IllegalRelation, -- Argument is not a relation
IllegalSegment, -- Segment passed to DeclareDomain, or etc., not yet declared
IllegalString, -- Nulls not allowed in ROPEs passed to the database system
IllegalSuperType, -- Can't define subtype of domain that already has entities
IllegalValue, -- Value is not REF INT, ROPE, REF BOOL, or Entity
IllegalValueType, -- Type passed DeclareAttribute is not datatype or domain
ImplicitSchemaUpdate, -- Attempt to modify schema with SetP, DeclareEntity, etc.
InternalError, -- Impossible internal state (possibly bug or bad database)
MismatchedProperty, -- aOf and aIs attribute not from the same relation
MismatchedAttributeValueType, -- Value not same type as required (SetF)
MismatchedExistingAttribute, -- Existing attribute is different (DeclareAttribute)
MismatchedExistingSegment, -- Existing segment is different (DeclareSegment)
MismatchedPropertyCardinality, -- Did GetP with aOf that is not a Key
MismatchedSegment, -- Attempt to create ref across segment boundary (SetF)
MismatchedValueType, -- value passed V2E, V2I, etc. not of expected type
MultipleMatch, -- More than one relationship satisfied avl on DeclareRelship.
NonUniqueEntityName, -- Entity in domain with that name already exists
NonUniqueKeyValue, -- Relship already exists with that value
NotFound, -- Version is OldOnly but no such Entity, Relation, or etc found
NotImplemented, -- Action requested is not yet implemented
NILArgument, -- Attempt to perform operation on NIL argument
NullifiedArgument, -- Entity or relationship has been deleted or invalidated
ProtectionViolation, -- Read or write to segment not permitted this user.
SegmentNotDeclared, -- Attempt to open transaction w/o DeclareSegment
ServerNotFound -- File server does not exist or does not respond
  };
In this report, the expression "generates the error X" means that the SIGNAL Error is generated with code=X.  Unless otherwise 
specified, the client may CONTINUE from the signal, aborting the operation in question.  Signals should not be RESUMEd except by a 
wizard who knows the result of proceeding with an illegal operation.
Two special signals are associated with the file system level:
Aborted: SIGNAL [trans: Transaction];
Failure: SIGNAL [why: ATOM, server: ROPE];
The Aborted signal can be generated by any database operation, and indicates that the transaction has been aborted.  The client 
must call AbortTransaction and may then call OpenTransaction to proceed.  The Failure signal is generated when a transaction cannot 
be open due to server failure or communication difficulties.
4.  Application Example
APPLICATION EXAMPLE
APPLICATION EXAMPLE
This section provides a simple example of the use of Cypress.  Section 4.1 introduces the example, a database of documents.  
Section 4.2 is a discussion of database design:  the process of representing abstractions of real-world information structures in a 
database, somewhat specialized to the data structures available in Cypress.  In Section 4.3, a working program is illustrated.
Our example is necessarily short; don't expect any startling revelations on these pages.  We will try to consider some of the most 
common cases, however.
4.1 A database application
What are the properties of a well-designed database?  To a large extent these properties follow from the general properties of 
databases.  For instance, we would like our databases to extend gracefully as new types of information are added, since the 
existing data and programs are likely to be quite valuable.
It may be useful to consider the following point.  The distinguishing aspect of information stored in a database system is that at 
least some of it is stored in a form that can be interpreted by the system itself, rather than only by some application-specific 
program.  Hence, one important dimension of variation among different database designs is in the amount of the database that is 
system-interpretable, i.e. the kinds of queries that can be answered by the system.
As an example of variation in this dimension, consider the problem of designing a database for organizing a collection of Mesa 
modules.  In the present Mesa environment, this database would need to include at least the names of all the definitions modules, 
program modules, configuration descriptions, and current .Bcd files.  A database containing only this information is little more 
than a file directory, and therefore the system's power to answer queries about information in this database is very limited.  A 
somewhat richer database might represent the DIRECTORY and IMPORTS sections of each module as relationships, so that queries such 
as "which modules import interface Y?" can be answered by the system.  This might be elaborated further to deal with the use of 
individual types and procedures from a definitions module, and so on.  There may be a limit beyond which it is useless to represent 
smaller objects in the database; if we aren't interested in answering queries like "what procedures in this module contain IF 
statements?", it may be attractive to represent the body of a procedure (or some smaller naming scope) as a text string that is not 
interpretable by the database system, even though it is stored in a database.   
We shall illustrate design ideas with a database of information about documents.  Our current facilities, which again are simply 
file directories, leave much to be desired.  The title of a document on the printed page does not tell the reader where the 
document is stored or how to print a copy.  Relationships between different versions of the same basic document are not explicit.  
Retrievals by content are impossible.  Our goal here is not to solve all of these problems, but to start a design that has the 
potential of dealing with some of them.
4.2 Schema design
Each document in our example database has a title and a set of authors.  Hence we might represent a collection of documents with a 
domain of entities whose name is the title of the document, and an author property specifying the authors:
Document: Domain = DeclareDomain["Domain"];
dAuthors: Property = DeclareProperty["author", Document, RopeType];
Here the authors' names are concatenated into a single string, using some punctuation scheme to allow the string to be decoded into 
the list of authors.  This is a very poor database design because it does not allow the system to respond easily to queries 
involving authors; the system cannot parse the encoded author list.
Note that in the above definition authors are strings, so anything is acceptable as an author.  This weak typing has some 
flexibility:  the database will never complain that it doesn't know the author you just attached to a certain document.  However, 
the system is not helpful in catching errors when a new document is added to the database.  If "Mark R. Brown" is mistakenly 
spelled "Mark R. Browne", then one of Mark's papers will not be properly retrieved by a later search.  A step in the direction of 
stronger type checking is to provide a separate domain for authors.
To represent authors as entities, and to allow a variable number number of authors for a document, a better design would be:
Document: Domain = DeclareDomain["Domain"];
Person: Domain = DeclareDomain["Person"];
author: Property = DeclareProperty["author", Document, Person];
Incidentally, in the last line above we define a property rather than relation for brevity.  Instead of the author property 
declaration we could have written:
author: Relation = DeclareRelation["author"];
  authorOf: Attribute = DeclareAttribute[author, "of", Document];
  authorIs: Attribute = DeclareAttribute[author, "is", Person];
The property declaration has exactly the same effect on the database as the relation declaration, since it automatically declares 
an author relation with an "of" and "is" attribute.  However, the relation is not available in a Cedar Mesa variable in the 
property case, so operations such as RelationSubset cannot be used.  Therefore most non-trivial applications will generally not use 
DeclareProperty, but will still use operations such as SetP by using one of the relation's attributes.  For example, if joe is a 
person entity and book is a document entity, one can write:
SetP[joe, authorOf, book]
or, equivalently,
SetP[book, authorIs, joe]
We now have one Document entity per document, plus, for each document, one author relationship per author of that document.  
Conversely, we have one Person entity per person, and one author relationship per document the person authored.  Of course, these 
are one and the same author relationships referencing the Person and Document entities.  Figure 4-1 illustrates a few such entities 
and relationships.  Each author relationship points to its Document entity via the authorOf attribute, and to its Person entity via 
the authorIs attribute.
Frequently a database application requires some representation of sets or lists, for example to represent the people in an 
organization or steps in a procedure.  Sets and lists are not primitives of the data model per se; sets and lists are normally 
represented as relations.  For example, in our database the set of authors of a particular document is stored via a set of author 
relationships referencing the document.  The operation
GetPList[book, authorOf]
could be used to retrieve this list for some particular book.  If we wish to maintain an ordering on this set, e.g. so that the 
authors of a book are kept in some particular order for each book, we need to use some list representation.  In our Cypress 
implementation GetPList (and RelationSubset) return relationships in the same order they were created by SetP, SetPList, or 
DeclareRelship, so that a client may maintain an order by the order of calls.  A variant of SetF and SetP is under consideration 
that allows the client to specify where new relationships should be placed in the ordering of relationships referencing a 
particular entity.  Another alternative, the one conventionally used in the Relational model, is to define another attribute to the 
relation specify position in the ordering:
  authorOrder: Attribute = DeclareAttribute[author, "order", IntType];
Using an ordering attribute is usually a better solution than depending on the semantics of the Cypress implementation's ordering, 
as it makes the ordering explicit in the relation.  In databases where a large number of relationships are expected to refer to the 
same entity, it is also more space efficient in our implementation.
If an authorOrder attribute is defined, the client may wish to redefine the authorOf attribute so that links (pointers) are not 
maintained between the Document entities and author relationships, instead defining a more space-efficient B-tree index on the 
[authorOf, authorOrder] pair:
authorOf: Attribute = DeclareAttribute[
  relation: author,  name: "of",  type: Document,  link: FALSE];

authorIndex: Index = DeclareIndex[author, LIST[[authorOf, authorOrder]]];
The Cypress implementation will use this index to process any call of the form
RelationSubset[author, LIST[[authorOf, x]]
This call to RelationSubset will therefore enumerate authors of document x sorted by authorOrder.  Cypress will also use the index 
in processing for GetPList[..., authorOf] as GetPList uses RelationSubset.
This solution is also somewhat less than perfect, as it depends upon the fact that the Cypress implementation orders relationships 
when an index exist; but indices are not intended to change the semantics of the operations, only to improve performance.  Probably 
the best solution, if the ordering is important to the semantics of a database application, is to represent a list by a binary 
"next" relation connecting the entities in an ordering.
Documents have other interesting properties.  Some of these, for example the date on which the document was produced, are in 
one-to-one correspondence with documents.  Such properties can be defined by specifying a relation or property as being keyed on 
the document:
publDate: Property = DeclareProperty["publDate", Document, RopeType, Key];
We are using the convention that domain names are capitalized and relation, attribute, and property names are not capitalized, both 
for the Cedar Mesa variable names and in the names used in the database system itself.  If and when the database system is better 
integrated with Cedar Mesa, the Cedar and database names will be one and the same.
We might wish to include additional information for particular kinds of documents, for example conference papers.  Conference 
papers may participate in the same relations as other documents.  For example, they have authors.  In addition, we may want to 
define relations in which only conference papers may participate, for example a presentation relation which defines who presented 
the paper, and where.  We can define a conference paper to be a sub-domain of documents, and define relations which pertain 
specifically to conference papers:
ConferencePaper: Domain = DeclareDomain["ConferencePaper"];
... DeclareSubType[of: Document, is: ConferencePaper]; ...
Conference: Domain = DeclareDomain["Conference"];
presentation: Relation = DeclareRelation["presentation"];
  presentationOf: Attribute = DeclareAttribute["of", presentation, ConferencePaper];
  presentationAt: Attribute = DeclareAttribute["at", presentation, Conference];
  presentationBy: Attribute = DeclareAttribute["by", presentation, Person];
Figure 4-2 illustrates a fragment of a database using this extended design.
The reader will note that we have defined our database schema in the functionally irreducible form described in Section 2.7:  i.e., 
the relations have as few attributes as possible so as to represent atomic facts.  This normalization is not necessary in the 
design of a schema, but often makes the database easier to understand and use, and avoids anomalies in data updates as a result of 
redundantly storing the same information.  Note that the presentation relation is an example of a functionally irreducible relation 
that is not binary.  It cannot be decomposed into smaller relations without losing information or introducing artificial entities 
to represent the presentations themselves. 
What other information should be present in our document database?  Subject keywords would certainly be useful.  Since one document 
will generally have many associated keywords, we would introduce another relation, say docKeyword, to represent the new 
information.  Should keywords be entities?  Again there is a tradeoff, but the argument for entities seems persuasive:  limiting 
the range of keywords increases the value of the database for retrieval.  The keyword entities could also participate in 
relationships with a dictionary of synonyms, Computing Reviews categories, etc.
This is certainly not a complete design, and the reader is encouraged to fit his or her own ideas into the database framework.
CypressFig6.press  leftMargin: 1 in,  topMargin: 1 in,  width: 2.125 in,  height: .75 in
4.3 Example
The following program defines a small schema and database of persons and documents.  It illustrates the use of most of the 
procedures defined in Section 3.
DocTestImpl: PROGRAM
  IMPORTS DB, IO, Rope =

BEGIN OPEN IO, DB

tty: IO.Handle_ CreateViewerStreams["VLTest1Impl.log"].out;

Person, Conference: Domain;
Thesis, ConferencePaper, Document: Domain;
author: Relation;
  authorOf, authorIs: Attribute;
presentation: Relation;
  presentationOf, presentationBy, presentationDate: Attribute;
publDate: Attribute;
rick, mark, nori: --Person-- Entity;
cedarPaper, cypressDoc, thesis: --Document-- Entity;

Initialize: PROC =
  BEGIN
  tty.PutF["Defining data dictionary...\n"];
  -- Declare domains and make ConferencePapers and Theses be subtypes of Document:
  Person_ DeclareDomain["Person"];
  Conference_ DeclareDomain["Conference"];
  Document_ DeclareDomain["Document"];
  ConferencePaper_ DeclareDomain["ConferencePaper"];
  Thesis_ DeclareDomain["Thesis"];
  DeclareSubType[of: Document, is: ConferencePaper];
  DeclareSubType[of: Document, is: Thesis];
  -- Declare publDate property of Document
  publDate_ DeclareProperty["publDate", Document, IntType];
  -- Declare author relation between Persons and Documents
  author_ DeclareRelation["author"];
   authorOf_ DeclareAttribute[author, "of", Document];
   authorIs_ DeclareAttribute[author, "is", Person];
  -- Declare presentation relation
  presentation_ DeclareRelation["presentation"];
   presentationOf_ DeclareAttribute[presentation, "of", Document];
   presentationBy_ DeclareAttribute[presentation, "by", Person];
   presentationAt_ DeclareAttribute[presentation, "at", Conference];
  END;

InsertData: PROC =
  BEGIN t: Relship;
  tty.PutF["Inserting data...\n"];
  cedarPaper_ DeclareEntity[ConferencePaper, "The Cedar DBMS"];
  cypressDoc_ DeclareEntity[Document, "Cypress DB Concepts & Facilities"];
  thesis_ DeclareEntity[Thesis, "An Analysis of Priority Queues"];
  sigmod_ DeclareEntity[Conference, "SIGMOD 81"];
  rick_ DeclareEntity[Person, "Rick Cattell"];
  mark_ DeclareEntity[Person, "Mark Brown"];
  -- Note we can create entity and then set name...
  nori_ DeclareEntity[Person];
  ChangeName[nori, "Nori Suzuki"];
  -- Data can be assigned with SetP, SetF, or DeclareRelship's initialization list:
  t_ DeclareRelship[presentation,, NewOnly];
    SetF[t, presentationtOf, cedarPaper];
    SetF[t, presentationBy, mark];
    SetF[t, presentationAt, sigmod];
  []_ SetPList[cypressDoc, authorIs, LIST[rick, mark]];
    -- the Cedar notation LIST[ ... ] defines a list
  []_ SetPList[cedarPaper, authorIs, LIST[rick, mark, nori]];
  []_ DeclareRelship[author, LIST[[authorOf, thesis], [authorIs, mark]]];
  []_ SetP[cypressDoc, publDate, I2V[1982]];
  []_ SetP[thesis, publDate, I2V[1977]];
    -- the I2V[...] calls needed because Cedar Mesa does not yet coerce INT to REF ANY
  -- Check that thesis can't be presented at conference:
  ok_ FALSE;
  t_ DeclareRelship[presentation];
  SetF[t, presentationOf, thesis
    ! MismatchedAttributeValueType => {ok _ TRUE; CONTINUE}];
  IF NOT ok THEN ERROR;
  END;

DestroySomeData: PROCEDURE =
  -- Destroy one person entity and all frog entities
  BEGIN flag: BOOL_ FALSE;
  tty.Put[char[CR], rope["Deleting Rick from database..."], char[CR]];
  DestroyEntity[DeclareEntity[Person, "Frank Baz", OldOnly]];
  DestroyDomain[Frog];
  END;
  
PrintDocuments: PROC =
  -- Use DomainSubset with no constraints to enumerate all Documents
  BEGIN
  doc: -- Document -- Entity;
  authors: LIST OF Value;
  es: EntitySet;
  tty.PutF["Documents:\n\n"];
  tty.PutF["Title                   authors\n"];
  es_ DomainSubset[Document];
  WHILE (doc_ NextEntity[es])#NIL DO
    tty.PutF["%g          ", rope[GetName[doc]]];
    authors_ GetPList[doc, authorIs];
    FOR al: LIST OF Entity_ NARROW[authors], al.rest UNTIL al=NIL DO
      tty.PutF["%g ", rope[GetName[al.first]]]] ENDLOOP;
    ENDLOOP;
  ReleaseEntitySet[es];
  END;

PrintPersonsPublications: PROC [pName: ROPE] =
  -- Use RelationSubset to enumerate publications written by person
  BEGIN
  p: Person_ DeclareEntity[Person, pName, OldOnly];
  authorT: --author-- Relship;
  rs: RelshipSet;
  first: BOOL_ TRUE;
  IF p=NIL THEN
    {tty.PutF["%g is not a person!", rope[pName]]; RETURN};
  tty.PutF["Papers written by %g are:\n", rope[pName]];
  rs_ RelationSubset[author, LIST[[authorIs, p]]];
  WHILE (authorT_ NextRelship[rs])#NIL DO
    IF first THEN first_ FALSE ELSE tty.Put[rope[", "]];
    tty.Put[rope[GetFS[authorRS, authorOf]]];
    ENDLOOP;
  tty.PutF["\n"];
  ReleaseRelshipSet[rs];
  END;


tty.Put[rope["Creating database..."], char[CR]];
Initialize[];
DeclareSegment["[Local]Test", $Test, 1,, NewOnly];
OpenTransaction[$Test];
Initialize[];
InsertData[];
PrintDocuments[];
PrintPersonsPublications["Mark Brown"];
DestroySomeData[];
PrintDocuments[];
CloseTransaction[TransactionOf[$Test]];
tty.Close[];


END.