8. Status and conclusions

8.1 Summary

This document has described the Model level of the Cypress Database Management System, including the data model, the background and motivation for choosing that model, the client interface to the system, its implementation, and the database environment and applications for which it was implemented. The guiding principles in the design and implementation have been simplicity and utility for the set of applications we envision.

The result is a model that includes features of the relational model and distillations of desirable features of more recent semantic models. The model includes the concept of entities with unique names, a hierarchy of types of entities, types and uniqueness constraints on relation attributes, relational views, and a logical segmenting mechanism that can be used to facilitate physical distribution and independence of databases. This report discusses a number of issues in the implementation of these features, which, to the author's knowledge, are not present in any existing database system. The Cypress data model alleviates the problems motivating its development, reducing the quantity of data modelling mechanism built anew for each application, and simplifying the sharing of databases between applications. The Cypress model has also enabled the development of general-purpose tools formerly impractical due to the lack of type information and integrity checking in the Cedar database system.

8.2 Some results

Some overall statistics on the performance of the initial implementation may be helpful here. We developed two benchmark programs, one write-intensive and one read-intensive, to examine performance.
Average times for the most common operations are roughly:

      1 ms: GetF
      1 ms: NameOf
    0.5 ms: NextEntity
    0.5 ms: NextRelship
      5 ms: DomainSubset
      7 ms: RelationSubset
     10 ms: SetF
     10 ms: ChangeName

These times are approximately in order of decreasing frequency of calls by the benchmarks, and include overhead at all levels of the database and file systems. The times were taken on the Xerox Dorado, a personal super-computer with a micro-cycle time of about 60 ns. Note that since most of the Cypress operations are disk-limited, the times increase only somewhat for slower processors. The times shown above vary widely with a particular application's data schema and access patterns, so these numbers should be regarded as very rough averages. Some effects of particular optimizations and schema changes were enumerated in Section 6.

DESIGN AND IMPLEMENTATION OF A RELATIONSHIP-ENTITY-DATUM DATA MODEL

A significant result of our work is that the type checking required by the data model is not a large overhead in the Cypress implementation. Because we do not compile database accesses, we must check in the implementation of every operation, e.g. SetF, that the arguments passed are of the proper and coordinated types. On a SetF, for example, we must check that: (1) the arguments are a relationship, attribute, and value, respectively; (2) the attribute is of the same relation as the relationship; (3) the value is of the same type as the attribute; and (4) a key value constraint would not be violated by the new value. The caching of information about attributes improves the performance of the first three of these considerably. Without this caching, the GetF operation takes approximately 8 times as long.

A closer analysis of the time spent in a typical read operation, e.g. GetF, is enlightening.
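
Before turning to that analysis, the four SetF checks listed above can be made concrete with a minimal sketch. It is written in Python purely for illustration and is not the Cypress interface: the names `Database`, `Relship`, `Attribute`, `ModelError`, `create_relship` and `set_f` are hypothetical, Python types stand in for Cypress value types, and the in-memory `Attribute` objects only loosely play the role of the cached attribute information mentioned above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


class ModelError(Exception):
    """Raised when a set_f call fails one of the four checks."""


@dataclass(frozen=True)
class Attribute:
    # Hypothetical attribute description, held in memory.
    name: str
    relation: str       # name of the relation the attribute belongs to
    value_type: type    # Python type standing in for the attribute's type
    is_key: bool = False


@dataclass
class Relship:
    # Hypothetical relationship (tuple) of a named relation.
    relation: str
    fields: Dict[str, Any] = field(default_factory=dict)


class Database:
    def __init__(self) -> None:
        self.relships: List[Relship] = []
        # (relation, attribute name, value) -> relship holding that key value
        self.key_index: Dict[Tuple[str, str, Any], Relship] = {}

    def create_relship(self, relation: str) -> Relship:
        relship = Relship(relation)
        self.relships.append(relship)
        return relship

    def set_f(self, relship: Any, attribute: Any, value: Any) -> None:
        # (1) The arguments are a relationship, an attribute, and a value.
        if not isinstance(relship, Relship) or not isinstance(attribute, Attribute):
            raise ModelError("expected (relship, attribute, value)")
        # (2) The attribute is of the same relation as the relationship.
        if attribute.relation != relship.relation:
            raise ModelError("attribute belongs to a different relation")
        # (3) The value is of the same type as the attribute.
        if not isinstance(value, attribute.value_type):
            raise ModelError("value has the wrong type for this attribute")
        # (4) A key value constraint would not be violated by the new value.
        if attribute.is_key:
            new_key = (attribute.relation, attribute.name, value)
            owner = self.key_index.get(new_key)
            if owner is not None and owner is not relship:
                raise ModelError("key value already in use")
            if attribute.name in relship.fields:
                old_key = (attribute.relation, attribute.name,
                           relship.fields[attribute.name])
                self.key_index.pop(old_key, None)  # release the old key value
            self.key_index[new_key] = relship
        relship.fields[attribute.name] = value
```

In this sketch, checks (1) to (3) touch only the in-memory attribute description, while check (4) needs an index lookup. That split is a property of the sketch, not a statement about where the Cypress implementation spends its time.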
For our benchmark programs, the time breakdown was roughly as follows:

    10%  model level consistency checking and access path selection
    20%  storage level operation: actual read or update of data
    50%  waiting for disk operation (cache miss)
    20%  other overhead (page faults, garbage collection)

Again, these proportions can vary widely with the particular application.

8.3 Status and plans

The first implementation of the Model level was completed in December of 1981, and was exercised and debugged through 1982. Approximately six man-months went into its development. This implementation includes essentially all data model features except views and augments. Views have been deferred to the development of the Query level.

Plans for the near future are to concentrate on the development of more applications, continuing the work sketched in Section 7.

Acknowledgements

Nori Suzuki and Mark Brown participated in the design and implementation of the original Cedar Database System. Peter Deutsch and Jerry Popek assisted in the design phase. Eric Bier assisted in the initial implementation of the Model level, as well as providing a useful sounding board for these ideas. Willie-Sue Haugeland co-implemented Walnut. Jim Donahue developed Hickory. Willie-Sue Haugeland, John Maxwell, and Jim Donahue have all helped with the Squirrel system. Mark Brown has maintained and elaborated the Cypress Storage level and was simultaneously the central designer and implementor of the Alpine file system, which Cypress depends on for remote data storage.

Samuel Feldman, Dennis McLeod, Jim Donahue, Mark Brown, John Maxwell, Butler Lampson, William Kent, Ken Keller, Dushan Badal, and Peter Deutsch provided useful feedback on drafts of all or part of this document as it evolved over the last year.
Kathi Anderson and Subhana Menis helped prepare this report for publication.

ModelLevelDesign8.bravo, Cattell, May 17, 1983 2:27 PM