A Paradigm for Editing, Browsing, and Query in Database Applications

R. G. G. Cattell, J. E. Donahue
Computer Science Lab, Xerox PARC
Palo Alto, California 94304, USA

Abstract

We describe a set of database tools and applications, and what we have learned from building and using them. Our applications follow a simple common paradigm for database access; we argue that this commonality presents the user with a uniform set of database facilities rather than independent programs for each application. We also show how a conventional relational viewpoint on data can be combined with a dual entity-centric, "object-oriented" viewpoint. Our user interface paradigm incorporates browsing, query-by-example, and forms-based entry. We use a database system incorporating extensions to the entity-relationship model, providing transaction-based access to data stored in servers on our personal computer network. We found that constructing interactive database applications that recover gracefully from transaction aborts requires careful software structure design. The applications and tools have been in use for over a year; we relate both positive and negative results of our experience.

0. Systems Environment

In the Computer Science Laboratory at Xerox PARC, we have been working on the construction of a database system, a collection of database applications, a set of database support tools, and a powerful programming environment in which all of these systems run. This paper centers on our experience with database applications and support tools. A number of database applications are important for our personal computing environment. Electronic mail is a central component of the environment at PARC, where a large number of powerful personal computers are linked together by an internetwork of local networks and gateways. An electronic mail application is now in regular use, while its further development continues.
Another application is an experimental replacement for the "whiteboards" that we have on our office walls. In this application, database entities and ordinary text can be displayed on the user's screen in the fashion of notes on a whiteboard; they can be moved around on the whiteboard to present just the spatial structure the user finds convenient. A telephone directory application has been developed, integrated with hardware to automatically dial telephone numbers obtained from a database. An electronic calendar application was developed to store and provide reminders for various events. Our paper is broken into five sections. In this section, we describe the user's workstation, our database system, and the software environment, to put our work in context. In the next section, we discuss some of the problems to be addressed in developing database applications in this environment. In Section 2, we present the user's view of our database applications, with example screen displays. In Section 3, our implementation of the applications is described in some detail. Finally, we summarize our conclusions in the last section. The programming environment, Cedar, will be described elsewhere [Teitelman 1984]; here we give only enough details of the underlying software and hardware to put our work in context. The typical Cedar user has a powerful personal workstation, a Dorado [Pier 1983], although Cedar also runs on lower-performance computers. The large-format display is broken up into a number of non-overlapping windows, some full-size and some iconic (the figures at the end of the paper give many examples of actual Cedar screens). A pointing device (a three-button "mouse") can be used to move a cursor around on the screen and to select text or invoke operations (clicking a "button"); the three buttons on the mouse are called the red, yellow, and blue buttons. 
Cedar provides a high degree of concurrency in the operation of running programs; thus, the user can be reading his mail with one program while editing a file with another and running the compiler in a third. To allow this concurrency without making life hopelessly confusing, the Cedar screen allows input to only one window at a time; the "input focus" must be set (using the mouse) before the destination for type-in changes.

The database system upon which our work is built, Cypress, is described in detail in [Cattell 1983, Brown et al 1981]. For the purposes of understanding this paper, it suffices to understand the two primitive data types the database system provides: entities and relationships. An entity represents a real or abstract object, such as a message, event, or person. Relationships represent information about entities or connections between them; they are records whose fields are primitive values (Booleans, integers or strings) or database entities. For example, relationships might specify a person's birthdate or spouse. Every entity has a type; we call a type of entity a domain. Entities also have unique names. Note that entities are atomic identifiers, not records as in [Chen 1976]; all information except the unique entity name is represented via relationships, as in the pure relational model [Codd 1970]. As in the relational model, every relationship has a type (in essence, the types of all of its fields); we call a type of relationship a relation.

A Cypress database consists of a number of segments; basically each application has its own segment. Segments provide firewalls between applications: while an application can run amok and clobber the data in its own segment, it cannot make it necessary for all of the segments of the database to be reconstructed upon a failure. However, we pay something for this independence of segments. Applications that wish to share data must cooperate in sharing segments.
Between-segment data references must be made via some unique externally-defined identifier, normally entity names. Cypress accesses data through transactions. However, transactions are not implemented as part of Cypress, but rather at the level of an underlying file server on the network. We have in fact used a variety of transaction-based file systems under Cypress. We are at present using Alpine [Brown et al 1984], as it provides better performance than the earlier file systems. 1. Application Development Issues In this section we describe some of the problems we set out to address in our work, and even more problems that we did not anticipate but nevertheless found it important to address. Integration Our first and most important goal was the integration and simplification of database applications, to use only a few easily-learned primitives. For a variety of reasons, software typically exhibits less integration than one might desire. We wanted to establish a higher common denominator between the parts without requiring much more design on the part of the programmers, by using common conventions for use and coordination of the display, database, and input. We recognize two kinds of "integration" in the context of database application: 1. Coordination in the use of the underlying database. This we will call "internal integration"; the various applications making up the current "database system" must be able to communicate with one another, e.g. so that an application can be notified if a portion of the database has changed. In a successful system, the user will be unaware of how this internal integration is performed; if it fails, however, he will be painfully aware of its absence. 2. A uniform set of commands for performing operations. This we will call "external integration"; it is what makes various applications look less like a collection of independent programs and more like different aspects of a single system. 
External integration is simplified by, though not impossible without, internal integration.

We have observed two characteristics of our applications that are important to the view of integration we have taken:

1. Applications require considerable flexibility in their display format. Thus, we have not designed a "forms editor" [Rowe and Shoens 1982] for developing application packages. Each application has been developed to display information in the format most natural for its own purposes. Thus a calendar system should be able to display months much as they appear on a wall calendar, while a message system displays plain text. When high-performance machines are connected to bitmap displays, it seems far wiser to use the medium of the display to present information in a familiar form, rather than forcing it into the constraints of some simpler or non-graphical format. Indeed, we might like the display layout to be specified dynamically by the user.

2. Applications must allow entities to be related to one another in fairly unstructured fashion, yet quickly be able to get to one entity from another. The applications are all replacements for common objects in our offices, so they have to have the same "unstructured" feel as our whiteboards, calendars and files of messages. A calendar system clearly needs to be able to associate messages with calendar entries. We would like to be able to easily "browse", e.g. from a message file to message to calendar entry.

Concurrency

Loose Coupling

Cross-application References

Transactions

Display Update

A change to a database may cause data displayed on the screen to become invalid. Handling these changes is particularly difficult in an environment of multiple windows and multiple processes. Although the user may have invoked the database change by a command upon some particular display window, the change may affect related data displayed in other windows.
If the database change was invoked by another user (this is exhibited as a transaction abort), or spontaneously invoked by a background process such as automatic fetching of new mail, then the display update would have to be asynchronous with any actions on the user's part. This asynchrony might be confusing to the user. Finally, it may be difficult or undesirable to maintain consistency of all windows. For example, if a window represents the result of evaluating a complex query, we probably do not want to pay the cost of re-evaluating the query on every update, unless an adequate incremental evaluation scheme can be developed. We would like some simple mechanism by which different applications, and different parts of a single application, can coordinate in such a way as to give the user a consistent view of the database. It may be necessary to invent a new mechanism such as marking certain windows as potentially invalid.

Reliability

Some applications have a very high requirement for reliability: if a transaction aborts due to some failure or lock conflict, it may not be possible or convenient for the user or program to repeat the offending operation. An example is electronic mail. People would be quite unhappy if they lost some of their new messages. Some mechanism to log and redo operations is needed.

Large Data Blocks

Some database applications must deal with large blocks of data that have no structure as far as the database system is concerned, i.e. they are interpreted by the application. For example, images to be displayed on the screen may be stored as a single unit, as we do not perform database operations within that unit. Documents might be stored in a database as a single unit, also. Although Cypress has the capability to store large chunks of uninterpreted data, objects larger than a file page are currently better stored elsewhere to improve storage allocation efficiency. Thus, another important issue we must address is the handling of large data blocks.
Alternatives are to extend the functionality of the database system, or provide some auxiliary common way for applications to store such data.

2. Database Applications and Tools

Before proceeding to describe the implementation of database applications and our responses to the problems enumerated in the previous section, we will describe how our applications appear to the user. The examples we present will illustrate some of our points with respect to integration and concurrency, and set the stage for the solutions presented in Section 3.

The display paradigm

Our application user interface paradigm is based upon three generic kinds of windows on the screen: displayers, editors, and queriers. A displayer window displays a database entity, provides a menu of commands upon it, and shows the information the database contains about the entity. For example, for an electronic message displayer, the message's header and body are displayed, along with commands such as Answer, Forward, and Delete. An editor looks much like a displayer, but allows modification of the information about the entity and usually has a different set of commands. For example, a message editor is created in response to a NewForm or Answer command in the Walnut system; the user fills in the fields and invokes the message editor's Send or File command. A querier also looks like an editor, but the user may fill in the fields with values, boolean expressions of values, or other application-interpreted information; the querier represents all entities in its domain which satisfy the given constraints. When the user invokes the Query command on a querier, the entities satisfying the constraints are displayed and may be browsed or printed. Each database application program implements displayers, editors, and/or queriers that are appropriate for that application. For example, the message system implements message and message-set displayers, message editors, and message queriers.
Figure 1 illustrates a variety of types of displayers. Note how each application takes best advantage of the display for the particular type of entity involved. Figure 1a is a message displayer, implemented by the mail application. Figure 1b shows a picture displayer, implemented by a simple image application. Figure 1c shows a whiteboard displayer, used for spatial data organization. And finally, Figure 1d shows a "default" displayer which is invoked when no particular application knows how to deal with an entity of a given type (in this case, a person). We will return to the default displayer in Section 3. Figure 2 contrasts a displayer, editor, and querier for messages. We will refer to displayers, editors, and queriers collectively as "nuts". A nut always has a unique associated domain (entity type). If a nut is a displayer or editor, it is also associated with a single entity in the database. Most of the mouse buttons have common interpretations supported by applications. For example, the yellow button always means "open", regardless of application. Figure 3 a-d illustrates how the yellow button may be used to browse through entities in the database, even across application boundaries. This interpretation of the yellow button is also consistent with its common use in the Cedar system -- when the yellow mouse button is clicked over an icon, it "opens" to become a full-sized window. The interpretation of other mouse buttons is illustrated in Figure 4. The red button always means "select": a message in the message set in Figure 4a has been selected with the red button. The blue button always means "extend selection": the selection made in Figure 4a would be extended to include several messages with the blue button. If the red button and control key are both pressed, the object at which the cursor is pointing is deleted. This could be used to delete the message in Figure 4b from the message set in Figure 4a. 
If the red button and the shift key are both pressed, the object at which the cursor points is copied to the current input focus; this is useful for moving entities or text around. Thus each button, in combination with the shift and control keys, has a common primitive meaning across all applications. The complete interpretation of buttons is as follows.

Button        Interpretation
Red           Select the entity or other value at the current mouse cursor position
Blue          Extend a selection made by red to the cursor position
Yellow        Open a new displayer window for the entity at the cursor position
Ctrl-Red      Delete the relationship at the cursor position
Ctrl-Yellow   Expand the selected entity in place (rather than creating a new window)
Shift-Red     Insert the entity or string value at the cursor into the current input focus
Shift-Yellow  Open the selected entity as a full-sized window, closing other windows
Others        Currently application-defined

Database applications

Our first application of the database system was Walnut, a collection of facilities for sending, receiving, filing, and querying electronic messages. Walnut provides four kinds of windows visible to the user:

1. Control window: This window is the one that appears when Walnut is started. It provides menu commands to fetch new mail, to create a new message form (item 4 below) and to create a new "mail file" (item 2 below, which we call a message set). The control window also reports when new mail is available, or an error is recognized by the Walnut system.

2. Message-set displayer: This window displays a list of messages, their senders, dates, and subjects. There is a special "active" message set into which new mail is initially inserted. Yellow-selecting one of the messages produces an instance of item 3 below, a message displayer. A menu of commands at the top allows removing messages from the message set and/or adding them to other message sets. Figure 3b illustrates a message set displayer.

3. Message displayer: This window displays a message in the database. The menu of commands at the top includes Answer and Forward, which produce an appropriately initialized instance of a message form (item 4 below). Figure 3c illustrates a message displayer. Figure 3d is the result of yellow-selecting the sender of the message in Figure 3c.

4. Message editor: This window displays a message form which the user can edit. Some of the fields will be initialized according to whether the message editor window was created by the "Answer", "Forward", or "New Form" commands. Figure 2b illustrates a message editor.

Another application, "Grapenut", provides a database of people, organizations, phone numbers, and addresses. There are both public and private versions of the Grapenut "whitepages" directory; both are queried on user requests, thus appearing as if they were a single database.

Another database application is Hickory, a calendar and alarm clock application for recording appointments and reminding the user of them; it implements "event", "event-set", "day", "month", and "year" entity displayers. Hickory uses a hierarchy of event domains to define different types of events, such as meetings, seminars and trips. The user may enter particular events or sets of periodic events (e.g., a seminar every Tuesday at 2 o'clock). When the client views a particular day he sees the events scheduled for that day.

Another application, "Whiteboards", provides functionality useful in conjunction with all of the other applications. The Whiteboard package provides a mechanism to organize arbitrary entities in a tree-structured, spatial dimension. The user may browse by moving up and down a hierarchy of "whiteboard" displayers whose ultimate leaf nodes are non-whiteboard entities. An example of a whiteboard display was given in Figure 1d. Message and calendar entities, or arbitrary text may be placed on whiteboards.
Cedar program icons can even be placed on whiteboards, causing the appropriate program to be loaded and run when they are "opened" another day. Conventional text files can be placed on whiteboards; a Cedar text file viewer is created when the icon is opened. A database including several dozen hierarchically-organized whiteboards has been used to allow users to browse through Cedar documentation and run Cedar utilities. Thus our system provides two kinds of browsing: 1. "Ground-level" browsing in a network of relationships by selecting adjacent entities, e.g. selecting a day in a month to get a displayer on that day, or a message in a message set to display the message. This basic browsing functionality, illustrated in Figure 3, is provided in all applications and in the default displayer windows. 2. "Tree" level browsing by moving up and down a hierarchy of "whiteboard" displayers (an example of a whiteboard display was given in Figure 1d); this adds a vertical dimension to (1). We call this "tree" level browsing, although strictly the hierarchy could be a cyclic directed graph. General Database Tools In addition to the specific applications we have described, we implemented a general-purpose application, "Squirrel", providing: 1. The default displayer, editor, and querier windows. These are useful to an application-builder for debugging, and to an end user for maintaining and examining databases for which no application has been built (we have used them for address books, wine cellar inventories, etc.). 2. A facility to dump all or part of databases to disk files in an application-independent human-readable textual form, and to load databases in the same format. 3. A facility to erase all or part of a database, e.g. in preparation for reloading it from a dump file. 4. Global functions on databases, e.g. opening and closing database transactions. 5. Displayer and editor windows for data schema entities, i.e. the data types. 
These allow the actual types of entities and types of legal relations to be modified. Existing data is automatically updated to conform to the new structure.

The Squirrel window on the screen allows the user to invoke these operations; it appears as in Figure 5. The buttons at the top of the window provide the following functions:

1. Reset: aborts the current database transaction.
2. Save: commits the current database transaction.
3. Open: opens the new database specified in the Database field.
4. Close: closes the current database.
5. Dump: dumps all or part of a database to a disk file.
6. Load: reads a disk file produced by Dump into the current database.
7. Debug: bypasses the registration scheme, causing Squirrel rather than application displayers to be used. The current set of registered applications is remembered and restored upon applying Debug again.
8. Erase: erases an entity, domain, relation, or an entire database segment.

The three buttons labelled Display, Edit, and Query create new displayer, editor, and querier windows respectively. Display and Edit require that both the Domain and entity Name fields be specified. For Display, the entity must already exist. Query uses only the Domain field.

An instance of the default entity displayer was illustrated in Figure 3d. It shows the domain and name of the entity displayed at the top of a window, and all the relationships in the database which reference that entity are shown as the main body of the window. The relationships are displayed in the form

    relation attribute1: value1 attribute2: value2 ... attributeN: valueN

where relation is the relationship's relation and the attribute: value pairs specify the values of its attributes. The attribute which references the displayer's entity is not shown, as it is normally redundant (all of the relationships in the displayer for an entity have at least one attribute which references the entity or they would not be displayed).
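The display rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Squirrel implementation: the `Entity` and `Relationship` classes, the function names, and the header format are all hypothetical, and a relationship is simplified to a relation name plus a tuple of (attribute, value) pairs.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for a database entity: a domain plus a unique name."""
    domain: str
    name: str


@dataclass(frozen=True)
class Relationship:
    """Hypothetical stand-in: a relation name and (attribute, value) pairs."""
    relation: str
    attributes: Tuple[Tuple[str, object], ...]


def format_relationship(rel, shown):
    """Render 'relation attr: value ...', omitting attributes that reference `shown`."""
    parts = [rel.relation]
    for attr, value in rel.attributes:
        if value == shown:
            continue  # redundant: this is the entity the window is about
        # Entity values are shown by name; primitive values are shown as text.
        text = value.name if isinstance(value, Entity) else str(value)
        parts.append(f"{attr}: {text}")
    return " ".join(parts)


def default_display(entity, relationships):
    """Return the lines of a default displayer: a header, then referencing relationships."""
    lines = [f"{entity.domain}: {entity.name}"]
    for rel in relationships:
        if any(value == entity for _, value in rel.attributes):
            lines.append(format_relationship(rel, entity))
    return lines


alice = Entity("Person", "Alice")
bob = Entity("Person", "Bob")
rels = [
    Relationship("spouse", (("of", alice), ("is", bob))),
    Relationship("birthdate", (("of", alice), ("date", "1 Jan 1950"))),
    Relationship("birthdate", (("of", bob), ("date", "2 Feb 1951"))),
]
for line in default_display(alice, rels):
    print(line)
```

One simplification to note: if a relationship references the displayed entity in more than one attribute, this sketch hides all of them.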
The other attributes are displayed in the obvious way for string, integer, and boolean values. For entity values, the name of the entity referenced is displayed. As with other displayers, the user may select an entity value with the yellow button on the mouse, causing a displayer to be created on the screen for the selected entity. This selection with the yellow-button provides the basis for browsing. The default editor window also shows the domain and name of the entity being edited at the top of the window. However it differs from the default displayer in that a form is provided in the body of the window for the entity, showing not only those relationships which already reference the entity, but "blank" relationships for relations which could potentially reference an entity of its domain. A "blank" relationship appears as a relationship in a displayer, but the values are given as blank fields that can be filled in by the user by selecting the blank with the red mouse button and typing. If the editor window is on a new entity (one that previously did not exist in the domain), then all the relationships shown will be blank ones. The relationships shown in the main body of the default editor window are followed by a Notes section allowing a paragraph to be attached to an entity, and a Related Entities section allowing a mini-whiteboard of related entities to be attached to an entity. The editor window also provides commands to change the name of an entity or merge it with another one. The default editor provides other convenience features, e.g. to optionally create entities of the appropriate types if they are entered into the form but do not exist. The default querier window has a form similar to an editor window, except that only a domain name is shown at the top of the window. The querier window, unlike an editor window, represents any number of entities in a domain which satisfy a query. 
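A querier can be pictured as a partially filled-in form matched against every entity of a domain. The Python sketch below is a minimal illustration under assumptions of our own: entities are modelled as plain dictionaries of field values, a blank field is `None`, and a "boolean expression" is modelled as a predicate function. The names `matches` and `run_query` are hypothetical and do not come from Squirrel.

```python
def matches(record, form):
    """Return True if `record` satisfies every filled-in field of `form`."""
    for field, constraint in form.items():
        if constraint is None:
            continue  # blank field: places no constraint on the result
        if field not in record:
            return False  # a constrained field must be present to match
        value = record[field]
        if callable(constraint):
            if not constraint(value):  # expression over the field's value
                return False
        elif value != constraint:  # plain value: must match exactly
            return False
    return True


def run_query(entities, form):
    """Return the names of the entities in a domain that satisfy the form."""
    return sorted(name for name, record in entities.items() if matches(record, form))


# Illustrative data only; the field names are invented for this example.
messages = {
    "msg1": {"sender": "Cattell", "year": 1983, "subject": "Cypress"},
    "msg2": {"sender": "Donahue", "year": 1982, "subject": "Walnut"},
    "msg3": {"sender": "Donahue", "year": 1984, "subject": "Squirrel"},
}
form = {"sender": "Donahue", "year": lambda y: y >= 1983, "subject": None}
print(run_query(messages, form))  # prints ['msg3']
```

The result is a list of names rather than full records, mirroring the idea that the user then opens a displayer on whichever entity is of interest.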
The query is specified by filling in the form fields with expressions that specify the desired values of the fields. After the form has been filled in and the user uses the window's Query command, the querier opens a new window on the screen displaying the names of the entities which satisfy the query. The client may scroll through the entities, or select one of them with the yellow mouse button to open a displayer on it. Note that except for the form in which the answer is examined, this querier window is a simple form of the Query-by-Example system [Zloof 1975]. In addition to displayers and editors on ordinary data items, Squirrel implements special displayers and editors for domains and for relations. Thus the user may examine or modify the data types as well as the ordinary data. The displayer for a domain shows the attributes in the data schema which can reference an entity from that domain, the sub-types and super-types of that domain, and all of the entities of the domain sorted by name. The user can scroll or yellow-select the entities. The editor for a domain allows the user to change the sub-types and super-types. The displayer for a relation shows a table whose columns are labelled with the names, types, and uniqueness constraints for the relation's attributes (see Figure 1c). The rows of the table show all of the relationships in the relation. Again, the user may scroll through the relationships, and may yellow-select entity-valued entries to open displayer windows. The editor for a relation shows only the attribute names, types, and uniquenesses, and allows the user to change these. The user may also delete attributes of the relation or create new ones (the relation editor will automatically copy all the relationships into a new relation with the same name, since the underlying database system does not allow changing attributes of a relation after relationships exist). 3. 
Implementation High performance personal computers demand systems that exhibit a high degree of concurrency. Cedar users become annoyed with the synchronous, serial, operations found on earlier systems. The internal integration of a concurrent system is a particularly difficult task. For instance, in Cedar there are several files whose contents establish important properties of the system: the file "User.profile" includes information about the name of the printer to be used when a document is printed and whether Cedar is to go into "idle" mode (putting up a blank screen) after some time without user actions. An obvious problem for the system designers is orchestrating the notification of all of the relevant programs of changes in these files -- one doesn't want to have to reinitialize the system simply to get the user executive or editor to respond to a different set of commands (a change that can be simply made by editing User.profile). One of the advantages of using a database to communicate among applications is that there is now a common repository for changes to be registered, avoiding the problems of having to establish separate communication protocols among a large number of programs (requiring the editor to know, for example, that if the extension on the name of the file to be stored is ".profile" then the user executive needs to be notified). Instead there must be a simple way for applications to tell the database system that they are to be notified when changes are made. Since databases are large and may be shared, this cannot be simply "tell me when any changes are made" but "tell me when changes are made to that portion of the database of interest to my application." Providing this capability to database applications is one purpose of the Squirrel package. Below, we discuss how Squirrel fits into the larger Cedar system and how it is used by applications to provide the necessary level of internal integration. 
Cedar and its Components The following are the major components that a database application can use as a base for development: 1. The Cedar system itself. Cedar includes a text editor, so that an application can deal with text just by setting or reading the contents of a window on the screen. The window manager package provides tools for building fairly complex displays (like those seen in the figures at the end of the paper); a Cedar application can easily get a new command menu put on the screen or make a window in which the user can enter new text. 2. The Cedar database system, Cypress. One important point about Cypress that makes it a particularly good tool for communicating among separate applications is that a Cypress database is self-describing; the database schema may be queried just as any other data. Thus programs which deal with databases do not have to "hardwire" the database types and structure into their code. 3. Squirrel, running on top of Cedar and Cypress. The Squirrel kernel is a package of useful procedures for use by client applications. Squirrel also provides the general-purpose database tools to the end user described in Section 2. In this section, we will discuss the Squirrel kernel. It is worth pointing out using a common text editor with multiple windows provides a (rather weak) way of connecting applications; one can at least select text from one window and stuff it into another window. This can be useful when one application does not want to be too closely connected with another. For example, text announcing a meeting in the electronic mail application could be copied into a description of the event in the electronic calendar application. Squirrel Kernel Functions The external integration of database applications is made possible by their coordination through a central manager for the displayer, editor, and querier "nuts" the applications implement. We call the nut manager Squirrel. 
We describe the internal interface Squirrel provides to application programs here. Applications communicate through this interface, by registering themselves as the application responsible for the displaying or editing or querying of certain domains.

Displaying, Editing and Querying

A basic principle of Squirrel is that each application is responsible for certain domains of entities in the database; when an entity is to be displayed or edited, or when a database query is to be performed on a particular domain, Squirrel passes on the request to any application that has registered itself for the domain. Applications register themselves and call other such procedures through operations in the Nut interface. (Interfaces are the connectors between programs in Cedar; a program may import an interface to use the operations that an implementor of the interface exports.) Applications register themselves to provide the displayers, editors or queriers for a particular domain through the Register operation

    Register: PROC[domain: ROPE,
        display: DisplayProc _ NIL, edit: EditProc _ NIL, query: QueryProc _ NIL,
        create: CreateProc _ NIL, update: UpdateProc _ NIL,
        transaction: TransactionProc _ NIL]

The Register operation takes as its arguments the name of the domain (ROPE is the name for "string" in Cedar) and a list of procedures that are to be called when

1. entities of the domain are to be displayed or edited or when a query is to be performed on the domain (the DisplayProc, EditProc and QueryProc),
2. new windows are to be created for displaying entities from the domain (the CreateProc),
3. the database domain is updated (the UpdateProc), and
4. a transaction open, close, checkpoint, or abort occurs (the TransactionProc).

If any of the display, edit, query or create arguments are left unspecified, then a default procedure is supplied by Squirrel.
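The registration scheme can be modelled as a table keyed by domain name, with defaults for the four kinds of procedure that the text says Squirrel supplies. The Python sketch below is an analogy only, not a translation of the Cedar Nut interface: `NutRegistry`, `lookup`, and the `default_*` functions are hypothetical names, and the procedures simply return strings where a real system would create windows.

```python
from typing import Callable, Dict, Optional


def default_display(entity, parent=None):
    """Stand-in for the default displayer."""
    return f"default displayer: {entity}"


def default_edit(domain, name, viewer=None):
    """Stand-in for the default editor."""
    return f"default editor: {domain} {name}"


def default_query(domain):
    """Stand-in for the default querier."""
    return f"default querier: {domain}"


def default_create(domain):
    """Stand-in for the default window creation procedure."""
    return f"default window: {domain}"


class NutRegistry:
    # Only display, edit, query and create have defaults, following the text;
    # update and transaction procedures are simply absent when not supplied.
    DEFAULTS = {"display": default_display, "edit": default_edit,
                "query": default_query, "create": default_create}

    def __init__(self) -> None:
        self._procs: Dict[str, Dict[str, Callable]] = {}

    def register(self, domain: str, display=None, edit=None, query=None,
                 create=None, update=None, transaction=None) -> None:
        """Record the procedures an application supplies for `domain`."""
        supplied = {"display": display, "edit": edit, "query": query,
                    "create": create, "update": update, "transaction": transaction}
        self._procs[domain] = {k: p for k, p in supplied.items() if p is not None}

    def lookup(self, domain: str, kind: str) -> Optional[Callable]:
        """Return the registered procedure, else the default, else None."""
        registered = self._procs.get(domain, {})
        return registered.get(kind, self.DEFAULTS.get(kind))


registry = NutRegistry()
registry.register("Message", display=lambda e, parent=None: f"mail displayer: {e}")
print(registry.lookup("Message", "display")("msg-17"))  # mail displayer: msg-17
print(registry.lookup("Person", "display")("Cattell"))  # default displayer: Cattell
```

Because callers always go through `lookup`, an application never needs to know whether some other application has registered for a domain, which is the property the registration scheme is meant to provide.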
Also, new registrations supersede old ones, so the operations registered for a domain can change during the execution of an application (this is useful when debugging a new application).

The display and edit operations take as arguments the entity to be displayed or edited, and the window (Viewer) from which the displayer or editor is being spawned.

    DisplayProc: PROC[e: Entity, parent: Viewer]
    EditProc: PROC[d: Domain, eName: ROPE, viewer: Viewer]

The only difference between DisplayProcs and EditProcs is that DisplayProcs take a database entity as an argument, while EditProcs take the name and domain of an entity as arguments; this allows an editor to handle the case where the entity does not yet exist in the database.

The display, edit and query procedures that an application registers are invoked through Display, Edit and Query procedures. These procedures call either a registered procedure or a default one if none has been registered; thus, applications need know nothing of other applications that are running in the same environment. To display an entity, an application simply calls

    Display: PROC[e: Entity, parent: Viewer _ NIL] RETURNS[new: Viewer]

If an application is currently running that has registered a special display procedure for the domain, it will be called; if not, then the default displayer will be used. An application's display procedure creates a new window on the screen displaying information about the entity e. The application may choose to put the new window in a particular place on the screen, or replace an existing window with the new one. To avoid a proliferation of viewers as a user browses, and to reduce confusion, we have established a convention for when an entity e is opened by the user from the display in window parent:

1. If an entity has been spawned (selected to open) from parent before, and the window thus created is still on the screen, we re-use the latter window for the new entity display.
2. Otherwise, we create a new entity displayer window on the screen for the selected entity.

These rules result in a hierarchical stack of entities as the user browses, which is convenient because it is easy in Cedar to flush old entries on the stack. We may experiment with other conventions, however.

Updating the database

To present a consistent view of the database to a user, an application may need to change its display to reflect a new state of the data; of course, when several applications share a database segment, an individual application cannot know when such changes of state have occurred. Even within a single application, there may be a need to communicate updates between procedures handling each type of window. We introduce an Update procedure to handle this function. Not all changes to the database require cross-application notification, so an application must call Update to announce a change in the database that is of possible importance.

    Update: PROC[updateType: UpdateType, tuple: Relship]

where UpdateType: TYPE = {create, destroy} and tuple is a relationship of the database. Everyone who has registered an update procedure for a domain referenced by the tuple will be notified of the change, to allow them to update their current display window (or anything else they may wish to do). The standard Cedar viewers package provides facilities for an application to retrieve handles for its display windows, e.g. to update them.

Starting/Stopping Applications

Another time when applications need to communicate with each other is when a transaction is aborted, opened, or closed. For example, some applications fork processes that continue to run waiting for the database transaction to be re-opened, e.g. when Cedar is idle. The notify procedure supplied when an application registers itself is called when such an event is about to occur.
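The Update notification described earlier in this section can be pictured with a short sketch. This is a hedged Python illustration rather than Cedar code: `Relship` is reduced to a relation name plus (domain, name) pairs, and `register_update`, `update`, and the dictionary of listeners are hypothetical names. Allowing several listeners per domain is an assumption of the sketch, as is calling a procedure only once even when it is registered for more than one of the referenced domains.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Tuple


class UpdateType(Enum):
    CREATE = "create"
    DESTROY = "destroy"


@dataclass(frozen=True)
class Relship:
    """Hypothetical relationship tuple: a relation name and the entities it references."""
    relation: str
    entities: Tuple[Tuple[str, str], ...]  # (domain, entity name) pairs


UpdateProc = Callable[[UpdateType, Relship], None]

# Update procedures keyed by the domain they were registered for.
_update_procs: Dict[str, List[UpdateProc]] = {}


def register_update(domain: str, proc: UpdateProc) -> None:
    """Ask to be told about announced changes that reference this domain."""
    _update_procs.setdefault(domain, []).append(proc)


def update(update_type: UpdateType, tup: Relship) -> int:
    """Announce a change; notify each interested procedure once. Returns the count."""
    to_call: List[UpdateProc] = []
    for domain, _name in tup.entities:
        for proc in _update_procs.get(domain, []):
            if proc not in to_call:
                to_call.append(proc)
    for proc in to_call:
        proc(update_type, tup)
    return len(to_call)


# Usage: two displayers learn that a message was removed from a message set.
log = []
register_update("Message", lambda t, r: log.append(("mail", t, r.relation)))
register_update("MessageSet", lambda t, r: log.append(("sets", t, r.relation)))
member = Relship("member", (("Message", "m1"), ("MessageSet", "Inbox")))
notified = update(UpdateType.DESTROY, member)
```

The application that makes the change decides whether to call `update` at all, which mirrors the point that not every database change needs cross-application notification.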
    TransactionProc: TYPE = PROC[segment: Segment, type: TransactionOp];
    TransactionOp: TYPE = {open, close, abort}

Transaction notification procedures are useful, for example, in allowing an application to rebuild caches or displays after an open or abort. The transaction notify procedure was originally needed largely because our database system does not yet implement multiple transactions for one user on the same database segment (only for shared segments). However, it is sometimes desirable for multiple applications to cooperate in the use of a transaction even when multiple transactions are available, because a set of updates represents a single logical user action; thus our notification procedure may remain even after our database system's limitation has been remedied.

We now discuss some of the integration problems in particular applications, and their solutions using the Squirrel kernel.

Walnut

Walnut is responsible for entities in two database domains, messages and message sets. Although Walnut is relatively self-contained (the operations it provides involve only messages and message sets), there are several aspects of Walnut that demand cooperation with other database applications; we will focus on only two of the most interesting of these.

Because storing long strings in the database is expensive (and because mail users would be quite unhappy about lost messages), Walnut keeps a log of message bodies read and Walnut commands performed. In fact, the Walnut database segment can be completely reconstructed from the log. Walnut also stores information about the log in the database, so that an incremental recovery can be performed when a crash occurs. Since not all applications require the same "fail-safe" reliability as Walnut (less data is entered between commits in the calendar system, for example, so a crash is less serious), it is unreasonable to provide a general logging facility that everyone must use. The performance overhead would be too high.
However, for applications to build their own logging mechanism, it must be guaranteed that they will be notified of every transaction open and close so they can ensure consistency between their database segment and their log.

One of the problems of integrating database applications is deciding who is responsible for getting things started and stopped. In our scheme, any procedure can open or close a transaction and know that other components will get a chance to clean up.

Whiteboards

Whiteboards are almost entirely a study in application integration. They provide integration not only between database applications, but with the entire Cedar system: any Cedar icon can be placed on a whiteboard. Since Whiteboards may be applied across application boundaries, they have been incorporated as part of Squirrel.

Whiteboards work as follows. Any window in Cedar can be made iconic; the icon used can be a fixed bit pattern (a 64-bit square) or can be produced by a program (for example, one of the icons on most displays is a clock that is updated each minute). The icons, which appear in one or more rows at the bottom of the display, can be selected; once selected, a small set of operations can be performed, e.g. deleting, moving, or opening to full window size. The menu on whiteboard windows includes the command "AddSelected," which takes the currently selected icon and adds it to the whiteboard display (and changes the database to reflect the addition).

The Whiteboard package is an interface between the world of Cedar windows and the world of database entities. In fact, the Whiteboards are also allowed to contain Cedar objects like text files and the control windows of running programs (like Walnut) that are not usually entities in a database.
Squirrel automatically creates database entities to represent text files or programs when an attempt is made to add one to a whiteboard; when one of the entities is "opened", the corresponding file is displayed or the program is loaded if it is not already running. Squirrel provides a RegisterTool procedure by which an application program can put into the database the necessary instructions for loading a program when the corresponding entity is selected:

    RegisterTool: PROCEDURE[name, impl];

where name is the name of the window (names are displayed at the top of a window in a "herald line"), and impl is the name of the program file implementing the tool (what has to be loaded if the user tries to run the program from a whiteboard). Thus, one can store in the database any Cedar entity that can be displayed on the screen by any program. A very important use of this is to build whiteboards that contain pointers to frequently used files, so that one does not have to perform long searches for "something that contains *Mumble*.bcd".

Hickory

Hickory, the calendar system, manipulates a number of types of entities: Days, Months, Years, Events and EventSets. Automatic update is a particularly important property for Hickory; people expect to always see up-to-date versions of their calendars. This means that editing an Event (adding it to the database or changing its time) may cause many other windows on the screen to change.

It is possible using Cedar to allow the module that performs the redisplay of calendars on update to directly access the code that causes calendars to be displayed. However, using the Squirrel Display and Edit procedures, this is not necessary; an application can also reference other parts of itself using Nut.Display and Nut.Edit. And there are good reasons for doing this.
If all calendars are displayed through calls to Nut.Display, then we avoid the problem that any change (even a cosmetic one) in the code that produces the calendar displays can force recompilation of large pieces of the system. Using the Nut interface thus makes it possible to treat a large application as itself being composed of independent parts that are integrated using the Squirrel tools.

The other main problem that Hickory faced was that it is not at all self-contained; users want to be able to associate messages (and other database entities) with their calendar entries and with the calendars themselves. Thus, we adopted the same user interface as Whiteboards for putting a database entity on an event display (or on a calendar).

Grapenut

Grapenut, our database application dealing with people, phone numbers, addresses, and organizations, was relatively simple to implement by leaning on existing facilities in Squirrel. The default displayers and editors, for example, are adequate for displaying people and organizations in the Grapenut database segments. A special displayer for separate phone lists seemed useful, however, since these can be displayed in a compact multiple-columnar form.

The most distinctive feature of the Grapenut application is its use of layered databases. A public whitepages database was built containing all Xerox Palo Alto employees. A typical user will also have his own private databases of phone numbers and other information about people he knows. There may be some overlap between this private database and the Xerox database. We may construct other databases of phone numbers, for example of computer scientists we all frequently contact. Clearly, we would like a convenient way to query all of these separate database segments as if they were a single database.
We implemented a simple extension as a layer on top of Cypress, in which query operations return entities and relationships in all segments rather than in a single segment (the normal mode of operation, which was quite adequate when each application had its own single segment). One of the open segments is specially distinguished as the segment in which database update operations are made. Normally this will be the user's private segment. Thus we provide a many-segment read, single-segment write capability (see [Cattell 1983] for a more detailed discussion of this idea and the issues involved). All the database segments appear almost as a single database to the user for most operations, but not quite: the user would sometimes like to know which information came from where, so we use different windows for the segments.

Although Grapenut does not implement special displayers or editors, it provides commands in the Cedar executive to search the database in a convenient way. When the user types a name to be searched for, Grapenut:

1. Looks for a person whose name matches the name, or some prefix of it, in the private or public databases.
2. Failing that, it looks for a person whose log-name (a single user identifier used for electronic mail, and all authentication purposes) matches the name. Again, both the private and public databases are searched in that order.
3. Failing that, it looks for people whose names sound similar to the one the user typed. The soundex code suggested by [Knuth 1969] is used, and soundex codes have been precomputed for the first and last names in a database relation.

4. Observations and Conclusions

We have discussed the concepts of internal and external integration in a family of database applications implemented on a powerful personal computer. To date our databases contain electronic messages, people, phone numbers, calendars, pictures, spatial data layouts, and miscellaneous less structured data.
We expect to learn more about the integration of applications as our experience continues; however, we can already draw some conclusions from our implementation.

The internal integration provided by communicating through the database system and a centralized application manager has not only simplified the task of building applications, but naturally makes them more externally integrated. Few lines of code are directly concerned with handling the internal integration, but the commonality has a large impact. The common code has been redesigned more than once as we find better ways to integrate the parts.

We allow applications a fair amount of freedom in the format of their display and interaction. However, a few common primitives in the interface have had a large impact on external integration. Just as the command "delete" could be unified across a variety of applications using a conventional display, a common "delete" mouse button can provide the same function across a variety of applications in a window-based environment. The user thus has less to learn when dealing with a new application, and is less apt to be confused by different conventions for what is logically the same operation viewed at a higher level. The authors conjecture that productivity is increased by the ability to map operations onto actions at a level the user can exercise without thinking about it.

Sharing a database system provides other benefits. General-purpose tools can be used to browse, query, or edit a database. An application need only provide specialized tools for these functions where there is a significant benefit from doing so. Alternatively, the general tools may serve as temporary stopgaps during the development of new applications.

Note that we have generally encoded knowledge of the semantics of an application's data schema in application programs, providing a powerful user interface as a result.
For example, a telephone directory application may understand the syntax of 7-digit USA phone numbers and area codes, and provide automatic defaults and error checking. To the extent that knowledge may also be encoded in the database itself, however, it becomes possible to provide a powerful interface without ad hoc code for each application. This is not always the most practical solution, but is quite useful when data from a number of applications have analogous semantics. The utility of our general-purpose tools was enhanced by the fact that our database system provides a higher-level data model, encoding more semantics of the data than most database systems.

We found the general-purpose Squirrel facilities quite adequate by themselves for some kinds of database applications, such as personal databases, addresses, or wine tastings. The default Squirrel displayers and editors have also proven useful to database application programmers, to examine and modify a database in the process of testing and debugging application programs.

Our research was greatly simplified by the use of high-performance personal computers, allowing us to concentrate primarily on functionality rather than speed or space in our experiments. The primitive operations, such as displaying a message in a message-set or deleting an entry on a whiteboard, appear essentially instantaneous on our machines despite the fact that flexible general-purpose graphics and database packages are invoked. We believe, however, that except for inherently expensive operations such as complex queries, our simple interface could be implemented with quite adequate performance on any personal computer with a bitmap display.

There are several areas where we see the need for future work, either by ourselves or others:

1. We continue to find better or more complete ways to externally integrate applications, as we note analogies between their functions.
For example, many applications deal with sets of objects, and more uniform conventions for dealing with sets are desirable in our system. The spatial organization and manipulation of objects also deserves further experimentation.

2. Although we have learned a fair amount from our current few applications, there are many other kinds of data a user could manipulate in a database. We anticipate that other applications will uncover new problems and possibilities, and will also further increase the payoff of integration, simply through the scope of the commonality.

3. Our database system does not yet provide multiple atomic transactions on the same machine for the same database segment, although data may be shared across multiple machines. In order to properly interface applications sharing data on a single machine, we plan to enhance the functionality of the database system.

4. Our database system does not yet provide a general query processor, instead requiring that the bulk of query processing be written in the application code. The demand for such a query processor will expand as the desire for more complex queries increases.

5. References

Cattell, R. G. G. "Design and Implementation of a Relationship-Entity-Datum Data Model," Report CSL-83-4, Xerox Palo Alto Research Center, Palo Alto, CA 94304, April 1983.

Chen, P. P. S. "The Entity-Relationship Model: Towards a Unified View of Data," ACM Transactions on Database Systems 1, 1 (March 1976).

Codd, E. F. "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM 13, 6 (June 1970).

Shoens, K., Rowe, L. "A Form Application Development System," Proceedings ACM SIGMOD Conference, Orlando, 1982.

Swinehart, D., Stewart, L., and Ornstein, S. "Adding Voice to an Office Computer Network," Proceedings IEEE Communications Society GlobeCom Conference, 1983.

Zloof, M. "Query By Example," First International Conference on Very Large Databases, 1975.

Figures

Figure 1.
Each application implements displayer windows for one or more types of database entities. Note that the type (domain) and name of the entity displayed is shown in the black bar at the top of each window, followed by a menu area in which commands appropriate to the particular type of entity are shown, followed by a display of information about the entity (tailored to the type).

Figure 1a. A message displayer, implemented by the electronic mail application.

Figure 1b. A picture displayer, illustrating graphic flexibility of the window display.

Figure 1c. A relation displayer. Shows the attributes of the relation as columns, and the relationships in the relation as rows. Relations and domains are entities, and can be examined just as any other data.

Figure 1d. A "whiteboard" displayer, providing a mechanism to spatially organize and browse through entities. Note that icons appropriate to the entity type (place, person, message set) are shown.

Figure 1e. A "default" displayer, invoked when no application has registered itself to deal with entities of the type (the Organization domain, in this case).

Figure 2. Independent of domain, three types of windows may be implemented by an application: displayer, editor, and querier windows. Here we illustrate the three kinds of windows for the message domain.

Figure 2a. A message displayer, allowing message examination and browsing.

Figure 2b. A message editor, allowing creation of new messages to be sent or filed.

Figure 2c. A message querier. The user fills in properties to search the database for matching messages.

Figure 3. The user may browse through successive displayers by using the "mouse" pointing device to indicate entities he would like to see. In this succession of frames (Figures 3a through 3d) he has browsed from a whiteboard to a message-set it contains, to a message in that set, to the sender of the message.
Each of these displayers is implemented by a different application, except for Figure 3d, which is an instance of the default displayer.

Figure 3a. A whiteboard displayer, showing a number of entities in iconic form.

Figure 3b. A message set displayer, selected from the whiteboard in Figure 3a.

Figure 3c. A message the user selected from the message set in Figure 3b.

Figure 3d. The person who sent the message, selected from the message in Figure 3c.

Figure 4. The user may use various combinations of the mouse buttons and keys to invoke the same operations, regardless of the database application involved. The yellow button was used for browsing in Figure 3; e.g. the message in Figure 4b could be obtained from the message set in Figure 4a in this way. Note that the message selected in the message set is "grayed" to make it stand out. More than one message may be grayed using the blue button to extend the selection to other messages. Messages may be deleted from the set by pointing and using the control key and red button. If a message is deleted directly by the DeleteMsg command in the message displayer, all displayers on the screen are automatically updated. In this case, the message (Figure 4b) would then disappear from the message set in Figure 4a.

Figure 4a. A message set displaying one line per message.

Figure 4b. A message in the message set of Figure 4a.

Figure 5. A whiteboard displayer with the "ShowLines" feature turned on; lines are displayed for relationships between entities on the whiteboard (very few of our relations have more than two entity-valued attributes, as they are normalized for simplification). When the user selects an entity with control-yellow, the system places related entities on the whiteboard, with a heuristic algorithm for their placement. The user can thus browse in graphical space.

Figure 6. The Squirrel control window.
This allows a user or programmer to invoke application-independent database operations, such as aborting a database transaction or dumping all or part of the database to a text file.