[Indigo]<Yggdrasil>OCMDocBaseCom>DocBaseComReport.tioga!1

September 17, 1987


1 Did not comment on the final report since they were away on travel

Table of contents

§1 Executive Summary

§2 Introduction

§3 Summary of Needs and Options

§4 Conclusion

§1 Executive Summary

PARC's database and document storage needs require different strategies for the short, medium, and long term (less than one year, one to three years, and more than three years respectively). It is in the medium term that we see the greatest need and point of leverage.

In the short term, PARC should improve the capacity, reliability, interoperability, maintainability, and performance of our current systems. These needs are being studied by groups in each of the laboratories; they will recommend development of software and purchase of appropriate hardware to extend our current servers. While PARC will get by with its current systems, none of them will evolve into a viable medium term solution.

If PARC is to effectively address Xerox's corporate document strategy over the medium term, we believe that PARC needs a very large storage base that directly supports storage and manipulation of documents as hypertext. Because no available commercial or experimental system meets these needs, we propose initiating a new project, a Hypertext Service, to build a system available throughout PARC. This requires:

the purchase of an optical disk jukebox for massive storage ($120K-$200K)

possibly the purchase of index server software, an RDBMS system, or file server software ($0-$50K)

the purchase of computational and communications platforms ($40K for magnetic disks, plus the cost of Misc boards, server computers, and additional communications equipment)

building a group of 5 to 7 qualified staff to implement the service. This service should be delivered within the next 2-3 years.

We suggest investing a modest effort (with a strict time/effort limit) in exploring possible commercial and academic partners for this project.

In the long term, we must go beyond passive storage of document structures, and provide in our document servers such extensions as persistent processes, integration with foreign databases, and an extensible, distributed execution model of objects shared by groups. While the Object Service project in SCL is exploring many of these issues, it is intended as an exploratory platform and not a stable storage component in the medium term. Hence, that project should not be burdened with the additional requirement of satisfying our medium term storage needs. We expect that the ideas developed in the Hypertext Service and Object Service projects will help us better understand how to evolve our document base.

§2 Introduction

This report is the output of a committee formed by the OCM at PARC. It articulates PARC's needs for a document storage base and a strategy for meeting them. This section states the assumptions for our analysis.

• PARC's document storage needs are unique at this time

While the office computer is only now being viewed as more than a word processor, we have been building our understanding of computer aided work for over a decade. Successful commercial tools tend toward one of two extremes: highly interactive, small scale, single user tools with no real notion of structured data or sharing of information (e.g. Hypercard); or large DBMS systems with little flexibility and poor interactivity. PARC's needs are between these extremes, and we think that any system that satisfies our needs would be novel and interesting. Novel, because there are no systems that would satisfy them, either as a foreseeable commercial product or as the result of any current research project. Interesting, because our current needs will match the typical office systems needs in a few years.

• We need more storage than is currently available at PARC.

Electronic storage at PARC is insufficient in size and kind for existing applications. In addition to normal growth, new projects involving voice and video will cause a step increase in the rate of this growth. For example, there is one image file currently being used at PARC that is 160 megabytes.

• We need a storage hierarchy

We need an effective storage hierarchy that supports immediate access to large amounts of data, and secondary long term storage that allows relatively rapid access to a much larger store. Existing file servers at PARC (such as Alpine and IFS) can neither support this themselves nor act as the underpinnings of a document storage base. They do not provide the requisite performance, capacity, or reliability. We need to investigate mechanisms for using both write-once and read/write technology. We need to look at current commercial systems such as FileNet and MegaDoc; these manage a disk layer on top of an optical disk jukebox with a total capacity of up to 500 gigabytes. (This is about ten times the total distributed capacity of all the computers in PARC.)

• We need a common storage base used by all of PARC.

The storage base should serve as a common storage base and communication medium for all of PARC. The model(s) supported and the access available should serve the medium term needs of all of the laboratories. If the system is successful, an organization outside of PARC should support replication of it for other Xerox organizations.

• We must support a hypertext data model

While traditional file servers may be the best solution for storing collections of unstructured documents, the documents we envision have structure and form. To exploit this structure, a document storage base should understand something about the information it holds, i.e. it should present to the client an appropriate model. A hypertext model seems to meet the currently understood needs of all the laboratories.

• Our notions of a document will evolve over the next 10 years.

Our document storage plans must account for "documents of the future". The Hypertext Service and the Object Service projects will both provide the basis for future models and systems.

§3 Summary of Needs and Options

§3.1 Short term (< 1 year)

§3.1.1 The current situation

Most researchers still rely on IFS to store their data, but there are a number of other systems in use. SCL has purchased a commercial database system (Oracle); there are a growing number of SUN workstations that implement their own distributed file system; ISL is expanding its use of Xerox product file servers, and has acquired a number of Symbolics workstations that also implement their own distributed file system; CSL has built a prototype database server (Alpine).

The approximate capacity of these server storage systems is presented in the following table:

System type                 Capacity (gigabytes)
IFS                         10.5
Product file servers         3.9
SUN NFS                      4.8
Alpine                       3.0
Others (Symbolics, etc.)     0.7
Total                       22.9

Additional magnetic disk storage at PARC is about 65 gigabytes (36 gigabytes on private Dorados, 13 gigabytes on private Daybreaks, 14 gigabytes on private Dandelions, and 2 gigabytes on servers that are not file servers). In addition, there are about 34.5 gigabytes of archived data.
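The capacity figures above can be cross-checked with a few lines of arithmetic. The following is only a sketch; the numbers are copied from this section, and the variable names are ours.

```python
# Capacity figures (gigabytes) as reported in this section.
server_storage_gb = {
    "IFS": 10.5,
    "Product file servers": 3.9,
    "SUN NFS": 4.8,
    "Alpine": 3.0,
    "Others (Symbolics, etc.)": 0.7,
}
private_storage_gb = {
    "Private Dorados": 36,
    "Private Daybreaks": 13,
    "Private Dandelions": 14,
    "Servers that are not file servers": 2,
}
archived_gb = 34.5

# Round to one decimal place to avoid floating-point noise in the sums.
server_total = round(sum(server_storage_gb.values()), 1)
private_total = sum(private_storage_gb.values())
online_total = round(server_total + private_total, 1)

print("server storage:", server_total, "GB")
print("other magnetic disk storage:", private_total, "GB")
print("total on-line magnetic storage:", online_total, "GB")
print("archived data:", archived_gb, "GB")
```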

There are a number of problems with the current situation:

• There is not enough storage space.

It is difficult to expand the storage space on an IFS, or to build a new IFS. XNS file servers are limited in the number of disks they can hold, and in the size of those disks. Alpine is still dependent on IFS for backup. SUNs and Symbolics are easily expanded, but cannot be used by most systems at PARC (see below).

• File storage services are fragmented, making file sharing difficult.

For example, our existing D-machine based software environments do not currently understand the protocol used by the SUN workstations, and so cannot use them as file servers. IFS, Alpine, and product file servers do not run any widely accepted protocol. Non-Xerox hardware (Symbolics 36xxs, VAXs, SUNs, MAC-IIs, PC-2s...) needs large amounts of software before it can use these file servers, and there are no outside sources for this software.

• Commercial database systems do not act as file servers, and so cannot be used for our filing needs.

In addition, few of our systems currently have software for communicating with a database.

• IFS is based on the Alto and is becoming unmaintainable.

A recent attempt to add a disk drive to an IFS resulted in the IFS being down for the better part of a week. IFS is sometimes very slow. There have been occasions when the time to access a single file page exceeded 60 seconds.

§3.1.2 Short term needs

• Capacity

The file servers and databases at PARC currently have a capacity of about 20 gigabytes. There will be a significant increase in our demands on mass storage for digitized audio, scanned images, and video. An additional 10 gigabytes of mass storage should suffice for the near term (we hope).

• Interoperability

Our file servers need to speak a widely understood protocol, so that new machines can be added to the network with a minimum of expense. Code needs to be written for existing systems to enable them to act as clients under this protocol.

• Databases

We have some database capability from a networked Oracle system in SCL. In CSL, the Cypress database system is used primarily for mail storage. Because there is no simple network protocol for reaching these databases from the D-machines, there is no experience in using them outside of their original context. Although we could ignore this problem, it might be useful to try to make these services available on a wider basis.

• Reliability and maintainability

We cannot tolerate servers that are unreliable or difficult to maintain. File servers that depend on non-standard hardware should be avoided. Recovery from a crash on the file server should be rapid and graceful.

• Performance

Of course, we want the maximum performance possible. The SUN file system has a raw transfer speed of up to 250 kilobytes per second (read) and 60 kilobytes per second (write). We should not settle for performance significantly worse than this.

§3.1.3 Short term options

The following options are being considered by groups in each of the laboratories.

• Buy more disks

Buying more disk space to increase the storage capacity of our existing IFSs is only a partial solution. It does not address the issues of reliability, interoperability, or performance. Similarly, expanding the capacity of our product file servers leaves a number of problems unsolved.

• Buy commercial relational DB systems

In fact we do have a small networked Oracle system in SCL. There are many problems concerning publicity, protocols, performance, user education, size, system administration, and so forth. In addition, these systems do not serve well to store files.

• Fix Alpine

Alpine lacks a real backup system and other administrative software, and it must run on Trident disks. By expending about a man-year, we could fix all of these problems. However, we don't have the people free and don't believe it is a good investment of time to pursue this.

• Extend local software to use NFS

The SUN Network File System (NFS) is an integral part of the SUN operating system. It is a moderately high performance implementation, and the protocol is understood by a large number of machines. Hardware for the SUN is (relatively) cheap and easily available. Finally, PARC already has a fair number of SUNs. It has the drawbacks that it does not support file versions, authentication, or file locking (although there is a service that can be used to provide file locking).

• Buy a Symbolics file server

Symbolics machines implement a proven, robust, remote file system that performs well. They support file locking, file versions, and a number of kinds of hardware. Symbolics machines already understand a number of "standard" protocols. However, the prospects for continued health of the company are not good, and this may argue against this option.

• Look at Apollo or VAX systems

Apollo Domain workstations also implement a distributed file system as part of their operating system. They implement a compaction procedure that greatly increases the effective storage capacity of the system. We did not explore this in any detail. A similar alternative is to explore the use of a VAX (NFS) file server. The principal argument against these options is the proliferation of the types of hardware that must be supported in the building.

This committee is leaving the choice among these options to the groups in the PARC laboratories, since the immediate needs of these groups are somewhat different.

§3.2 Medium term (1-3 years)

§3.2.1 Why are our needs different from the rest of the world?

In the normal DBMSs used in business, most of the data is fixed format. This makes it easier to process and store. Documents used and stored in computers are often preformatted and indexed in standard ways. Much of the success that computers have had in business (and, in fact, much of the success of standard business practices) rests upon this practice.

We believe that we need to store information in far more flexible formats than those used in standard systems. We believe a document storage system should support the following:

one document can be linked to many others;

there can be a diversity of form, context, and content;

document history and evolution are recorded;

widely varying sizes can be handled;

large numbers of documents can be easily manipulated.

These needs have been noted individually by many others. Building a system that combines these features requires both research and development that we want to do. We have developed these ideas in terms of the needs below.

§3.2.2 Medium term needs

• More natural data model for documents

To store and manipulate electronic documents today, one has to deal with file systems and/or databases with very simple data models. There are restrictions on the size of fields, types of interconnections, enforcement of naming, system cost and performance, and the size of the information space.

• We need multimedia storage

Multiple types of objects must be stored. These should include plain text, structured text, scanned images, mail messages, digitized voice, digitized video, bitmaps, and uninterpreted byte arrays. If the system distinguishes types, it can provide special purpose compression and data search algorithms.

• Data links

We need to manipulate links between different data elements. This is often called the hypertext model. Supporting links between text, pictures, and voice makes this more like "hypermedia". With this data model in the underlying system, it can provide more efficient storage and retrieval, and will match users' models of stored information. This should allow support of current projects such as Notecards and Colab. It is also compatible with ideas developed for the Object Service.
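As an illustration only, the link model described above might be sketched as follows. This is not the data model developed by CSL and ISL; the class names, fields, and link types are hypothetical and chosen just to show typed links between elements of different media.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A stored data element. The media_type values are illustrative."""
    node_id: str
    media_type: str          # e.g. "text", "image", "voice"
    content: bytes = b""


@dataclass
class Link:
    """A typed, directed link between two stored elements."""
    source: str
    target: str
    link_type: str = "reference"


class HypertextStore:
    """Hypothetical in-memory store that holds nodes and the links among them."""

    def __init__(self):
        self.nodes = {}
        self.links = []

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_link(self, source, target, link_type="reference"):
        # Refuse dangling links: both ends must already be stored.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of a link must be stored nodes")
        self.links.append(Link(source, target, link_type))

    def links_from(self, node_id):
        return [link for link in self.links if link.source == node_id]

    def links_to(self, node_id):
        return [link for link in self.links if link.target == node_id]


store = HypertextStore()
store.add_node(Node("memo", "text", b"Draft proposal"))
store.add_node(Node("figure", "image"))
store.add_node(Node("comment", "voice"))
store.add_link("memo", "figure", "illustration")
store.add_link("comment", "memo", "annotation")
print([link.target for link in store.links_from("memo")])
```

Because the store itself knows about links, a client can ask which elements refer to a given element without scanning document contents.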

• Versions and alternatives

Documents by their very nature evolve. As they evolve, new versions of them are created. In addition, alternatives to a document may be created. Version and alternative trees must be supported in the document base system.
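One way to picture a version and alternative tree is the minimal sketch below. It assumes that a new version is a child of the version it was derived from, and that alternatives are siblings derived from the same parent; this interpretation and all names in it are ours, not a committed design.

```python
class VersionNode:
    """One version of a document; children are later versions or alternatives."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def history(self):
        """Labels from the original version down to this one."""
        path = []
        node = self
        while node is not None:
            path.append(node.label)
            node = node.parent
        return list(reversed(path))

    def alternatives(self):
        """Labels of sibling versions derived from the same parent."""
        if self.parent is None:
            return []
        return [c.label for c in self.parent.children if c is not self]


v1 = VersionNode("v1")
v2 = VersionNode("v2", parent=v1)
v2_alt = VersionNode("v2-alt", parent=v1)   # an alternative to v2
v3 = VersionNode("v3", parent=v2)

print(v3.history())
print(v2.alternatives())
```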

• Naming, attributes, and indexing

We want to support multiple access styles in the document base. By allowing users to give documents names, we can support standard file server needs. By allowing attributes and keywords on documents, with appropriate indices, we can support simple rapid access by document content or history.
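The two access styles can be sketched together: lookup by name, as a file server would provide, and lookup by indexed attributes. The sketch below is illustrative only; the class, method, and attribute names are assumptions.

```python
from collections import defaultdict


class DocumentIndex:
    """Hypothetical index supporting access by name and by attribute."""

    def __init__(self):
        self.by_name = {}
        self.by_attribute = defaultdict(set)   # (key, value) -> set of doc ids

    def add(self, doc_id, name, attributes):
        self.by_name[name] = doc_id
        for key, value in attributes.items():
            self.by_attribute[(key, value)].add(doc_id)

    def lookup_name(self, name):
        """File-server style access: name to document."""
        return self.by_name.get(name)

    def find(self, **criteria):
        """Documents that match every given attribute value."""
        matches = [self.by_attribute.get(item, set()) for item in criteria.items()]
        return set.intersection(*matches) if matches else set()


index = DocumentIndex()
index.add(1, "report.tioga", {"author": "hagmann", "keyword": "storage"})
index.add(2, "notes.tioga", {"author": "kent", "keyword": "storage"})

print(index.lookup_name("report.tioga"))
print(sorted(index.find(keyword="storage")))
print(sorted(index.find(keyword="storage", author="kent")))
```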

• Multi-user concurrent access

Documents must be sharable. Several clients must be able to access the same document simultaneously.

• Distributed, remote servers

Documents will be stored on servers found on the internet. Servers must be able to exchange data, support different protocols for users, and share data. The server must support transactions. Locking at the granularity of at least a document is required. Very large objects require page-level access (or access to document fragments).

• Alerters and triggers

Alerters are a simple form of triggers. Some class of events triggers the sending of a "message" to clients. Events can include insert, delete, read, lock, or update of some class of objects. Alerters provide a "hook" for other systems (e.g., Notecards or Colab). Alerters can be used to assist cache and screen maintenance on workstations, and to trigger some service at document creation (e.g., document recognition).
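A minimal sketch of an alerter follows, assuming clients register a callback for an (event, object class) pair and the server calls it when a matching event occurs. The registry class and its method names are hypothetical; a real service would send the message over the network rather than call a local function.

```python
from collections import defaultdict


class AlerterRegistry:
    """Hypothetical registry mapping (event, object class) to client callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event, object_class, callback):
        self._subscribers[(event, object_class)].append(callback)

    def notify(self, event, object_class, doc_id):
        """Send the 'message' to every client registered for this event."""
        for callback in self._subscribers.get((event, object_class), []):
            callback(event, doc_id)


# Example client: a workstation cache that drops entries when they are updated.
cache = {"doc-1": "old contents", "doc-2": "other contents"}


def invalidate(event, doc_id):
    cache.pop(doc_id, None)


registry = AlerterRegistry()
registry.subscribe("update", "note", invalidate)

registry.notify("update", "note", "doc-1")   # matches: cache entry dropped
registry.notify("read", "note", "doc-2")     # no subscriber: nothing happens
print(sorted(cache))
```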

• Large capacity

The requirement for digital on-line storage in this time frame is very large. We want to store very large files (e.g. images, video sequences, etc.). So as not to be limited in the next few years, we should probably plan on increasing our available storage by at least a factor of 10. This requires between 500 and 1000 gigabytes.

A storage hierarchy is required to get acceptable performance with reasonable cost. Not all of the documents must be "immediately" accessible. Delays of seconds are acceptable for infrequently referenced or archived documents. Archival storage is required to keep all relevant versions of objects. Optical media are an appropriate bottom end.
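The idea of a storage hierarchy can be sketched as a small fast tier in front of a large slow tier. The sketch assumes a least-recently-used replacement policy for the fast tier, which is our choice for illustration and not something this report specifies; the class and method names are likewise hypothetical.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical two-level store: fast tier (disk) over slow tier (jukebox)."""

    def __init__(self, fast_capacity):
        self.fast = OrderedDict()    # limited size, least recently used first
        self.slow = {}               # holds every document
        self.fast_capacity = fast_capacity

    def put(self, doc_id, data):
        self.slow[doc_id] = data     # the archival copy is always kept
        self._cache(doc_id, data)

    def get(self, doc_id):
        """Return (data, tier), where tier says which level served the request."""
        if doc_id in self.fast:
            self.fast.move_to_end(doc_id)
            return self.fast[doc_id], "fast"
        data = self.slow[doc_id]     # slower path: fetch from the bottom end
        self._cache(doc_id, data)
        return data, "slow"

    def _cache(self, doc_id, data):
        self.fast[doc_id] = data
        self.fast.move_to_end(doc_id)
        while len(self.fast) > self.fast_capacity:
            self.fast.popitem(last=False)   # evict the least recently used


store = TieredStore(fast_capacity=2)
for name in ("a", "b", "c"):
    store.put(name, name.encode())

print(store.get("a")[1])   # "a" was evicted, so it comes from the slow tier
print(store.get("a")[1])   # now cached in the fast tier
```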

• Data compression

Data compression is a useful adjunct for the system. For many classes of document (e.g., scanned images), data compression radically changes the amount of storage available for a fixed cost (an order of magnitude or more).
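The effect of compression on image-like data can be demonstrated with a small sketch. It uses Python's general-purpose zlib module as a stand-in, not a compression scheme designed for scanned documents, and the "page" is synthetic data (mostly white with a few dark runs) that we made up for illustration; real scanned images will compress differently.

```python
import zlib


def synthetic_scanned_page(rows=1000, row_bytes=100):
    """Build a mostly blank page with a short dark run on every 50th row."""
    page = bytearray(rows * row_bytes)          # all zero bytes: white
    for row in range(0, rows, 50):
        start = row * row_bytes + 10
        page[start:start + 50] = b"\xff" * 50   # a dark horizontal run
    return bytes(page)


def compression_ratio(data):
    """Original size divided by compressed size."""
    return len(data) / len(zlib.compress(data))


page = synthetic_scanned_page()
compressed = zlib.compress(page)

# Compression must be lossless for stored documents: check the round trip.
assert zlib.decompress(compressed) == page

print("original bytes:", len(page))
print("compressed bytes:", len(compressed))
print("ratio: %.1f" % compression_ratio(page))
```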

• High performance

While we are unable to give exact numbers, we must have a system that allows easy experimentation with information retrieval strategies. In the same way that Dorados were higher performance personal machines than anyone else had available at the time they came on board, the document base server must initially overwhelm the needs of the problem.

• Building a Service

To support a service, the system must be robust, available, and replicable. It must provide administrative functions such as debugging, backup, historical logging, and monitoring. Security and access control must be provided. No server should be isolated. It must cooperate with other servers and foreign data systems.

§3.2.3 Medium term options

• Currently available systems

Conventional database management systems such as Oracle and RTI Ingres have a number of important properties:

they are robust (they recover from CPU and media failure);

they have highly available data (using data replication);

they present the client an understandable data model.

However, these relational systems won't fill our document storage and retrieval needs because of the following limitations:

limited field sizes;

a data model that doesn't easily support hypertext;

poor extensibility of types;

no support for hierarchical data models.

Many desirable document operations, such as finding the transitive closure of a bibliography, will inevitably take too long for users.
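To make the bibliography example concrete, the sketch below computes the transitive closure of a citation graph by walking links breadth first. The citation data and function name are made up for illustration. In a relational system of the kind discussed here, each further level of citation would typically need another query or join, whereas a store that holds links directly can follow them in one traversal.

```python
from collections import deque


def transitive_closure(citations, start):
    """Return every document reachable from `start` by following citations.

    `citations` maps a document to the list of documents it cites.
    """
    seen = set()
    queue = deque(citations.get(start, []))
    while queue:
        doc = queue.popleft()
        if doc in seen:
            continue                 # already visited; also guards against cycles
        seen.add(doc)
        queue.extend(citations.get(doc, []))
    return seen


# Hypothetical citation graph: A cites B and C, both of which cite D.
citations = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}
print(sorted(transitive_closure(citations, "A")))
```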

• External RDBMS systems being developed

We considered whether it would be useful to try to set up a joint project with other organizations outside of Xerox. Before we actually start our own project we will pursue each of these options somewhat further. The problems that we are worried about are:

Starburst is an IBM ARC project. It is very unlikely that they would agree to joint research.

Postgres (Berkeley) and Exodus (Wisconsin) both are university projects. Hence there is some hope that they might be willing to cooperate. However, both are RDBMS projects. Their data model does not match all that well with what we see as the requirements.

A few databases have "long fields" or images as primitive data types. Wang, Sybase, and Nixdorf are examples (we think). This solves the size problems but not the data model and other problems. In addition, it is difficult to get access to sources and to keep up with their system updates.

• FileNet, Cygnet, Access ...

Optical disk servers are (or shortly will be) sold by some companies. Although the jukebox and optical disk hardware are interesting, the software layer they sell does not satisfy most of our needs.

Purchase of a full system, such as from FileNet, is not an option. The scanners, printers, and workstations may not be those that integrate with Xerox's long term product plans.

Jukeboxes and optical disk hardware are not commodity items. Other technologies may make them obsolete in a few years. Service, lifetime of product, and maintenance are questionable. We think it will probably be the right thing to get one of these as a bottom end for our project, but for the long term there is some risk in dealing with startup companies. OEM prices for a FileNet jukebox are about $160K while Cygnet is somewhat cheaper at $115K, though the features are somewhat different.

• Expand the Object Service Project

The Object Service has different goals and time scale from those presented here; the Object Service is exploring a higher level execution based model of storage, and the utility of persistent processes. It also is exploring a seamless integration of long term storage with the Smalltalk programming system. Rather than trying to combine the current needs for a document base with the long term goals of the Object Service project, it seems more appropriate in the medium term to have separate projects. The server discussed here may be able to act as a storage backend for the Object Service.

• Build a system that meets our document storage needs

This is what we recommend. The server would implement the data model developed jointly by CSL and ISL over the last few months. For large volume storage, the server would use either a jukebox optical disk unit directly connected to the server, a network server with a jukebox optical disk, or a relational database system in conjunction with a jukebox or a networked jukebox. Hagmann and Kent have agreed to act as coordinators (leaders?) of this new project. A choice must be made whether to implement the service in their natural language (Cedar) or to explore other options. The viability of this choice is dependent on having a strategy for porting Cedar to new hardware, and on the availability of new hardware. To make this system usable on our Dorados, we will need 10 MHz Ethernet boards on them. This is another source of risk as well as a cost item.

To build this service requires:

the purchase of a jukebox optical disk for archival storage ($120K-$200K).

possibly the purchase of index server software, an RDBMS system, or file server software ($0-$50K).

the purchase of computational and communications platforms. These include computers with large memories for the servers, several gigabytes of magnetic disk storage (5 gigabytes cost $40K), and 10 MHz Ethernet communication gear installed on servers and clients and/or other communication media (total costs for this part will be determined by decisions that have not yet been taken).

building a group of 5 to 7 qualified staff to implement the service that satisfies the medium term document storage needs detailed in this document.

The server would act as a front end, provide the data model, and manage the short, medium, and long term storage. It would most likely be written in Cedar and initially run on Dorados. However, it must be portable and must move to the server architecture that will result from the "Computational Base" committee.

The staffing of the project is critical to any timetable we might propose. However, to give an idea of phasing, some sort of milestones were requested by the OCM. With reservations, we propose the following milestones, measured from "start of coding":

9 months: large capacity file server speaking a "standard" protocol (NFS?) with backup

18 months: add hypertext data model with alerters

24 months: add versions, alternatives, compression, and other features that will become evident by that time.

This conclusion rests, in part, on the assumptions we have made. To be complete, we should understand the implications of our assumptions and validate our assumptions. Unfortunately, time and manpower have not allowed us to be as complete as we would like to be. We have relied principally on our own experiences and understanding of external research and events. It may be appropriate to have our conclusion reviewed by an outside consultant.

§3.3 Long term (> 3 years)

Although we have focused on the medium term needs for a document base, we do not think this should preclude longer term research on alternative models. The one long term project currently underway at PARC is the Object Service project. We do not think that the Object Service project as it is currently focused should be the base for our medium term work, and below we list the significant differences between the Object Service and Hypertext Service.

• Time scale: We hope to have the medium term project done in two to three years, while the Object Service should take five years.

• Risk: The Object Service is more risky, particularly with regard to performance; the Hypertext Service is simpler and has more of a base in known technology.

• Database scale: The Object Service will not be a small scale system, but will be suitable for storing the personal and shared databases of 20-40 people. It will not focus on issues of storing massive amounts of data.

• Versions and alternatives: Because the Hypertext Service will in part replace our current use of files, it will make provision for handling versions and alternatives in the model. This is not a current goal for the Object Service.

• Execution model: The Object Service provides an execution model. The project is exploring the issue of seamless integration into a programming language, and will be the basis of the next Smalltalk system. The Hypertext Service avoids any execution model, does not store active objects, and leaves all semantics to its clients.

• Persistent processes: Client processes are intended to run in the Object Service. These should be persistent across crashes and/or be started when queued by events. This is outside the focus of the Hypertext Service.

• Evolution: The Object Service intends to support changes in object formats and object semantics over time. These formats and behaviors are shared among, and determined by, users of the system. The Hypertext Service will support only a small number of primitive types, and a fixed set of operations.

We see that the long term needs will be satisfied by a system that combines the ideas being explored in the Hypertext Service and the Object Service with those in systems developed outside PARC.

§4 Conclusion

We believe PARC can take advantage of its unique experience and needs for database technology to build a system that would be both novel and interesting. The needs derive from our experience in building programming environments, tools to aid writing (Notecards), and multi-user collaboration tools (Colab). We also have a developing need for a large scale database to handle the multimedia data (e.g. voice and images) that are being integrated into our computational environment.

We propose a three year project to build a system that will support these needs for all the PARC laboratories. It will encourage interoperability, data sharing, and cooperation between the laboratories. If we don't do this now, there will be many small projects, each developing part of the capabilities of the Hypertext Service. This project also should provide a prototype of the filing service that Xerox needs for its core business.