DRAFT -- March 15, 1988 9:35:17 am PST -- DRAFT

XEROX Internal Memo

To:      Distribution
From:    Bob Hagmann, PARC/CSL
Subject: Yggdrasil Overall System Plan
Date:    March 15, 1988


Introduction

This memo addresses the issue of how to implement Yggdrasil. There are three sections (apart from this one). First, there is the implementation plan: the general structure of the system and how it is layered and installed on the execution environment. Second, there is the implementation staging, which sets the milestones for building the service. Finally, there is a section on future options. Yggdrasil should be the first of a series of projects to deal with document database requirements and research. By calling out options that we may want to pursue later, we hope that the implementation of the system will give us the flexibility we need in (at least) these directions.


Implementation plan

System components

We want to build a service that provides the data model, acts as a front end for the (optional) archival storage subsystem, and manages the short, medium, and long term storage. The archival storage will preferably be provided by a network service; one example is the OSAR/Index Server that FileNet plans to sell. Our fallback position is to integrate an optical disk jukebox into the server directly and to write all of the device management, storage allocation, error reporting, etc. code ourselves.

Language, hardware, operating system, communications, protocols, and programming environment

We propose to write the system in the Cedar language. We think this is the best choice: we are knowledgeable in the language, it provides many of the features we want, it is efficient, and it has a good programming environment.

We plan to run initially on Dorados. However, the system must be portable and must move to the server architecture that will result from the ``OCM Committee on Computational Base'' (Common HOWL). This implies that we must write our code in the evolving machine-independent Cedar style. We cannot depend on anything in the Cedar system that we cannot port (e.g., the semantics of processes). We plan to use the Mimosa ``Cedar to C'' translator as our eventual compiler.

To run, we will need runtime support and an appropriate OS. The system should be able to run on Cedar Dorados, on SUNs under SunOS using ``Cedar to C'' and the runtime, and on Cedar Dragons. Which ones we actually build will depend on the schedules and completion of the ``Cedar to C'' translator and runtime support, Dragons, and Cedar Port.

We will initially support a subset of our ``Native'' protocol. This means our servers will also have to support the lower level protocols that it needs (SUN RPC layered over TCP/IP and/or UDP; and/or Cedar RPC; and/or XNS Courier). Later, we will support the ``Native'' protocol with full support for the data model, plus some of SUN NFS, XNS Filing, and ANSI/CCITT/OSI/... ``Filing and Retrieval''. Which higher level protocols we support will depend on staffing and possibly on outside assistance.

The Front End Cluster

To provide for availability, there must be more than one physical machine capable of running a service. The set of machines that provides this is called a cluster. The cluster consists of at least one front end, but a typical system will have two or three front ends. The front ends should be interconnected with good communications (a private 10 MHz Ethernet or faster).
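We have not yet specified what traffic this private interconnect carries. Purely as an illustration (written in C, since Mimosa-translated C is our eventual target; the port number, intervals, and names are invented for the example), a front end might periodically announce that it is alive and judge a silent peer to be down roughly as follows. The failure handling this would support is described below.

    /* Illustrative sketch only: a periodic UDP heartbeat on the private
     * front end interconnect.  Nothing here is committed design.        */
    #include <stdio.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    #define HEARTBEAT_PORT  5151       /* hypothetical cluster port       */
    #define HEARTBEAT_SECS  5          /* announcement interval           */
    #define DEAD_AFTER_SECS 15         /* declare a peer down after this  */

    struct peer {
        char   name[32];               /* front end name, e.g. "fe-2"     */
        time_t last_heard;             /* time of last heartbeat received */
    };

    /* A surviving front end would consult this before agreeing that a
     * peer is down and electing machines to take over its services.     */
    int peer_is_down(const struct peer *p, time_t now)
    {
        return (now - p->last_heard) > DEAD_AFTER_SECS;
    }

    int main(void)
    {
        int s = socket(AF_INET, SOCK_DGRAM, 0);
        int on = 1;
        struct sockaddr_in all;
        char msg[64];

        /* Broadcast the announcement on the private interconnect. */
        setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof on);
        memset(&all, 0, sizeof all);
        all.sin_family      = AF_INET;
        all.sin_port        = htons(HEARTBEAT_PORT);
        all.sin_addr.s_addr = htonl(INADDR_BROADCAST);

        for (;;) {                     /* announce "I am alive" forever   */
            snprintf(msg, sizeof msg, "fe-1 alive %ld", (long)time(NULL));
            sendto(s, msg, strlen(msg), 0, (struct sockaddr *)&all, sizeof all);
            sleep(HEARTBEAT_SECS);
        }
    }

A real implementation would also have to deal with partitions of the private network, which this sketch ignores.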
The stable storage (e.g., magnetic disks) of a service is connected to more than one front end computer. The idea is that if one front end fails, another can still access the disks. Depending on what you think can fail and/or what a site can afford, this could mean double ported disks, double ported controllers, and/or replicated buses. The front ends must keep their state consistent on disk.

The front end running a service is found via a name server. On machine failure, the running front ends agree that the failed machine is down, and then elect machines that can access the disks of the failed service to take over the services of the failed machine. The elected front end machines then run crash recovery and come up as the service. Upon repair of the failed machine, the running cluster agrees that some services will be moved to the repaired machine. The changeover is crude: we just crash the service on one machine and bring it up on the other. (Maybe we can do something more user friendly, like send mail that we are about to crash, wait, and then crash.) One consequence of this strategy is that the services of a cluster are either all up or all down. This restriction can be relaxed slightly: a controlled shutdown of a service does not have to bring the cluster down.


Implementation staging

Staffing of the project is critical to any timetable we might propose. To give an idea of phasing, some milestones are included. With reservations, we propose the following milestones, measured from the start of coding:

10 months: Large capacity server (10-100 Gigabytes) speaking a subset of our ``Native'' protocol, with backup. This server has the basic hypertext data model with naming, attributes, and containers. This server will be put in daily public use at PARC. Missing from this implementation are:
    archival storage
    protocol front ends for file servers
    alerters
    versions and alternatives
    compression
    automatic indexing
    query language

14 months: add archival storage (~1 Terabyte).

18 months: add versions and alternatives, alerters, and automatic indexing.

24 months: add compression, other protocols, a query language, all other requirements, and other features that will become evident by that time.


Future options

Execution model on the server

Eventually, we would like to see the results of object-oriented and/or extensible database research on our server. Significant problems arise in performance, locking, security, evolution, archiving, the execution model, robustness, looping client code, and fairness, among other areas.

Plan for a memory hierarchy with variable performance

We need a structure that can accommodate main memory, stable main memory, {CCD, Magnetic Bubbles, ...}, drum or fixed head disk, low performance magnetic disks, high performance magnetic disks, mounted optical disks, jukebox optical disks, and a picker for a magnetic tape archive. (A rough sketch appears below, after the additional protocols item.)

Plan for read/write optical disks

At least at first, they will probably not perform as well as magnetic hard disks.

Plan for other archival servers, devices, and media

Various kinds of optical tape, microfilm, and magnetic tape may become available. Only unattended systems are of interest (e.g., systems that use mechanical loading of media, such as a jukebox).

Plan for stable main memory

Battery backed-up RAM is available. We should plan to eventually use it.

Plan for additional protocols

ANSI/CCITT/OSI/... Filing and Retrieval is an example.
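Returning to the memory hierarchy option above: purely as a naming sketch (again in C, and again representing no committed design), the storage classes could be fixed in an ordering that a migration policy consults when demoting cold objects.

    /* Illustrative sketch only: the storage classes listed under the
     * memory hierarchy option, in roughly fastest-to-slowest order.     */
    #include <stdio.h>

    enum storage_tier {
        TIER_MAIN_MEMORY,
        TIER_STABLE_MAIN_MEMORY,       /* e.g. battery backed-up RAM      */
        TIER_CCD_OR_BUBBLE_MEMORY,
        TIER_DRUM_OR_FIXED_HEAD_DISK,
        TIER_HIGH_PERF_MAGNETIC_DISK,
        TIER_LOW_PERF_MAGNETIC_DISK,
        TIER_MOUNTED_OPTICAL_DISK,
        TIER_JUKEBOX_OPTICAL_DISK,
        TIER_TAPE_PICKER_ARCHIVE
    };

    /* A migration policy might demote a cold object one tier at a time;
     * objects already in the tape archive stay where they are.          */
    enum storage_tier demote(enum storage_tier t)
    {
        return (t == TIER_TAPE_PICKER_ARCHIVE) ? t : (enum storage_tier)(t + 1);
    }

    int main(void)
    {
        enum storage_tier t = TIER_HIGH_PERF_MAGNETIC_DISK;
        printf("a cold object would move from tier %d to tier %d\n",
               t, demote(t));
        return 0;
    }

The point of the sketch is only that the hierarchy should be an explicit, ordered notion in the code, so that new classes of media can be added without rewriting the migration policy.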
Plan for interoperation with commercial database and information services

Plan for evolution of standards

Character sets, protocols, compression, PDLs, document descriptions, images, ...

Worry about parallel computer architectures

Maybe something in the Connection Machine world will become a good search engine.

Worry about ``processor per head'' architectures

TRW's Fast Finder Chip idea, modified to handle synonyms and word stemming, might be quite powerful as a search engine.

Plan for RDBMS

We may have to extend the server and/or integrate an RDBMS into it.