DRAFT -- March 22, 1988 2:38:20 pm PST -- DRAFT Internal Memo XEROX To From Distribution Bob Hagmann PARC/CSL Subject Date Yggdrasil Implementation: Quick and Dirty March 22, 1988 Introduction This memo addresses the issue of how to implement phase zero of Yggdrasil. In this plan, we attempt to get a minimum function server usable at an early date. To a large extent, we use existing software to provide the bulk of the functionality. In particular, we would use the UNIX file system to store the data. We defer many of the more interesting aspects of the system, and try to get early users. We provide a basic hypertext system with few of the frills. This system should be useful to NoteCards. Alternatively, the use the UNIX file system can be viewed as scaffolding to build the higher layers of the system. This decouples the storage layer from the basic upper layers proposed here. We can proceed down both paths at the same time. Features needed in Phase 0 Below are the features of the system as taken from the overview memo. In strikeout are the features that are to be omitted in phase 0. f large number of documents but, no archival storage in Phase 0 f documents can be large f documents can be small but, they all are stored in files so there can be lots of fragmentation! f documents ``types'' understood by the system include text, bitmaps, digitized audio, digitized video (?), graphics, ... f documents ``types'' are extensible f documents can be connected via links (hypertext) there are no references and there is only historic and foreign links (no version links) f documents can be named via a ``hierarchical'' name system cycles are not allowed f documents can have attributes and keywords f documents are grouped into contexts called containers cycles are not allowed in the container DAG containment and children are identical f keyword and other indices maintained per container f support for versions and alternatives f data compression and decompression f support for archival storage f alerters (send a message when a system event occurs) f query language and query optimizer f page level access within a document f transactions f robust f good performance f fast recovery f good availability f access control f hooks for multi-server and foreign server support Overall Implementation Plan The goal of Phase 0 is to get up a running server with a restricted interface, modest performance, poor recovery time, modest availability, and poor fragmentation. Phase 0 of Yggdrasil will sacrifice many of the interesting systems features of the project. However, the server would support the basic hypertext, naming, and indexing that are the key features of the data model. When the server runs in this mode, then it is possible to put up a server for a few clients. By putting up a server early, we hope to be able to measure its use, obtain client feedback, and build a history of success in the project. In any case, this approach allows us to scaffold up the higher levels of the system on available pieces of software. The essential idea is to get maximum leverage out of the UNIX file system. We would run under SUN OS and use the "Fast File System." Reserved files would be used for key system data. The logging code of Yggdrasil would be the same. It would log to a UNIX file, using the fsync system call to force the log file. The directory system of the UNIX file system is used extensively. A directory is a container. Sub-containers are directories in the container's directory. Specially named files in the directory have the contents ($contents), index ($index), links ($links), and properties ($properties) files. Objects directly contained in the container appear as file triplets: a unique name followed by one of three extensions. The extension contents contains the file's contents, the extension links contains the file's links, and extension properties contains the file's properties. Either or both of the links and properties files may be missing if they are empty. Links are encoded using the file name of the "from" or "to" object. If the object moves, then all links to it must be updated. It will not be possible to find out in which containers an object resides. A typical object will take three (3) disk pages. If (when) this is a problem, we could merge the three files into one file for the simple cases (e. g., if each was < 1/3 the size of a file block, then use a fixed allocation scheme). Conversion to the general case would have to be automatic. Large containers are a problem. The UNIX file system searchs a directory linearly, so finding a particular entry can take a long time. Advantages and Disadvantages of this Approach Verses Previous Approach Advantages f early clients/client feedback f apparent progress f lots of shared code with final system f implications for System 33 (indexing and storage) f flexible about next stage (archival storage, better file layer, more of the data model, ...) Disadvantages f doing this may prevent us from later "doing it right" f bad fragmentation (memory and disk) f modest performance using the directory system is a bad idea due to UNIX's linear search only 64 file descriptors g "page" the open files links as file names means lots of directory searches multiple files per object (g? small objects encoded into one file) if an object moves, then all links to it must be updated f only basic hypertext data model f backup, crash recovery, and other "systems" issues only partly satisfied Parts of Cedar Needed (not already done?) BTree processes RedBlackTree File (FileStream or UXIO streams would be OK) SafeStorage (Finalization) Interfaces Needed by Cypress/Alpine (less those already ported, those with trivial use (e. g., wordsPerPage from PrincOps), or only needed by SkiPatrol, FTP, debugging, AlpineFSImpl, Cedar RPC generated files, directory implementations, or AlpineControlPanelImpl) BTree CountedVM File FileBackdoor (erase volume and root file finding) GVNames (authentication) PrincOpsUtils (LongCopy, LongMove, LongZero, BITAND) Process (Pause, SecondsToTicks, Detach, Ticks, SetPriority, EnableAborts, SetTimeout, Seconds, MsecToTicks, GetCurrent, Milliseconds, DisableTimeout, Yield, CheckForAbort) RedBlackTree (used in file page and log managers) RefTab RPC RPCLupine SymTab UserCredentials (Get) VM Interfaces Needed by Cypress/Alpine (already ported) Ascii Atom Basics BasicTime CardTab DebuggerSwap IO List Pup RefTab Rope RuntimeError SafeStorage SymTab Implementation factoring Server Level 0 Disk.mesa and below Use one big UNIX file for metadata storage. [Or use a few files.] Low level communications support SUN OS/Mach should provide this. Runtime support B-tree and other data structures from the SUN OS release or from Cedar. Server Level 1 8 Buffer manager Use the UNIX file system cache! 8 Log manager Similar to Alpine. Use a UNIX file as the log. Open it with the "sync bit" set so that writes go directly to the file. 8 Lock manager Evaluate Alpine. Only need object (file) locks. 8 Transaction manager Evaluate Alpine. Server Level 2 88 Object encoder/decoder Similar to the page layer of Cypress. Only needed for metadata (links, properties, and indices). This is a major part of the effort. [To go to phase 1, the property and index code should be reuseable, while the link code will have to be changed to use DIDs instead of file names.] Backup and backup recovery Store everything in UNIX files and use UNIX backup. Live with the problems of inconsistencies. Bring the server down just before backup at night and back up after backup. Container implementation Build off of the UNIX file system hierarchical directories. A container is a directory. All directories found in this directory are children. No loops. Crash recovery Try to force the page cache out, but mostly loose here for data. Be careful for metadata and the log. DID manager Encode DID as prefix Performance monitoring and historical/debugging logging Fake it Client Client stub for Cedar and Interlisp (NoteCards) Test code from Cedar client Κ*• WordlistsYggdrasil.wordlist•StyleDefη BeginStyle (Cedar) AttachStyle (DefaultNest) "for print" {DoNest} PrintRule (look.w) "annotation looks" { "Cream" family bold face } ScreenRule (look.w) "annotation looks" { AlternateFontFamily bold+italic face } PrintRule (firstHeadersAfterPage) {0} .cvx .def (root) "format for root nodes" { cedarRoot docStandard 36 pt topMargin 36 pt headerMargin 1.75 in leftMargin 1.5 in rightMargin 9.00 in lineLength 24 pt topIndent 24 pt topLeading 0 leftIndent 10 pt rightIndent } ScreenRule (root) "format for root nodes" { cedarRoot docStandard 36 pt topMargin 36 pt headerMargin 1.75 in leftMargin 1.5 in rightMargin 5.25 in lineLength 24 pt topIndent 24 pt topLeading 0 leftIndent 10 pt rightIndent } PrintRule (positionInternalMemoLogo) "Xerox logo: screen" { docStandard 1 pt leading 1 pt topLeading 1 pt bottomLeading } ScreenRule (positionInternalMemoLogo) "for Xerox logo" { docStandard 1 pt leading 1 pt topLeading -28 pt bottomLeading -1.5 in leftIndent } PrintRule (internalMemoLogo) "Xerox logo: screen" { "Logo" family 18 bp size 20 pt topLeading 20 pt bottomLeading } ScreenRule (internalMemoLogo) "for Xerox logo" { "Logo" family 18 pt size 12 pt leading -28 pt topLeading 36 pt bottomLeading -1.5 in leftIndent } PrintRule (memoHead) "for the To, From, Subject nodes at front of memos" { docStandard AlternateFontFamily 240 pt tabStops } StyleRule (raggedIndent) "ragged-right block" { block flushLeft lineFormatting flushLeft lastLineFormatting 1.5 em restIndent } StyleRule EndStyle˜Iunleaded•Mark centerFooteršΠos/˜/Iblock– insideHeaderšΡbox ˜ Ipositioninternalmemologo˜Iinternalmemologošœ˜memoheadšΟsœŸ˜Ošœ Οtœ ˜Ošœ ˜ O˜O˜O˜O˜O˜—šŸœŸ˜ Ošœ8˜8O˜O˜—head˜ Lšœυ˜υL˜†L˜ρ—˜L˜‡ raggedindent•CharSetsο• CharPropsPostfix XCPrintFontsšœ˜IindentšΟi#˜#—Q–ο–Postfix XCPrintFontsšœ˜ –ο–Postfix XCPrintFontsšœ˜Rš‘I˜I—Q–οy–Postfix XCPrintFontsyšœz˜zQ–ο$–Postfix XCPrintFonts$šœ%˜% –ο2–Postfix XCPrintFonts2šœ3˜3Rš‘W˜W— –ο;–Postfix XCPrintFonts;šœ<˜