Yggdrasil Overview
May 4, 1989

Yggdrasil
A large scale hypertext database system
Bob Hagmann
Xerox PARC

Introduction
  Problem
    Store and retrieve large numbers of possibly large nodes ("objects" or electronic documents)
  Motivations
    research
    support PARC's filing/database needs
    Xerox's document processing strategy

Goals
  Handle large number of nodes of any size
  Use an appropriate data model
  Non-navigational access (non-procedural)
  Deal with scale
  Interoperate
  Have transactions, robustness, good performance, fast recovery, good availability, and access control
  Have hooks for building on system

Project Name
  Ygg|dra|sil n. Also Yg|dra|sil. Norse Mythology. The great ash tree that holds together earth, heaven, and hell by its roots and branches. [Old Norse, probably "the horse of Yggr": Yggr, name of Odin, from yggr, variant of uggr, frightful (see ugly) + drasill, horse.]
  "Whatever the details, there is no doubt of the cosmic importance of Yggdrasil"
    European Mythology, Jacqueline Simpson

Computer Science Challenges of the 1990's
  Ubiquitous computing
  Parallelism - massive parallelism, neural nets, and multiprocessors
  Large scale distributed systems
  Your favorite project goes here
  -> Using low cost, moderate access time, tertiary storage
    optical tape (e.g., ICI's Digital Paper)
    high density magnetic tape
    transverse and helical recording
    ???

For example: magnetic tape
  4 mm, 8 mm, VHS, 19 mm and 1" tape under development
  Ampex ACL 19 mm tape
    28.75, 81.1, or 186 GByte/tape
    15 MByte/sec
    library: 7.1 Tbyte (uses small tapes)
      256 cassettes
      load/unload 8 sec
      search 30 sec
      55 GByte/ft^3
      366 GByte/ft^2
      $60/GByte

Documents
  Document: stored information that can be perceived by humans
  Examples: text files, object files, PDL files, books, newspapers, movies, audio tapes, music, statues, paintings, fingerprints, and frescos.
  Electronic documents: the subset of documents that can be dealt with by computers

Historical Perspective on Stored Information (European view)
  Middle ages (guilds; oral tradition; bards; manuscripts and libraries owned by religion)
  Moors fall => Arabic libraries into Christian hands (7th C.) and monks translate to Latin
  Paper arrives in Europe (11th to 12th C.)
  Secular universities (12th C.)
  Renaissance: growth of individual (14-16th C.)
  Printing press (~1450)
  Scholarly journal (17th C.)
  Public libraries and public education (lending library 18th C.; free public library 19th C.)
  Xerox 914 copier (1959)

Perspective on Computers (European view)
  Boxes of cards
  File systems
  Access methods for files
  Hierarchical and network databases
  Relational databases
  ???

My Local Branch Library
  Books
    large print, children's
  Magazines
  Newspapers
  Consumer information
  Transportation and community information
  VHS tapes
  CD's
  Audio tapes
  Records

Technology Change
  Optical disks
  Optical disk jukeboxes
  High capacity magnetic disks
  Decreasing cost of main memory
  Fast commercial microprocessors/workstations
  -> High capacity optical/magnetic tapes
  Scanners
  FDDI communications
  Fax
  Electronic printers
  CD ROM
  Information services
  => deal with change and scale

Perspective Summary
  Vast changes over 15 centuries
  "Information worker"
  "Future Shock"
  Amount of information growing
  Speed of delivery increasing
    observe the spread of the "cold fusion" papers
  Media and delivery systems evolving

Project
  Build a large scale hypertext database server
  No user interface -- this is a database
  Terabyte in scale
    ter|a|tol|o|gy n.
    The biological study of the production, development, anatomy, and classification of monsters
  RISC processors, Dragon, and beyond
  Ethernet or FDDI
  Lots of memory (100's of megabytes)
  Lots of MIPS (10-100's)
  Modest number of processors (1-16)
  Magnetic disk cache (10's of gigabytes)
  Lots of optical disk (a terabyte)
  Mach and Camelot
  TCP/IP and Xerox XNS protocols
  Written in Cedar

Cedar, Mach and Camelot
  Cedar
    From Mesa: Algol family language with strong typing, monitors, threads, exceptions, ADT's
    Cedar adds garbage collection, generic types, lists, atoms, ...
    Mesa inspired Modula-2
    Language and runtime "up" under SunOS and Mach
  Mach
    Threads, message based, multiprocessor, external pagers
  Camelot
    Transaction facility on top of Mach
    Logging, commit message protocols, recoverable storage management, media recovery, backup, name service, ...
    Might be throw away

Seven Key Ideas
  Nodes (≈ documents or objects without semantics)
  Typed links
  Nodes have properties (≈ attributes)
  Containers group nodes
  Set oriented, non-navigational (non-procedural) access
  Nodes can be named
  Versions and alternatives for nodes

System Summary (part 1)
  Server architecture
  Large number of nodes of vastly varying sizes
  Node "types" (extensible) - few interpreted at server
  Hypertext: nodes can be connected via links
  Nodes can be named (=> File server interface)
  Nodes can have attributes and keywords
  Nodes are grouped into contexts called containers
  Keyword and other indices maintained per container
  Versions and alternatives

System Summary (part 2)
  Data compression and decompression
  On-line archival storage
  Use RPC marshalling code to encode contents
  Alerters (send a message when an event occurs)
  Page level access, access control, transactions, robustness, performance, recovery, and availability
  Hooks for multi-server and foreign server support

Not an OODBMS
  "Store the bits and get out of my way"
  Major execution on the server not a requirement
    cf. RDBMS execution
  Full blown execution is hard
    Performance or security - choose one
    Locking and deadlock
    Evolution of types
    Match of execution model to programming model
    Query optimization
    Looping
    Longevity
    ...
  Simple execution is doable

Leave It To The Client
  Find the set of nodes non-navigationally
  Let the client further filter nodes
  Let the client build appropriate data structures and use semantics for the current problem
  Add hooks to the database
    Alerters (e.g., be informed when something changes)
    Type system

Yggdrasil and IFS Comparison
  topic                            IFS       Yggdrasil            ratio
  size                             1 Gbyte   1 Tbyte              1000
  CPU                              0.2 MIPS  200 MIPS             1000
  memory                           128 KB    128 MB               1000
  net read bandwidth (Kbytes/sec)  28        1000                 35
  latency                          50 msec   3-30 msec            16
                                   -         - 1 min (to 20 sec)  20000

Yggdrasil Phases
  0
    build hypertext, naming (hence NFS interface), indexing, and containers
    use Camelot/Mach and/or SunOS
    skip systems issues: performance, recovery, availability, access control, alerters, archival storage, and data compression
    postpone versions and alternatives
  1a: archival storage
  1b: versions and alternatives
  1c: alerters and better locking and transactions
  1d: availability, access control, and data compression
  2
    non-navigational access
    object-oriented

Yggdrasil Status
  Four people claim allegiance to project
  High level design done (0 and 1)
  Wildly coding and starting testing of 0 on Dorados
  Porting to SunOS
  Getting Cedar up on Mach
  Getting equipment
    larger machines
    optical disk jukebox
    early use as on-line archival storage for System/33 and file servers without Yggdrasil

Contributions
  Selection of existing ideas to build an artifact
    full featured hypertext
    set oriented
  Large scale
  Non-navigational access to hypertext
  Database and file server merger
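
Illustration: the data model and access style
  The "Seven Key Ideas" and "Leave It To The Client" slides can be illustrated with a small in-memory sketch. This is not Yggdrasil code (the system is written in Cedar) and not its interface. Every name below (Node, Link, Container, find, client_filter) is hypothetical, chosen only to show nodes with properties, typed links, a per-container keyword index, a set-oriented lookup, and further filtering done by the client.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node: uninterpreted contents plus properties and keywords (hypothetical shape)."""
    node_id: int
    contents: bytes = b""
    properties: dict = field(default_factory=dict)
    keywords: set = field(default_factory=set)


@dataclass(frozen=True)
class Link:
    """A typed, directed link between two nodes, identified by node id."""
    link_type: str
    source: int
    target: int


class Container:
    """Groups nodes and maintains a keyword index for them."""

    def __init__(self, name):
        self.name = name
        self.nodes = {}          # node_id -> Node
        self.links = set()       # set of Link
        self.keyword_index = {}  # keyword -> set of node ids

    def add(self, node):
        self.nodes[node.node_id] = node
        for kw in node.keywords:
            self.keyword_index.setdefault(kw, set()).add(node.node_id)

    def link(self, link_type, source, target):
        self.links.add(Link(link_type, source, target))

    def find(self, *keywords):
        """Set-oriented lookup: ids of the nodes that carry every given keyword."""
        if not keywords:
            return set(self.nodes)
        matches = [self.keyword_index.get(kw, set()) for kw in keywords]
        return set.intersection(*matches)


def client_filter(container, node_ids, predicate):
    """Client-side step: apply problem-specific semantics to the returned set."""
    candidates = (container.nodes[i] for i in sorted(node_ids))
    return [node for node in candidates if predicate(node)]


# Made-up sample data, for illustration only.
papers = Container("papers")
papers.add(Node(1, b"...", {"year": 1989}, {"hypertext", "database"}))
papers.add(Node(2, b"...", {"year": 1987}, {"mythology"}))
papers.add(Node(3, b"...", {"year": 1986}, {"database"}))
papers.link("cites", 1, 3)

hits = papers.find("database")  # the server returns a set of node ids
recent = client_filter(papers, hits, lambda n: n.properties["year"] >= 1988)
print(sorted(hits), [n.node_id for n in recent])
```

  In this sketch the container only answers the keyword query. Interpreting properties (here, the year test) is left to the client, in the spirit of "Store the bits and get out of my way".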