Dealer: Yggdrasil Status
October 5, 1988

Yggdrasil
A large scale hypertext database system
Bob Hagmann

Introduction
  Database effort in CSL
    research
    support our filing needs
    basis for retrieval experiments
    store images, programs, text, audio, mail, ...
    document processing

... and System 33
  System 33
    three months
    scanning
    document recognition
    storage
    retrieval
    conversions
    protocols
    printing
    ?
  Yggdrasil
    years
    "client server"
    database
    so good that all the bits end up here

Project Name
  Ygg|dra|sil n. Also Yg|dra|sil. Norse Mythology. The great ash tree that holds together earth, heaven, and hell by its roots and branches. [Old Norse, probably ``the horse of Yggr'' : Yggr, name of Odin, from yggr, variant of uggr, frightful (see ugly) + drasill, horse.]

Alternative Name: Hypertext
  Hy|per|text n. (pronounced as if spelled Rtext) Hypertext without the hype.

Historical Perspective on Stored Information (European view)
  Dark ages (guilds, oral tradition, bards, manuscripts)
  Moors fall => Arabic libraries into Christian hands
  Monks translate Arabic to Latin
  Constantinople falls => Renaissance
  Printing press
  Public libraries and public education
  Large publishing industry
  Computers
    boxes of cards
    file systems
    access methods for files
    hierarchical and network databases
    relational databases
    ???

Information Storage Needs
  ``Store the bits and get out of my way''
  Major execution on the server not a requirement
  Three principal sources of needs
    Distributed Notecards in SSL
    Large capacity and high performance file server
    Software storage for programming environments
  Others
    Voice project in CSL
    Storage of scanned images
    Mail storage
    Library support
    Support evolving standards for file systems and document retrieval
    ...

Technology Change
  Optical disks
  Optical disk jukeboxes
  High capacity magnetic disks
  Decreasing cost of main memory
  Fast commercial microprocessors/workstations
  High capacity optical/magnetic tapes
  Scanners
  FDDI communications
  Fax
  Electronic printers
  CD ROM
  Information services

Project
  Build a large scale hypertext database server
  No user interface -- this is a database
  SUN, Dragon, and beyond
  Ethernet or FDDI
  Lots of memory (100's of megabytes)
  Lots of MIPS (10-100's)
  Modest number of processors (1-16)
  Lots of magnetic disk (10's of gigabytes)
  Lots of optical disk (a terabyte)
  TCP/IP and XNS
  Written in Cedar
  Mach and Camelot

Mach and Camelot
  Mach
    Rick Rashid's project at CMU
    UNIX compatible
    Lightweight processes, message based, multiprocessor, external pagers
  Camelot
    Transaction facility on top of Mach
    Logging, commit message protocols, recoverable storage management, media recovery, backup, name service, ...
  Might be throw away

Mach, Camelot, and Cedar
  Cedarboot runs under Mach!
  ... but debugging, PCR, communications, threads, fd's

Six Key Ideas
  Objects (documents)
  Typed links
  Objects have properties (attributes)
  Containers group objects
  Indices automatically built on properties
  Documents can be named

System Summary
  Server architecture
  Large number of documents of vastly varying sizes
  Document ``types'' (extensible) - few interpreted at server
  Hypertext: documents can be connected via links
  Documents can be named
  Documents can have attributes and keywords
  Documents are grouped into contexts called containers
  Keyword and other indices maintained per container
  Versions and alternatives
  Data compression and decompression
  On-line archival storage
  Alerters (send a message when an event occurs)
  Page level access, access control, transactions, robust, performance, recovery, and availability
  Hooks for multi-server and foreign server support

Not an OODBMS
  ``Store the bits and get out of my way''
  Major execution on the server not a requirement
  Full blown execution is hard
    Performance
    Locking
    Security
    Execution model
    Match of execution model to programming model
    Query optimization
    Looping
    ...
  Simple execution is doable

Leave It To The Client
  Find the set of objects of interest non-navigationally
  Let the client further filter objects
  Let the client build appropriate data structures for the current problem
  Add hooks to the database
    Alerters (e.g., be informed when something changes)
    Type system

Yggdrasil and IFS Comparison
  topic                               IFS        Yggdrasil    ratio
  size                                1 Gbyte    1 Tbyte      1000
  CPU                                 .2 MIP     8 MIP        40
  memory                              128 KB     128 MB       1000
  net read bandwidth (Kbytes/sec)     28         1000         35
  latency                             50 msec    3-30 msec    16
                                      -          - 1 min      20000

Yggdrasil Phases
  0   build hypertext, naming, indexing, and containers
      (mostly) use Camelot/Mach
      skip systems issues: performance, recovery, availability, access control, alerters, archival storage, and data compression
      postpone versions and alternatives
  1a  archival storage
  1b  versions and alternatives
  1c  alerters and better locking
  1d  availability, access control, and data compression

Yggdrasil Status
  High level design done
  Wildly coding version 0
  BJ is minding the Mach/Camelot store
  Brian Oki shows up in a few weeks
  RR says we still have a slot
  Optical disk jukebox next year
  Recruiting
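
Illustrative sketch (not from the talk)
  The "Six Key Ideas" and "Leave It To The Client" slides describe a small data model: objects with properties and typed links, containers that group objects and index their properties automatically, optional names, and clients that fetch a candidate set from an index and filter it themselves. The Python below is only a minimal in-memory sketch of that model under stated assumptions. The class names (Obj, Container), method names (add, link, find), and all sample data are hypothetical; the talk says the server is written in Cedar and does not show its interfaces.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Obj:
    """An object (document): uninterpreted contents, properties, typed links."""
    oid: int
    contents: bytes = b""
    properties: dict = field(default_factory=dict)  # property name -> value
    links: list = field(default_factory=list)       # (link type, target oid)


class Container:
    """Groups objects and indexes every property as objects are added."""

    def __init__(self):
        self.objects = {}                                   # oid -> Obj
        self.names = {}                                     # name -> oid
        self.index = defaultdict(lambda: defaultdict(set))  # prop -> value -> oids

    def add(self, obj, name=None):
        self.objects[obj.oid] = obj
        if name is not None:
            self.names[name] = obj.oid
        # Indices are built automatically on properties (hypothetical policy:
        # index every property of every object in the container).
        for prop, value in obj.properties.items():
            self.index[prop][value].add(obj.oid)

    def link(self, source, link_type, target):
        """Record a typed link from one object to another."""
        self.objects[source].links.append((link_type, target))

    def find(self, prop, value):
        """Non-navigational lookup: use the index, do not follow links."""
        return sorted(self.index.get(prop, {}).get(value, ()))


# Made-up sample data, for illustration only.
box = Container()
box.add(Obj(1, b"...", {"type": "memo", "author": "alice"}), name="status-talk")
box.add(Obj(2, b"...", {"type": "image", "author": "alice"}))
box.add(Obj(3, b"...", {"type": "memo", "author": "carol"}))
box.link(1, "illustrated-by", 2)

# Server side: find the set of objects of interest from the index.
candidates = box.find("author", "alice")
# Client side: filter further and build whatever structure the problem needs.
memos = [oid for oid in candidates
         if box.objects[oid].properties["type"] == "memo"]
```

  In this sketch the container does only storage, naming, and index maintenance. The second filtering step runs in client code, which is one way to read ``Store the bits and get out of my way''.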