ArchivesDoc.tioga
Tim Diebert: March 13, 1987 10:59:48 am PST
Chauser, October 24, 1991 5:07 pm PDT
Foote, May 16, 1991 11:55 am PDT
ArchivistBTreeDoc
PCEDAR 2.0 —
Archives For PCedar
Tim Diebert
Carl Hauser
© Copyright 1985, 1987, 1990 Xerox Corporation. All rights reserved.
Abstract: This package contains procedures to look at the Cedar Archivist Directory containing all of the names and places relating to the current archives.
Created by: Carl Hauser, from the Cedar 7.0 ArchivistBTree and ListArchives packages by Tim Diebert
Maintained by: Carl Hauser <Carl Hauser:PARC:Xerox>
Keywords: Cedar Archives
XEROX  Xerox Corporation
   Palo Alto Research Center
   3333 Coyote Hill Road
   Palo Alto, California 94304

Overview
The Cedar Archivist is a DCedar program that writes files from IFS's to tape and keeps track of where they are. Over the years it has stored about 300,000 files on tape. Each file is written on two tapes, and an entry is made in a BTree directory that maps file names to tape identifications.
Unfortunately, the DCedar BTree is not readable by the PCedar BTree package (problems with byte ordering, etc.). Hence this package was created for PCedar to read (only) a compressed version of the Archivist BTree that is created by the DCedar ArchivistBTree package.
The Archives interface provides procedures for opening, closing and enumerating the directory. (See Archives.mesa.)
UserProfile Entries
Archives.DirectoryName: "/ux/project/archivist/Archives.CompressedDirectory"
Commands
The implementation provides two commands for looking at the directory:
UncompressDirectory
enumerates the entire directory, placing its output, 1 line per file, on cmd.out (e.g. use commander output redirection to place the output in a file);
ListArchives pattern ...
enumerates files matching each of the patterns and places its output on cmd.out in a format similar to that used by the Cedar ListFiles command. The patterns should be in brackety syntax. The output format looks like this:

[Ivy]<CHauser>Modula-2>*
CursorMouse.DEF!1 25 May 85 16:40:54 PDT  T10439P T10439B complete
CursorMouse.MOD!1 17 Sep 85 22:31:26 PDT  T10439P T10439B complete
CursorMouse.SBL!1 19 Sep 85 23:47:29 PDT  T10439P T10439B complete
The fields on each line are, from left to right: file name, file create date, primary tape, secondary tape, and state.
The theory is that with the tape identification information you can go to someone who knows something about the Archivist and get the files back.
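If the ListArchives output has been redirected to a file, the entry lines can be split back into fields with a few lines of code. The following Python sketch is illustrative only and is not part of the package; the names ArchiveEntry and parse_entry are hypothetical, and it assumes that file names contain no spaces and that the state is a single word, as in the sample above.

```python
from typing import NamedTuple


class ArchiveEntry(NamedTuple):
    """One entry line of ListArchives output (hypothetical helper type)."""
    name: str       # file name, including the !version suffix
    created: str    # file create date, kept as text
    primary: str    # primary tape identification
    secondary: str  # secondary tape identification
    state: str      # entry state, e.g. "complete"


def parse_entry(line: str) -> ArchiveEntry:
    """Split one entry line into its five fields.

    Assumes whitespace-separated fields with the name first and the
    two tapes and the state last; everything in between is the date.
    """
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("not an entry line: %r" % line)
    primary, secondary, state = fields[-3:]
    created = " ".join(fields[1:-3])
    return ArchiveEntry(fields[0], created, primary, secondary, state)
```

Header lines such as [Ivy]<CHauser>Modula-2>* have fewer than five fields and are rejected by this sketch, so a caller would need to handle them separately.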
Directory format
I really hope you don't need to look at this to use this package!
The compressed directory format supported by this package and the corresponding procedure in DCedar ArchivistBTreeImpl is as follows. The file consists of an arbitrary number of compressed segments (using the F4KS method of D/PCedar Compression) followed by an index. Each segment is the result of compressing the ASCII representation of the BTree entries for 500 files; only the last segment may hold a different number. The index follows the last segment and contains, for each segment, the first file name in the segment and the position in the file where the segment starts. 32-bit arithmetic quantities are stored as Basics.FWORDs for DCedar/PCedar compatibility. The detailed format is:
beginning of file (offset 0) --->
seg0 ->   ...
seg1 ->  ...
...
segn ->
end of last segment
fn0 ->    uncompressed first filenames, each terminated by \n
fn1 ->    ...
...
fnn ->
indexStart ->       file offset of beginning of segment 0 (32 bits)                        seg0
                    file offset of uncompressed first file name in segment 0 (32 bits)     fn0
                    file offset of beginning of segment 1 (32 bits)                        seg1
                    file offset of uncompressed first file name in segment 1 (32 bits)     fn1
                    .
                    .
                    .
                    file offset of beginning of last segment (32 bits)                     segn
                    file offset of uncompressed first file name in last segment (32 bits)  fnn
File.Length[]-8 ->  number of entries in last segment (32 bits)
                    file offset of index array (32 bits)                                   indexStart
File.Length[] ----->
Since compression streams do not provide explicit end of stream indications, it is crucial that the reader know how many entries to read out of each segment. Since every segment except the last holds the same number of entries (500), only the entry count for the last segment is stored in the index.
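The layout above can be walked from the end of the file backwards. The following Python sketch is illustrative only and is not the package's implementation; the names read_index, entries_in_segment and segment_for are hypothetical. It assumes that Basics.FWORD is big-endian (the byte order is a parameter in case that is wrong), that the index array runs from indexStart up to File.Length[]-8 in 8-byte pairs, and that plain string comparison matches the order in which the BTree enumerates names. It does not decompress the segments.

```python
import bisect
import struct

ENTRIES_PER_SEGMENT = 500  # from the text: every segment but the last


def read_index(data, byteorder=">"):
    """Return (segments, last_count) for a whole directory file in memory.

    segments is a list of (segment_offset, first_file_name) pairs.
    byteorder ">" is an ASSUMPTION about how FWORDs are stored.
    """
    trailer = len(data) - 8
    if trailer < 0:
        raise ValueError("file too short to hold the 8-byte trailer")
    last_count, index_start = struct.unpack_from(byteorder + "II", data, trailer)
    index_bytes = trailer - index_start
    if index_bytes < 0 or index_bytes % 8 != 0:
        raise ValueError("index array does not fit before the trailer")
    segments = []
    for pos in range(index_start, trailer, 8):
        seg_off, name_off = struct.unpack_from(byteorder + "II", data, pos)
        end = data.index(b"\n", name_off)  # names are terminated by \n
        segments.append((seg_off, data[name_off:end].decode("ascii")))
    return segments, last_count


def entries_in_segment(i, n_segments, last_count):
    """Number of entries a reader must pull out of segment i."""
    return last_count if i == n_segments - 1 else ENTRIES_PER_SEGMENT


def segment_for(name, segments):
    """Index of the only segment that could hold name, or -1 if none.

    ASSUMPTION: Python string order matches the directory's order.
    """
    firsts = [first for _, first in segments]
    return bisect.bisect_right(firsts, name) - 1
```

A reader would then position the file at the chosen segment offset, open a decompression stream there, and read exactly entries_in_segment(...) entries from it.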