ArchivesDoc.tioga

Tim Diebert: March 13, 1987 10:59:48 am PST

Chauser, October 24, 1991 5:07 pm PDT

Foote, May 16, 1991 11:55 am PDT

ArchivistBTreeDoc

PCEDAR 2.0 — FOR INTERNAL XEROX USE ONLY

Archives For PCedar

Tim Diebert

Carl Hauser

Abstract: This package contains procedures for reading the Cedar Archivist directory, which records the name of every archived file and the tapes on which it is stored.

Created by: Carl Hauser, from the Cedar 7.0 ArchivistBTree and ListArchives packages by Tim Diebert

Maintained by: Carl Hauser <Carl Hauser:PARC:Xerox>

Keywords: Cedar Archives

Xerox Corporation
Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, California 94304

Overview

The Cedar Archivist is a DCedar program that writes files from IFS's to tape and keeps track of where they are. Over the years it has stored about 300,000 files on tape. Each file is written on two tapes, and an entry is made in a BTree directory that maps file names to tape identifications.

Unfortunately, the DCedar BTree is not readable by the PCedar BTree package (problems with byte ordering, etc.). Hence this package was created so that PCedar can read (but not write) a compressed version of the Archivist BTree. The compressed version is created by the DCedar ArchivistBTree package.

The Archives interface provides procedures for opening, closing and enumerating the directory. (See Archives.mesa.)

UserProfile Entries

Archives.DirectoryName: "/ux/project/archivist/Archives.CompressedDirectory"

Commands

The implementation provides two commands for looking at the directory:

UncompressDirectory

enumerates the entire directory, placing its output, one line per file, on cmd.out (for example, use Commander output redirection to place the output in a file);

ListArchives pattern ...

enumerates the files matching each of the patterns and places its output on cmd.out in a format similar to that used by the Cedar ListFiles command. The patterns should be in brackety syntax. The output format looks like this:

[Ivy]<CHauser>Modula-2>*
CursorMouse.DEF!1 25 May 85 16:40:54 PDT T10439P T10439B complete
CursorMouse.MOD!1 17 Sep 85 22:31:26 PDT T10439P T10439B complete
CursorMouse.SBL!1 19 Sep 85 23:47:29 PDT T10439P T10439B complete

The columns are, from left to right: file name, file create date, primary tape, secondary tape, and state.

The theory is that with the tape identification information you can go to someone who knows something about the Archivist and get the files back.
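If the listing has been redirected to a file, the tape identifications can be pulled out mechanically. The following is a minimal Python sketch, not part of the package. The names ArchiveEntry and parse_listing are hypothetical, and it assumes that every file line ends with the eight whitespace-separated fields shown in the sample (five for the create date, then primary tape, secondary tape, and state), and that any shorter non-blank line is a pattern header.

```python
from typing import List, NamedTuple, Optional, Tuple


class ArchiveEntry(NamedTuple):
    """One file line of ListArchives-style output (hypothetical record type)."""
    name: str
    created: str
    primary_tape: str
    secondary_tape: str
    state: str


def parse_listing(text: str) -> List[Tuple[Optional[str], ArchiveEntry]]:
    """Return (header, entry) pairs; header is the most recent pattern line seen."""
    header = None
    entries = []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) < 9:
            # Assumption: a line too short to be a file line is a pattern header.
            header = raw.strip()
            continue
        # Split from the right so the last eight fields are always
        # date (5 tokens), primary tape, secondary tape, state.
        state = fields[-1]
        secondary = fields[-2]
        primary = fields[-3]
        created = " ".join(fields[-8:-3])
        name = " ".join(fields[:-8])
        entries.append((header, ArchiveEntry(name, created, primary, secondary, state)))
    return entries
```

Each returned pair carries the pattern header it appeared under, so a full file name can be rebuilt by replacing the pattern's final component with the entry name.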

Directory format

I really hope you don't need to look at this to use this package!

The compressed directory format supported by this package and the corresponding procedure in DCedar ArchivistBTreeImpl is as follows. The file consists of an arbitrary number of compressed segments (using the F4KS method of D/PCedar Compression) followed by an index. Each segment is the result of compressing the ASCII representation of the BTree entries for 500 files (the last segment may hold a different number). The index follows the last segment and contains, for each segment, the first file name in the segment and the position in the file where the segment starts. 32-bit arithmetic quantities are stored as Basics.FWORD's for DCedar/PCedar compatibility. The detailed format is:

beginning of file (offset 0) --->

seg0 -> ...

seg1 -> ...

...

segn ->

end of last segment

fn0 -> uncompressed first filenames, each terminated by \n

fn1 -> ...

...

fnn ->

indexStart -> file offset of beginning of segment 0 (32 bits), i.e. seg0

file offset of uncompressed first file name in segment 0 (32 bits), i.e. fn0

file offset of beginning of segment 1 (32 bits), i.e. seg1

file offset of uncompressed first file name in segment 1 (32 bits), i.e. fn1

...

file offset of beginning of last segment (32 bits), i.e. segn

file offset of uncompressed first file name in last segment (32 bits), i.e. fnn

File.Length[]-8 -> number of entries in last segment (32 bits)

File.Length[]-4 -> file offset of index array (32 bits), i.e. indexStart

File.Length[] ----->
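As an illustration of the layout above, here is a minimal Python sketch (not the Archives implementation) that reads the trailer and the index out of a byte string holding the whole file. The function name read_index is hypothetical. It assumes that a Basics.FWORD is a big-endian 32-bit unsigned quantity, which this document does not state; pass "<" as byteorder if that turns out to be wrong. It does not decompress the segments.

```python
import struct
from typing import List, Tuple


def read_index(data: bytes, byteorder: str = ">") -> Tuple[List[Tuple[int, str]], int]:
    """Return ([(segment offset, first file name), ...], entries in last segment).

    byteorder ">" is an ASSUMPTION about how Basics.FWORD is laid out.
    """
    if len(data) < 8:
        raise ValueError("file too short to hold the trailer")
    # The last eight bytes: entry count of the last segment, then indexStart.
    last_count, index_start = struct.unpack(byteorder + "II", data[-8:])
    index_end = len(data) - 8
    if index_start > index_end or (index_end - index_start) % 8 != 0:
        raise ValueError("index does not consist of whole 8-byte entries")
    segments = []
    for pos in range(index_start, index_end, 8):
        # Each index entry: segment offset, then offset of its first file name.
        seg_offset, name_offset = struct.unpack(byteorder + "II", data[pos:pos + 8])
        # Names are stored uncompressed before the index, each ended by "\n".
        newline = data.index(b"\n", name_offset, index_start)
        segments.append((seg_offset, data[name_offset:newline].decode("ascii")))
    return segments, last_count
```

The number of segments is not stored explicitly; the sketch derives it from the distance between indexStart and the trailer, which holds only if the index array contains nothing but the 8-byte entries shown.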

Since compression streams do not provide explicit end of stream indications, it is crucial that the reader know how many entries to read out of each segment. Since every segment except the last holds the same number of entries (500), only the entry count for the last segment is stored in the index.
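The entry-count rule, and the way the first file names can be used to pick a segment, can be sketched in Python as follows. The names entries_in_segment and segment_for are hypothetical. The lookup assumes that plain string comparison agrees with the order in which the directory was written, which this document does not state (the BTree ordering may, for instance, ignore case).

```python
from bisect import bisect_right
from typing import List, Optional

ENTRIES_PER_SEGMENT = 500  # from the format description; the last segment differs


def entries_in_segment(i: int, segment_count: int, last_count: int) -> int:
    """Number of entries a reader must pull from segment i before stopping."""
    if not 0 <= i < segment_count:
        raise IndexError("no such segment")
    return last_count if i == segment_count - 1 else ENTRIES_PER_SEGMENT


def segment_for(name: str, first_names: List[str]) -> Optional[int]:
    """Index of the only segment that could hold name, or None.

    first_names must be sorted ascending (ASSUMED to match directory order).
    None means name sorts before the first entry of the directory.
    """
    i = bisect_right(first_names, name) - 1
    return i if i >= 0 else None
```

With first names "a", "m", "t", a lookup for "b" selects segment 0 and a lookup for "m" selects segment 1; the reader would then decompress that one segment and scan the number of entries given by entries_in_segment.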