TiogaFileFormat.mesa
Copyright © 1985, 1986 by Xerox Corporation. All rights reserved.
written by Bill Paxton December 1980
last written by Paxton; 4-Mar-82 8:59:40
Michael Plass, March 14, 1985 11:56:00 am PST
Doug Wyatt, September 17, 1986 11:41:48 am PDT
TiogaFileFormat: CEDAR DEFINITIONS
= BEGIN
Op: TYPE = CHAR;
endOfFile: Op = 0C;
startNode: Op = endOfFile+1;
length of formatName in next byte. text for format name follows that.
enter in formatName table
and assign it the next number
startNodeFirst: Op = startNode+1;
startNodeLast: Op = startNodeFirst+numFormats;
these opcodes encode previously seen formatName so don't need to repeat it
find formatName in op-startNodeFirst of format table
other information follows same as for startNode
terminalTextNode: Op = startNodeLast+1;
terminalTextNodeFirst: Op = terminalTextNode+1;
terminalTextNodeLast: Op = terminalTextNodeFirst+numFormats;
these opcodes are for nodes without children so can skip endNode opcode
identical to startNodeFirst..startNodeLast, except implies no children
find format name in op-terminalTextNodeFirst of format table
other information follows same as for startNode
otherNode: Op = terminalTextNodeLast+1;
for "other" format of nodes
length of formatName in next byte. text for format name follows that.
enter in formatName table
and assign it the next number
otherNodeShort: Op = otherNode+1;
like otherNode, but followed by a formatName code number instead of length+text
otherNodeSpecs: Op = otherNodeShort+1;
gives "variety" and specifications for immediately previous "other" format node
length of variety name follows in next byte(s). then text of name
enter in property name table and assign next number
length of specs text in next byte(s)
specs follow that
otherNodeSpecsShort: Op = otherNodeSpecs+1;
like otherNodeSpecs, but followed by a propname code number instead of length+text
prop: Op = otherNodeSpecsShort+1;
specifies property for current node
property specs are stored on file as a rope
length of property name follows in next byte(s). then text of name
enter in property name table and assign next number
length of property specs rope in next byte(s)
followed by text for property specs rope
propShort: Op = prop+1;
like prop, but followed by a propname code number instead of length+text
endNode: Op = propShort+1;
end current node and go back to adding to its parent.
rope: Op = endNode+1;
this op declares rope for most recently started node
length of text for the node in next byte(s).
actual text comes from text block
followed by a CR which is not included in the length
comment: Op = rope+1;
identical to rope except implies text stored in comment area of file
runs: Op = comment+1;
this op precedes definition of looks for most recently started node
number of runs in following byte(s)
have at most 1 runs op per node. if omitted, then no special looks.
looks: Op = runs+1;
looks vector in following 4 bytes.
enter vector in looks table.
and assign it the next number
length of run in next byte(s).
looksFirst: Op = looks+1;
looksLast: Op = looksFirst+numLooks;
these ops encode previously encountered looks so don't need to repeat
find looks in looks table[op-looksFirst].
length of run in the next byte(s).
look1: Op = looksLast+1;
like looks op, except has single look char instead of 4 byte looks vector
look2: Op = look1+1;
like look1, except has two looks chars
look3: Op = look2+1;
like look2, except has three looks chars
numFormats: CARDINAL = 70;
max number of entries in the format table for the file
others must be given in long form
FormatIndex: TYPE = [0..numFormats);
numLooks: CARDINAL = 50;
max number of entries in a look table
others must be given in long form
each entry in a look table is a distinct vector of 32 bits
LooksIndex: TYPE = [0..numLooks);
numProps: CARDINAL = 50;
max number of entries in a prop table
PropIndex: TYPE = [0..numProps);
lengths are given in as few bytes as possible according to following scheme
first bit of each byte tells if other bytes follow (up to 4 bytes max)
concat other 7 bits of bytes with leading 0's to form length
thus can represent lengths up to 2**28-1 if use 4 bytes
lengths of 127 or less fit in 1 byte
lengths in [128..16383] fit in 2 bytes (2 bytes carry 14 data bits)
IntBytes: TYPE = MACHINE DEPENDENT RECORD [
first (0:9..15): [0..127] ← 0, -- first byte written on file
second (0:2..8): [0..127] ← 0, -- second byte written
thirdBottom (0:0..1): [0..3] ← 0, -- bottom bits of third byte written
thirdTop (1:11..15): [0..31] ← 0, -- top bits of third byte written
fourth (1:4..10): [0..127] ← 0, -- fourth byte written
unused (1:0..3): [0..15] ← 0 -- these are always 0
];
LengthByte: TYPE = MACHINE DEPENDENT RECORD [
unused (0:0..7): [0..255] ← 0,
others (0:8..8): BOOL ← FALSE, -- tells if other bytes are present
data (0:9..15): [0..127] ← 0
];
ThirdByte: TYPE = MACHINE DEPENDENT RECORD [
unused (0:0..7): [0..255] ← 0,
others (0:8..8): BOOL ← FALSE,
dataTop (0:9..13): [0..31] ← 0, -- top 5 bits of data
dataBottom (0:14..15): [0..3] ← 0 -- bottom 2 bits of data
];
The following defines the header and trailer id bytes that form the Tioga password
commentHeaderId: ARRAY [0..fileIdSize) OF CHAR = [0C,0C];
controlHeaderId: ARRAY [0..fileIdSize) OF CHAR = [235C,312C];
controlTrailerId: ARRAY [0..fileIdSize) OF CHAR = [205C,227C];
fileIdSize: NAT = 2;
numTrailerLengths: NAT = 3; -- <file-props-length> <data-length> <file-length>
endSize: NAT = fileIdSize+numTrailerLengths*4; -- trailer plus three lengths
END.
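The length encoding described in the definitions above can be sketched in Python as follows. This is an illustration only, not the Tioga implementation: encode_length and decode_length are hypothetical names, and the byte order (least significant 7 bits in the first byte written) is an assumption read off the IntBytes layout; the comments themselves only say that the 7-bit groups are concatenated.

```python
MAX_LENGTH = 2**28 - 1  # largest value that fits in 4 bytes of 7 data bits each


def encode_length(n):
    """Encode n in as few bytes as possible (1 to 4).

    The high bit of each byte is set when more bytes follow; the low 7 bits
    carry data. Assumption: least significant 7-bit group is written first.
    """
    if not 0 <= n <= MAX_LENGTH:
        raise ValueError("length out of range for a 4-byte encoding")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(0x80 | low7)  # "others" bit set: more bytes follow
        else:
            out.append(low7)         # last byte: "others" bit clear
            return bytes(out)


def decode_length(data, pos=0):
    """Decode a length starting at data[pos]; return (value, next_pos)."""
    value = 0
    for i in range(4):               # at most 4 bytes per length
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, pos + i + 1
    raise ValueError("length field longer than 4 bytes")
```

Under these assumptions a length of 127 takes one byte, 128 through 16383 take two, and 2**28-1 takes four.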
Tioga file format is as follows:
<file> == <data> <comments> <control>
<data> == characters from (non-comment) nodes in display order with a CR after each node
<comments> == <comments-header> <comments-text>
<comments-header> == <commentHeaderId> <comments-length>
<comments-length> == byte length of <comments>
<comments-text> == characters from comment nodes in display order with a CR after each node
<control> == <control-header> <control-info> <control-trailer>
<control-header> == <controlHeaderId> <control-length>
<commentHeaderId> == <controlHeaderId> == [..bytes..] as defined above
Thus <comments> start with double 0's which can serve to mark end of <data>.
<control-length> == byte length of <control>
<control-trailer> == <file-props> <controlTrailerId>
<file-props-length> <data-length> <file-length>
<file-props-length> == gives length of <file-props>
<file-prop> == <propIdByte> <valueLengthByte> <propValueBytes>
<controlTrailerId> == [..bytes..] as defined above
<comments-start> == byte address in file for start of comments
<control-info> == sequence of control bytes and control text.
The last thing in the file is the <control-trailer>. Thus to read a file, first
check the <controlTrailerId> and read the <data-length>, which is also the byte
address at which <comments> start. We also read the saved <file-length> as a double check.
The saved <data-stamp> is a 4 byte checksum of the data part of the file.
The <data-length> lets us find the <comments-header>, where we double check that we
have a valid file by reading the <commentHeaderId> and the <comments-length>.
Then we go to the <control-header>, checking the <controlHeaderId> and reading
the <control-length>. Finally, we check that
<control-length>+<comments-length>+<data-length> = <file-length>.
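The validation steps just described might look roughly like this in Python. This is a sketch under stated assumptions, not the Tioga reader: read_layout and _read_header are hypothetical names, and the description above does not state how the lengths are stored, so the code assumes every length (in the trailer and in both headers) is a 4-byte big-endian unsigned integer. The id bytes and endSize come from the definitions.

```python
import struct

COMMENT_HEADER_ID = bytes([0o000, 0o000])
CONTROL_HEADER_ID = bytes([0o235, 0o312])
CONTROL_TRAILER_ID = bytes([0o205, 0o227])
FILE_ID_SIZE = 2
END_SIZE = FILE_ID_SIZE + 3 * 4  # trailer id plus three lengths


def _read_header(buf, offset, expected_id):
    """Check a 2-byte id at offset and return the length that follows it.

    Assumption: the length is a 4-byte big-endian unsigned integer.
    """
    header = buf[offset:offset + FILE_ID_SIZE + 4]
    if len(header) < FILE_ID_SIZE + 4 or header[:FILE_ID_SIZE] != expected_id:
        raise ValueError("bad header id at offset %d" % offset)
    (length,) = struct.unpack(">I", header[FILE_ID_SIZE:])
    return length


def read_layout(buf):
    """Validate the trailer and headers; return the section lengths."""
    if len(buf) < END_SIZE:
        raise ValueError("file too short to hold a trailer")
    trailer = buf[-END_SIZE:]
    if trailer[:FILE_ID_SIZE] != CONTROL_TRAILER_ID:
        raise ValueError("missing controlTrailerId")
    # Assumed order and width: file-props-length, data-length, file-length.
    props_len, data_len, file_len = struct.unpack(">III", trailer[FILE_ID_SIZE:])
    if file_len != len(buf):
        raise ValueError("saved file-length does not match actual size")
    # <comments> start right after <data>, so data-length is their offset.
    comments_len = _read_header(buf, data_len, COMMENT_HEADER_ID)
    control_len = _read_header(buf, data_len + comments_len, CONTROL_HEADER_ID)
    if data_len + comments_len + control_len != file_len:
        raise ValueError("section lengths do not add up to file-length")
    return {
        "file_props_length": props_len,
        "data_length": data_len,
        "comments_length": comments_len,
        "control_length": control_len,
    }
```

The sketch skips the <data-stamp> checksum, since the description above does not say where it is stored or how it is computed.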
The <data>, <comments>, and <control> are written node by node starting at the root.
For each node the following steps are taken:
1. Output a start node control byte of the appropriate sort.
either startNode, terminalTextNode, otherNode, otherNodeShort, or
something in the startNodeFirst..startNodeLast range or
in the terminalTextNodeFirst..terminalTextNodeLast range.
2. Output the properties, if any, of the node.
output the property name with prop or propShort
then output the property value
text for props is put in <control> rather than <data>
3. Output the contents of the node.
for a text node, first write the runs info, if any, and then the rope.
if it is a comment node, then the text goes in comments,
else it goes in data.
for other varieties of node, write otherNodeSpecs or otherNodeSpecsShort
followed by the text of the specifications.
4. Output the children of the node, if any, in the same manner
5. Output an endNode byte, unless this is a terminal text node
Following the end of the root node, there is an endOfFile byte.
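
The five steps above can be sketched in Python as a recursive walk. This is a simplified illustration, not the Tioga writer: Node, write_node and write_tree are hypothetical names, control bytes are shown as symbolic names instead of opcode values, and the format tables, runs/looks and "other" node varieties are left out. It also assumes every node, including the root, emits a rope or comment op.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str = ""
    comment: bool = False
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def write_node(node, data, comments, control):
    terminal = not node.children
    # Step 1: start-node control byte (terminal form if there are no children).
    control.append("terminalTextNode" if terminal else "startNode")
    # Step 2: properties; their text lives in <control>, not <data>.
    for name, value in node.props.items():
        control.append(("prop", name, value))
    # Step 3: contents; the text goes to <comments> or <data>, each followed
    # by a CR that is not counted in the length.
    if node.comment:
        control.append(("comment", len(node.text)))
        comments.append(node.text + "\r")
    else:
        control.append(("rope", len(node.text)))
        data.append(node.text + "\r")
    # Step 4: children, in the same manner.
    for child in node.children:
        write_node(child, data, comments, control)
    # Step 5: endNode, unless this was a terminal text node.
    if not terminal:
        control.append("endNode")


def write_tree(root):
    data, comments, control = [], [], []
    write_node(root, data, comments, control)
    control.append("endOfFile")  # follows the end of the root node
    return "".join(data), "".join(comments), control
```

For a root with one ordinary child "a" and one comment child "c", write_tree yields the data text "\ra\r", the comment text "c\r", and a control stream that ends with endNode followed by endOfFile.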