Towards an Interchange Standard for Editable Documents
by Jim Mitchell and Jim Horning
May 4, 1982 11:42 AM
File: Interdoc-1.5.bravo
Edited by Mitchell, September 18, 1981 7:16 PM, folded in JJH’s comments and altered universal ids, comments, tags.
Edited by Horning, May 4, 1982 1:48 PM, folded in Truth copy comments.
1.5. Introduction to the Interdoc Base Language
This section is intended to lead the reader through a set of examples, to show what the language looks like and how it is used to represent a number of commonly occurring features of editable documents. The examples purposely use rather long identifiers and lots of white space to make them more readable. In actual use, programs, not people, will generate and read scripts; names will tend to be short, and logically unneeded spaces and carriage returns will tend to be omitted.
1.5.1. Simple text as a document
The following script defines a document consisting of the string "The text of the main node of example 1.5.1"; no font, paragraph structure, or formatting information is supplied. This example will gradually be expanded to represent accurately figure 1.5.1. The numbers at the left margin do not form part of the script; they are used to refer to the various lines in the discussion below.
0  Interdoc/Interchange/1.0
1  {<The text of the main node of example 1.5.1>}
Line 0 is the header denoting version 1.0 of the interchange encoding. Line 1 is the entire body of this script: it contains a single node enclosed in {} which in turn contains a single string value enclosed in <>.
The text of the main node of example 1.5.1
The text of the first subnode of example 1.5.1
Example 1.5.1: A simple document
The next version of the example adds the tag TEXT$ to the node. The identifier TEXT is called a universal name (or atom), which is indicated by its being composed of all uppercase letters. Universal names have no definition within the base language (they are expected to be defined in Layers 2 and 3).
0  Interdoc/Interchange/1.0
1  {TEXT$
2  <The text of the main node of example 1.5.1>
3  }
A tag is denoted by placing "$" after a universal name. A node’s tags are strictly local (they are not inherited by other nodes in the script) and serve as "type information" about the node. The tag TEXT$ labels this node as one that can be viewed as textual data. Tags also create implicit indirections; see section 1.6.5.
0  Interdoc/Interchange/1.0
1  {PARAGRAPH$
2  leftMargin←3.25*inch rightMargin←5.0*inch
3  <The text of the main node of example 1.5.1>
4  }
This example shows how auxiliary information, such as margins, may be associated with a node of a script. The binding leftMargin←3.25*inch adds the attribute leftMargin to the node’s environment and binds the value of the expression 3.25*inch to it (inch is a constant conversion factor whose dimensions are meters/inch; meters are the standard Interdoc unit of distance). The bindings to leftMargin and rightMargin convey the fact that this node has margins for display. To denote the change in character of the node, we have tagged it as PARAGRAPH instead of TEXT. Figure 1.5.1 uses these margins for its first line of text.
0  Interdoc/Interchange/1.0
1  {PARAGRAPH$
2  leftMargin←3.25*inch rightMargin←5.0*inch
3  <The text of the main node of example 1.5.1>
4  {PARAGRAPH$ leftMargin←+0.5*inch
5  <The text of the first subnode of example 1.5.1>
6  }
7  }
We have further elaborated the example by nesting another text node in the primary one, with its text following the primary node’s text and with an indented leftMargin. The binding leftMargin←+0.5*inch is a contraction of leftMargin←leftMargin+0.5*inch. The right side of the binding is evaluated, and since there is as yet no binding in the inner node’s (lines 4–6) environment for leftMargin, it is looked up in the environment of the containing node (lines 1–3). The value of the right hand side expression is thus 3.75*inch. This value is then bound to the identifier leftMargin in the inner node’s environment. Since no value is bound to rightMargin in the inner node’s environment, it will have the same rightMargin as its parent node.
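The lookup rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class Env, its method names, and the numeric value chosen for INCH are ours and are not part of the Interdoc base language.

```python
INCH = 0.0254  # assumed conversion constant: meters per inch


class Env:
    """One node's environment; lookups fall back to the containing node."""

    def __init__(self, parent=None):
        self.parent = parent
        self.bindings = {}

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise KeyError(name)

    def bind(self, name, value):
        """Models name←value: the binding is local to this node."""
        self.bindings[name] = value

    def bind_relative(self, name, delta):
        """Models name←+delta, a contraction of name←name+delta."""
        self.bind(name, self.lookup(name) + delta)


# The containing node (lines 1-3 of the script above).
outer = Env()
outer.bind("leftMargin", 3.25 * INCH)
outer.bind("rightMargin", 5.0 * INCH)

# The inner node (lines 4-6): leftMargin is looked up in the parent first.
inner = Env(parent=outer)
inner.bind_relative("leftMargin", 0.5 * INCH)
```

The inner node ends up with its own leftMargin binding, while rightMargin is still found only in the containing node’s environment.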
0  Interdoc/Interchange/1.0
1  p=’PARAGRAPH$ leftMargin←3.25*inch rightMargin←6.0*inch’
2  {p rightMargin←5.0*inch
3  <The text of the main node of example 1.5.1>
4  {p leftMargin←+0.5*inch
5  <The text of the first subnode of example 1.5.1>
6  }
7  }
One can also define an abbreviation by binding a sequence of unevaluated expressions to an identifier and subsequently using the identifier to cause those expressions to be evaluated at the point of invocation. This example binds the quoted expression ’PARAGRAPH$ leftMargin←3.25*inch rightMargin←6.0*inch’ to the identifier p. The binding operator is = instead of ← to denote the fact that this binding may not be superseded in this node or any of its subnodes; for this reason such a binding is called a definition. When p is invoked in lines 2 and 4, the quoted expression replaces the invocation and is evaluated there.
Invoking p places the tag PARAGRAPH$ on the node and sets the leftMargin to 3.25*inch and the rightMargin to 6.0*inch. In line 2, the rightMargin is then rebound to 5.0*inch, overriding the default binding created by invoking p. Similarly, the binding for leftMargin in line 4 overrides the one produced by invoking p, so the inner node’s leftMargin is 3.75*inch and its rightMargin is 6.0*inch.
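The behaviour of a definition such as p can be mimicked by storing the quoted expression as a list of deferred steps that are run against whichever node invokes it. The sketch below is an assumption-laden model, not the Interdoc evaluator: Node, define, bind, invoke, and the value of INCH are hypothetical names introduced for illustration.

```python
INCH = 0.0254  # assumed conversion constant: meters per inch


class Node:
    def __init__(self, parent=None):
        self.parent = parent
        self.tags = set()
        self.bindings = {}     # "←" bindings: may be superseded
        self.definitions = {}  # "=" bindings: fixed within their scope

    def _defined(self, name):
        node = self
        while node is not None:
            if name in node.definitions:
                return True
            node = node.parent
        return False

    def lookup(self, name):
        node = self
        while node is not None:
            if name in node.definitions:
                return node.definitions[name]
            if name in node.bindings:
                return node.bindings[name]
            node = node.parent
        raise KeyError(name)

    def define(self, name, value):
        if self._defined(name):
            raise ValueError(name + " is a definition and may not be superseded")
        self.definitions[name] = value

    def bind(self, name, value):
        if self._defined(name):
            raise ValueError(name + " is a definition and may not be superseded")
        self.bindings[name] = value

    def invoke(self, name):
        # The quoted expression is evaluated at the point of invocation.
        for step in self.lookup(name):
            step(self)


# p = 'PARAGRAPH$ leftMargin←3.25*inch rightMargin←6.0*inch'
P = [
    lambda n: n.tags.add("PARAGRAPH"),
    lambda n: n.bind("leftMargin", 3.25 * INCH),
    lambda n: n.bind("rightMargin", 6.0 * INCH),
]

outermost = Node()            # environment enclosing the script's root node
outermost.define("p", P)

main = Node(outermost)        # line 2: {p rightMargin←5.0*inch
main.invoke("p")
main.bind("rightMargin", 5.0 * INCH)

sub = Node(main)              # line 4: {p leftMargin←+0.5*inch
sub.invoke("p")
sub.bind("leftMargin", sub.lookup("leftMargin") + 0.5 * INCH)
```

Because invoking p rebinds both margins inside the subnode, the subnode’s rightMargin comes from p rather than from the 5.0*inch override in its parent.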
An identifier can also be bound to an environment value as a convenient record-like manner of naming a set of related bindings. For example, a font might be defined as follows (a more complete definition is given later in section 1.6.3):
font = [ | family←TIMES size←10*pt face←[ | weight←NORMAL style←ROMAN slant←NIL] ]
This defines font to be the environment formed by taking the empty or Null environment and altering it according to the series of bindings following the initial "[ |". In this case font is an environment having bindings for three attributes: family, size, and face. face is itself bound to an environment (with attributes weight, style, and slant). Since font is bound using "=", it cannot directly be changed in its scope, although its components can be since they are bound using "←". The set of default bindings in font specifies a normal weight (non-bold), non-italic Times Roman 10 point font.
We can incorporate this font definition in the example and then use it to indicate that the word "first" in the subnode should be in italics:
0  Interdoc/Interchange/1.0
1  p=’PARAGRAPH$ leftMargin←3.25*inch rightMargin←6.0*inch’
2  font = [ | family←TIMES size←10*pt face←[ | weight←NORMAL style←ROMAN slant←NIL] ]
3  {p rightMargin←5.0*inch
4  <The text of the main node of example 1.5.1>
5  {p leftMargin←+.5*inch
6  <The text of the >
7  font.face.slant←ITALIC <first> font.face.slant←NIL
8  < subnode of example 1.5.1>
9  }
10 }
Bindings affect node contents to their right, so "first" will be italic, while " subnode of example 1.5.1" will be non-italic due to the binding immediately preceding it. If we expected to switch between italics and non-italics frequently, it might be profitable to introduce abbreviations to shorten what must appear. For example, in the scope of the definition
l=[ | i=’font.face.slant←ITALIC’ nI=’font.face.slant←NIL’]
line 7 could be abbreviated
l.i<first>l.nI
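The rule that bindings affect node contents to their right amounts to a single left-to-right pass over a node’s contents. The following sketch assumes a flattened representation of the subnode (a list of "bind" and "text" items) that we have invented for illustration; it is not an Interdoc data structure.

```python
def styled_runs(items):
    """Walk a node's contents left to right, pairing each string with the
    slant in force when it is reached.

    items is a sequence of ("bind", attribute, value) or ("text", string).
    """
    env = {"font.face.slant": "NIL"}  # default taken from the font definition
    runs = []
    for item in items:
        if item[0] == "bind":
            _, attribute, value = item
            env[attribute] = value
        else:
            runs.append((item[1], env["font.face.slant"]))
    return runs


# Lines 6-8 of the script above.
SUBNODE = [
    ("text", "The text of the "),
    ("bind", "font.face.slant", "ITALIC"),
    ("text", "first"),
    ("bind", "font.face.slant", "NIL"),
    ("text", " subnode of example 1.5.1"),
]

runs = styled_runs(SUBNODE)
```

Only the string that follows the ITALIC binding and precedes the NIL binding is paired with the italic slant.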
1.6. Further Examples
This section gives some more realistic examples of the use of the Interdoc language and explores the issues of making sets of standard definitions for use in scripts.
1.6.1. A Laurel Message
Here is a possible Interdoc transcription of a Laurel message:
0  Interdoc/Interchange/1.0 -- standard heading --
1  {LAURELMSG$ -- tag for a Laurel document --
2  Sub=’PARAGRAPH$ leftMargin←1.0*inch rightMargin←7.5*inch’
3  justified←F -- "←" means overridable default --
4  font.family←TIMES font.size←10
5  leading.x←1
6  leading.y←1 -- overridable default leadings --
7  heading@! -- declare a label --
8  laurelInfo = -- Laurel information for easy access; none is changeable --
9  (Heading.time@ Heading.from@ Heading.subject@ Heading.to@ Heading.cc@)
10 {<Date: > {Heading.time! <18 June 1981 9:18 am PDT (Thursday)>}
11 <From: > {Heading.from! <Mitchell.PA> AUTHENTICATED$}
12 <Subject: > {Heading.subject! <A Sample Document Syntax>}
13 <To: > {Heading.to! <Horning.PA>}
14 <cc: > {Heading.cc! <Mitchell, Interdoc.PA>}}
15 leading.y←6 -- override outer y leading --
16 {<text of paragraph1>} -- node which is a paragraph --
17 {<text of paragraph2>}
18 {<text of paragraph3>}
19 }
Line 1 tags this document (by tagging its root node) as a Laurel message, and line 2 tags its subnodes (starting on lines 10, 16, 17, and 18) as paragraphs with default margins. Lines 3–6 bind some other attributes, likely to be relevant to paragraphs. Line 7 declares the main link identifier heading, and lines 8–9 bind to laurelInfo a vector of source links whose targets are the parts of the document of interest for mail transport. Lines 10–14 have similar structures: each consists of a string followed by a node containing a target link for the label heading and text for that Laurel "field." Line 11 is additionally tagged as AUTHENTICATED. Lines 16–18 contain paragraphs constituting the body of the message.
Alternatively, the external environment might well contain a definition of laurel60 that establishes a suitable environment for a Laurel 6.0 document:
1  laurel60= ’
2  time@! from@! subject@! to@! bodyNodes@! cc@!
3  LAURELMSG$
4  cr = <#13#> tab = <#9#>
5  p=’PARAGRAPH$ leftMargin←1.0*inch rightMargin←7.5*inch’
6  justified←F
7  font.family=TIMES font.size=10
8  margins.left←2540 margins.right←19050
9  leading.x←1 leading.y←1 -- overridable default leadings --
10 printForm=
11 ’{p <Date: > time@ tab
12 <From: > from@ cr
13 <Subject: > subject@ cr
14 <To: > to@
15 leading.y←6
16 bodyNodes@
17 <cc: > cc@
18 }’
19 heading = ’LAURELHEADING$ Sub←’TEXT$ LAURELFIELD$’ ’
20 body = ’Sub←’p bodyNodes!’ ’
21 ’
One advantage of using source links for the "bodies" of the To:, From:, etc. fields (lines 11–14, 17) is that they can represent sets of nodes as well as single nodes.
Now the Laurel document would be described by the following script:
22 Interdoc/Interchange/1.0 -- standard heading --
23 {laurel60% -- invoke Laurel 6.0 definitions --
24 {heading% -- invoke heading style --
25 {time! <18 June 1981 9:18 am PDT (Thursday)>}
26 {from! AUTHENTICATED$ <Mitchell.PA>}
27 {subject! <A Sample Document Syntax>}
28 {to! <Horning.PA>}
29 {cc! <Mitchell, Interdoc.PA>}
30 }
31 {body% -- invoke body style --
32 {<text of paragraph1>}
33 {<text of paragraph2>}
34 {<text of paragraph3>}
35 }
36 }
Invoking laurel60 in line 23 introduces the quoted expressions heading and body into the root node’s environment, tags it as LAURELMSG and declares the labels time, from, etc. It also acquires a definition for a print form, which could be used to format the message for sending to a printer. The "%" (indirection) operator indicates that this is intentional structure, to be preserved by each rendition, rather than merely an abbreviation. Thus the message heading and body should "see" the effects of any future changes made to laurel60, by editing its definition. By contrast, p is used as an abbreviation; when the script is rendered, its value may safely be copied at each use.
Look at the definition of heading (line 19): the right side is a quoted expression sequence. The first expression of the sequence produces the tag LAURELHEADING$ and the second binds the quoted expression ’TEXT$ LAURELFIELD$’ to Sub. As a result, each subnode of the one beginning on line 24 will be initialized by invoking Sub from its containing node, which gives each the tags TEXT$ and LAURELFIELD$.
Similarly, the definition of body (line 20) defines Sub, and the nodes on lines 32–34 will each be initialized by invoking p and having the target link bodyNodes placed on them. Labelling the set of body nodes this way means that the source link, bodyNodes@, in printForm (line 16) denotes the entire sequence of body nodes, in left-to-right depth-first tree order.
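To make the meaning of a source link such as bodyNodes@ concrete, the sketch below resolves a label to the sequence of nodes carrying the matching target link, visited in left-to-right depth-first order. The Node class and the targets function are hypothetical helpers of our own; the labels and strings are taken from the Laurel script above.

```python
class Node:
    def __init__(self, text="", labels=(), children=()):
        self.text = text
        self.labels = set(labels)     # target links placed on this node
        self.children = list(children)


def targets(root, label):
    """Return every node carrying the target link `label`, in
    left-to-right depth-first tree order."""
    found = []

    def visit(node):
        if label in node.labels:
            found.append(node)
        for child in node.children:
            visit(child)

    visit(root)
    return found


heading = Node(children=[
    Node("Mitchell.PA", labels=["from"]),
    Node("Horning.PA", labels=["to"]),
])
body = Node(children=[
    Node("text of paragraph1", labels=["bodyNodes"]),
    Node("text of paragraph2", labels=["bodyNodes"]),
    Node("text of paragraph3", labels=["bodyNodes"]),
])
message = Node(children=[heading, body])

body_texts = [n.text for n in targets(message, "bodyNodes")]
to_texts = [n.text for n in targets(message, "to")]
```

A label may thus resolve to a single node (as for to) or to a whole set of nodes (as for bodyNodes), which is what lets printForm treat both cases uniformly.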
1.6.2. A page of a Star document
This example is taken from page 71 of the Star Functional Specification and shows one page of a paginated document with a diagram and a footnote (we recommend that you have that page in front of you when analyzing this transcription):
-- pages 1 .. 6 supposedly precede this one, and Sub would be set to PAGE$ to take care of page heading, page number, etc. --
{pg.a7!
Sub←’PARAGRAPH$’
{<Many of these conclusions are based on prior experience>
{fn.n1!-- just a unique label: fn! introduced in PAGE --
FOOTNOTE$
<See the 1970 report titled "Organizational Changes and Sales Margin" and other documents referenced in that document. Further reports are available if you need them.>
}
< which has shown our techniques to be valid. Other data can be collected by future changes to your accounting and billing packages, which will allow us to perform even better analyses and lead to better problem discovery and correction.>
}
{<The results of the sales analysis suggest that certain organizational changes can improve the overall efficiency of the operation. The March figures, in particular, bear this out. You will note below a suggested change that we feel will correct the problems noted in the analysis above.>
}
Sub←’FRAME$’-- change to subnode tag FRAME --
{Alignment.horizontally←FlushLeft Alignment.vertically←Floating
height←2.8*inch width←3.67*inch
edges.expandingRightEdge←T
border←dots1
-- change to default subnode environment Rectangle with solid, double width outline --
Sub←’RECTANGLE$ lineType.width←2 lineType.style←solid’
rect@!
-- declare label class to be used below --
{rect.a1! UpperLeft←(.0254 .07)
shading←7 height←.01 width←.027{Title <Headquarters>} }
{rect.a2! UpperLeft←(.073 .015)height←.01 width←.018{Title <Staff Support>} }
height←.013 -- attribute value shared by following subnodes --
{rect.a3! UpperLeft←(.02 .03)width←.025{Title <Development>} }
{rect.a4! UpperLeft←(.02 .03)width←.028{Title <Manufacturing>} }
{rect.a5! UpperLeft←(.042 .055)width←.016{Title <West Coast>} }
{rect.a6! UpperLeft←(.067 .055)width←.016{Title <East Coast>} }
-- default subnode environment is LINE with solid, double width outline --
Sub←’LINE$ lineType.width←2 lineType.style←solid’
ln@!
{ln.out1!rect.a1@ln.in34@}
{ln.out2!
rect.a2@ln.out1@}
{ln.in3!
ln.in34@rect.a3@}
{ln.in4!
ln.in34@rect.a4@}
{ln.in34!
ln.in3@ln.in4@}
{ln.out4!
rect.a4@ln.in56@}
{ln.in56!
ln.in5@ln.in6@}
{ln.in5!
ln.in56@rect.a5@}
{ln.in6!
ln.in56@rect.a6@}
}-- end of Frame1 --
Sub←’PARAGRAPH$’
-- restore default subnode initialization to PARAGRAPH --
{<The process of switching to this new organization will not be an easy one. However, the reports seem to suggest many reasons why it should not be postponed. In particular, the separation of Manufacturing from Development should have significant impact.>}
{<Also, we feel strongly that merging East and West Coast Development will help. As we have suggested in past reports, there has always been considerable replication of effort due to this geographic separation. You will recall the events leading up to the initial contract with our firm.>}
}-- end of page --
1.6.3. Some Star property sheets
Here are a few of the definitions invoked in the above example (these were derived from page 148 of the Star Functional Specification). Some of them simply give default values for various attributes; some, like default.font, define a collection of related attributes as an environment; and most are quoted expression sequences for providing abbreviations or "decorating" nodes with tags and their environments with relevant attributes. These definitions would exist in the external environment for Star-produced scripts. They would be made accessible to other editors as part of the definition of XEROX.STAR.VERSION1.
1.6.3.1. Font-related defaults and definitions
baseline←0 -- the base line for characters --
underlined←F -- whether or not text in node is to be underlined --
strikeOut←F -- whether or not text in node is to have strike-out line through it --
-- There is no rhyme and little reason behind the names of type fonts. The following definition is intended to provide enough choice, using standard "terms", to name any existing font in an arbitrary font catalog (of course, it doesn’t, but perhaps it is close enough) --
default.font = [ | -- Definition --
  family←TIMES -- a font family name --
  face←[ | -- Definition --
    weight←NORMAL -- In (EXTRALIGHT, LIGHT, BOOK, NORMAL, MEDIUM, DEMIBOLD, SEMIBOLD, BOLD, EXTRABOLD, ULTRABOLD, HEAVY, EXTRAHEAVY, BLACK, GROTESQUE) --
    lineType←SOLID -- In (SOLID, INLINE, OPEN, OUTLINE, DISPLAY, SHADED) --
    proportions←NORMAL -- In (NORMAL, CONDENSED, EXPANDED, EXTENDED, WIDE, BROAD, ELONGATED) --
    style←ROMAN -- In (ROMAN, GOTHIC, EGYPTIAN, CURSIVE, SCRIPT) --
    slant←NIL -- In (NIL, ITALIC, OBLIQUE) --
    swash←F -- T => use swash capitals --
    lowercase←T -- T => use lowercase letters --
    uppercase←T -- T => use uppercase letters --
    smallCaps←F -- T => use small capitals --
  ]
  size←10*pt -- distance --
]
-- some useful font shorthands: --
Helvetica = ’font ← [default.font% | family←HELVETICA]’
Italic = ’font.face.slant←ITALIC’
Bold = ’font.face.weight←BOLD’
Helvetica10BI = ’Helvetica font.size←10*pt Bold Italic’
1.6.3.2. Page-related definitions
PAGE = ’fn@! fnCount:0 -- introduce link class and count for footnotes on this page --
  PageControl←NEWPAGE -- In (CONTINUE, NEWPAGE, NEWLEFTPAGE, NEWRIGHTPAGE, NEWCOLUMN) --
  height←11*inch width←8.5*inch
  margins←[ |
    left←1*inch right←1*inch binding←0.5*inch
    top←1*inch bottom←1*inch
  ]
  numberOfColumns←2
  spaceBetweenColumns←.5*inch’
FOOTNOTE = ’fnCount:=+1 font.size←8*pt FootnoteRef%’
FootnoteRef = ’{FOOTREF$ baseline←+5*pt fnCount}’ -- raise 5 pts --
1.6.3.3. Paragraph-related definitions
Tab = [ |
position←0
type←LEFT-- In (LEFT, CENTERED, RIGHT, DECIMAL) --
]
MakeTabs=’n←0 tabs←(RecursiveMakeTab[Value])’
RecursiveMakeTab=’(EQ[Value 0] | NIL | n←+.25*inch [Tab | position←n ] RecursiveMakeTab[Value-1])’
PARAGRAPH = ’Indent = [ | Left←0.0 Right←0.0] -- distance --
  Alignment←FLUSHLEFT -- In (FLUSHLEFT, FLUSHRIGHT, BOTH, CENTERED) --
  Justified←F
  leading←[leading | between←1*pt above←12*pt below←0]
  charStyle←[ |
    Normal←’font←default.font’
    Emphasis1←’font←default.font Italic’
    Emphasis2←’font←default.font Bold’
  ]
  Hyphenation←F
  KeepOn←NIL -- In (NIL, SamePageAsNextParagraph) --
  MakeTabs[8] -- binds tabs to a sequence of 8 tabs (0, .25 inch, .50 inch, . . .) --
  charStyle.Normal -- initializes to normal style --
’
1.6.3.4. Frame, rectangle, and line definitions
Def.UpperLeft = ’UpperLeft←(0.0 0.0)’ -- Def is just a convenient environment in which to put useful auxiliary definitions --
Def.lineType = ’
  lineType←[ |
    Visible←T
    Width←1
    Style←SOLID] -- In (SOLID, DOT, DASH, DOTDASH, DOUBLE, . . .) --
’
Def.Shading = ’Shading←0’
Def.Box = ’Def.UpperLeft Def.lineType Def.Shading’
Frame = ’FRAME$ Def.Box’
Rectangle = ’RECTANGLE$ Def.Box
  Constraint←MagnifyOnly -- In (NIL MagnifyOnly) --
’
Def.LineEnd = ’
  LineEnd←(LeftUpper←Flush RightLower←Flush) -- In (Flush Round Square arrow1 arrow2 arrow3) --
’
Line = ’LINE$ constraint←FixedAngle Def.lineType Def.LineEnd’
Title = ’CAPTION$ Paragraph’
1.6.4. Using links
Links are intended to provide the means for associating nodes in non-hierarchical ways. They can be used to refer to figures, examples, and tables; to describe tables of contents; to denote index items; to keep lists; and so on.
1.6.4.1. References to figures
The following outlines how the labelling facilities and persistent bindings can be used to generate references to (source links for) a figure whose number may not be known at the point of reference. The identifier n5 is assumed to have been generated by the program that produced the script and is assumed to be unique over the target labels with naming prefix "figures." in the script.
figures@! figCount: 0 -- should appear in a script’s root node --
makeFigureNum = ’HIDDEN$ figCount:=+1 figCount’
{. . . figures.n5@ . . .} -- ref to node with label figures.n5! --
{ . . . {figures.n5! makeFigureNum} . . .} -- a hidden node holding the figure number --
The node in which the figure number for figure n5 is defined contains a tag, HIDDEN, which means that the node is not to be considered a part of the dominant structure for display purposes even though it is part of it. The node’s sole content is the value of figCount after it has been (persistently) incremented by 1. Because figCount is bound with ":=", the scope of the binding extends beyond the end of the node in which it occurs.
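The effect of the persistent counter can be sketched as a two-pass resolution: one pass numbers the figure targets in document order, and a second pass replaces each reference by the number of its target, even when the reference precedes the figure. The item representation, the function names, and the label figures.n2 are assumptions made for this illustration.

```python
def number_figures(items):
    """First pass: each target increments the persistent counter
    (modelling figCount:=+1) and records its own figure number."""
    fig_count = 0
    numbers = {}
    for kind, label in items:
        if kind == "target":
            fig_count += 1
            numbers[label] = fig_count
    return numbers


def resolve_references(items, numbers):
    """Second pass: replace each source link by its target's number."""
    return [numbers[label] for kind, label in items if kind == "ref"]


# Document order; the reference to figures.n5 precedes the figure itself.
ITEMS = [
    ("target", "figures.n2"),  # hypothetical earlier figure
    ("ref", "figures.n5"),
    ("target", "figures.n5"),
    ("ref", "figures.n2"),
]

numbers = number_figures(ITEMS)
refs = resolve_references(ITEMS, numbers)
```

Because the count persists beyond the node that increments it, each figure receives the next number regardless of how deeply its hidden node is nested.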
1.6.4.2. Collections of index items
Assume that the word "framble" is to be considered an index item in certain places where it occurs in a document. The link class indexable@! should be introduced at the root of the document, and each to-be-indexed occurrence of "framble" in a string, e.g., <When a framble is found, it . . .>, should be replaced by the sequence <When a > framble < is found, it . . .>. Somewhere in the script within the scope of the declaration of indexable, at the root of a subtree containing all the uses of framble, should be the following definition:
framble=’{HIDDEN$ indexable.framble! pageNumber} <framble>’
Invoking framble results in the appearance of a hidden node containing the current page number (assumed to be held in the attribute pageNumber) and labelled as being in the set of target links indexable and indexable.framble. The index for the document might then contain the following entry for "framble":
{INDEXENTRY$ <framble> indexable.framble@}
This entry contains the minimal information needed to generate the sequence of page numbers corresponding to indexable occurrences of framble. If some occurrences are considered primary and some secondary, then these mechanisms can be generalized to have framble defined as
framble=[ | primary=’{HIDDEN$ indexable.framble.primary! pageNumber} <framble>’
secondary=’{HIDDEN$ indexable.framble.secondary! pageNumber} <framble>’]
Primary references are denoted in the script as framble.primary and secondary ones as framble.secondary. Similarly, the index entry takes the form:
{INDEXENTRY$ <framble> indexable.framble.primary@ indexable.framble.secondary@}
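One way to picture how an index entry gathers its page numbers is to treat every hidden node as a (label, page number) pair and to file it under its label and each enclosing prefix of that label. The function build_index and the page numbers below are invented for illustration; only the label names come from the definitions above.

```python
from collections import defaultdict


def build_index(occurrences):
    """occurrences: (target label, page number) pairs in document order.

    A node labelled indexable.framble.primary is also a member of the
    sets indexable.framble and indexable.
    """
    index = defaultdict(list)
    for label, page in occurrences:
        parts = label.split(".")
        for i in range(1, len(parts) + 1):
            index[".".join(parts[:i])].append(page)
    return index


# Hypothetical page numbers held by the hidden nodes.
OCCURRENCES = [
    ("indexable.framble.primary", 3),
    ("indexable.framble.secondary", 7),
    ("indexable.framble.primary", 12),
]

index = build_index(OCCURRENCES)
primary_pages = index["indexable.framble.primary"]
secondary_pages = index["indexable.framble.secondary"]
```

The two source links in the index entry then correspond to the two page lists, which a formatter could render differently for primary and secondary references.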
1.6.5. Using indirections
Indirections provide a way to centralize (and delay) the binding of information within a document. They can be used to share information that is intended to be consistent.
1.6.5.1. Styles and style sheets
Documents generally follow stylistic conventions for presenting different kinds of content. Major headings, for instance, may be in bold face with twelve points of extra leading, and minor headings in italic with six points of extra leading. If this information is explicitly bound for each piece of content, then a stylistic change may require locating and changing all the relevant bindings (note that italic is also likely to be used for other purposes, such as emphasis). If, however, the binding is done indirectly, through a style, a single change will be effective for all places where the style is referenced. Note that each occurrence of a tag implicitly establishes an indirection through the same identifier; this is convenient in associating styles with semantically meaningful tags. For example:
MajorHeading = ’PARAGRAPH$ Bold leading←+12’
MinorHeading = ’PARAGRAPH$ Italic leading←+6’
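The benefit of binding through a style can be illustrated by having each heading hold only the name of its style and resolving that name when the heading is rendered. This is a sketch under our own assumptions: the STYLES table, the Heading class, the attribute names, and the heading texts are hypothetical, with values mirroring the two definitions above.

```python
# Style sheet: the single place where each style is defined.
STYLES = {
    "MajorHeading": {"tag": "PARAGRAPH", "face": "BOLD", "extra_leading": 12},
    "MinorHeading": {"tag": "PARAGRAPH", "face": "ITALIC", "extra_leading": 6},
}


class Heading:
    def __init__(self, text, style_name):
        self.text = text
        self.style_name = style_name  # an indirection, not a copy

    def resolved(self, styles):
        """Look the style up at rendering time."""
        return styles[self.style_name]


headings = [
    Heading("Further Examples", "MajorHeading"),
    Heading("Using links", "MajorHeading"),
    Heading("References to figures", "MinorHeading"),
]

before = [h.resolved(STYLES)["extra_leading"] for h in headings]

# One change to the style sheet reaches every heading that refers to it.
STYLES["MajorHeading"]["extra_leading"] = 18

after = [h.resolved(STYLES)["extra_leading"] for h in headings]
```

Had each heading copied the bindings instead, the same change would have required editing every major heading individually.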
1.6.5.2. Technical terms
Terminology may be undergoing change while a document is in production. For example, the previous version of this document used "mark" for what is now called "tag." One way to defer decisions on terminology, while ensuring that each version of the document is self-consistent, is to use an indirect reference for each occurrence of a term that may have to be rebound later.
HISTORY LOG
Edited by Mitchell, September 1, 1981 3:12 PM, added first version of glossary
Edited by Mitchell, September 7, 1981 2:11 PM, wrote parts of introduction
Edited by Mitchell, September 10, 1981 10:14 AM, added Tab def to Star property sheets
Edited by Mitchell, September 14, 1981 9:54 AM, renumbered chapters and did minor edits
Edited by Mitchell, September 17, 1981 1:37 PM, folding in JJH’s edits.
Edited by Mitchell, September 18, 1981 12:45 AM, added considerable annotation of examples.
Edited by Horning, May 4, 1982 12:30 PM, Fold in Truth Copy edits