Towards an Interchange Standard for Editable Documents
by Jim Mitchell and Jim Horning
September 18, 1981 7:30 PM
File: Interdoc-3ff.bravo
Edited by Mitchell, September 18, 1981 12:57 AM, folded in JJH’s edits.
Edited by Horning, May 4, 1982 4:50 PM, folded in Truth copy edits.
3. Higher-Level Issues
3.1. Standard and Editor-Specific Transcriptions:
For documents expressed in the base language, we need a two-level structure that both (a) makes them interchangeable among different editors and (b) retains information of special significance to a specific editor. We call (a) the interchange standard information, or standard information, and (b) editor-specific information.
Basically, an editor X is free to couch properties in its own terms, which can make it easy for X to consume an Interdoc script that it produced itself, but it must provide a set of mappings that transform those properties into the interchange standard. The recommended method for doing this is to invoke the editor’s name as the very first item in the root node of any X-specific subtree. The rules for inheritance of properties mean that often only the root node of a document will need to have this property, but there is nothing wrong with different nodes being in different editor-specific terms, provided they invoke the appropriate editor properties.
Now, to be a valid standard script, the document must have the definition of the name X placed in the script itself. (There is nothing wrong with keeping editor-specific standard mappings in a library of some sort to avoid having copies of them in each script.)
When X parses an X-specific script, it will use its X-specific attributes and never invoke the mappings from X-specific information to standard terms; i.e., it can use a null definition for the name X. However, when such a document is interpreted by some other editor Y, any time it tries to access a standard name, the mapping from that name to the corresponding expression in terms of the X-specific values in the script will have been provided by the definition of X. What guarantee is there that this can always be done?
It is worth noting first that we are speaking here of a script being rendered for an editor, rather than produced. Consequently, it will never be necessary to access standard names in left-hand contexts; i.e., to do bindings that are not part of the script in order to interpret it. It may, however, need to access the components of environments in order to render the script into its private format. These are always values in right-hand side contexts, and must be computed in terms of the X-specific information that X put in the script. We can examine this issue on a case-by-case basis. Below is a list of examples of possible editor-specific uses of the base language and the mappings that would allow another editor to treat the document in standard terms:
Symbolic values used instead of numbers: supply standard values for the symbolic values:
Standard:
leading.between ← 1*pt -- some numeric value --
Editor-specific:
leading.betweenLines ← single
leading.above ← double
mapping:
single = 2*pt
double = 4*pt
Different names used for standard names: supply a binding to the standard name from the editor-specific name using a quoted expression so that it is only evaluated when needed in a right-hand context:
Standard:
leading[above←10*pt between←2*pt below←0*pt ]
Editor-specific:
Space [BetweenLines←single BeforePara←double AfterPara←single]
mapping:
leading.between = ’Space.BetweenLines’
leading.above = ’Space.BeforePara’
leading.below = ’Space.AfterPara’
Different concepts used for standard ones: supply a binding to the standard attribute names from the editor-specific concepts using quoted expressions so that they are only evaluated when needed in right-hand contexts:
Standard:
leading ← [above←10*pt between←2*pt below←0*pt ]
Editor-specific:
Spacing ← [fontSize←10 on←14 leading←1] -- all units assumed to be pts --
mapping:
leading.between = ’pt*Spacing.on - Spacing.fontSize’ -- value is in pts --
leading.above = ’pt * Spacing.leading + Spacing.on - Spacing.fontSize’
leading.below = 0
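The three kinds of mapping listed above can be sketched in Python. This is only an illustration under stated assumptions: environments are modelled as plain dictionaries keyed by dotted names, a quoted expression is modelled as a function of the environment in which it is invoked, and `lookup` is a hypothetical helper, not part of Interdoc. The third case assumes that the between-line leading is the line pitch minus the font size, with the result scaled by pt.

```python
# Sketch only: dict environments and function-valued "quoted expressions"
# are modelling assumptions, not Interdoc constructs.

PT = 0.013836 * 0.0254  # one point in meters, from the standard units


def lookup(env, name):
    """Read a name in a right-hand context, evaluating quoted expressions on demand."""
    value = env[name]
    return value(env) if callable(value) else value


# Case 1: symbolic values; the mapping defines them as standard numbers.
symbolic = {"single": 2 * PT, "double": 4 * PT}
symbolic["leading.betweenLines"] = symbolic["single"]
symbolic["leading.above"] = symbolic["double"]

# Case 2: different names; each standard name is bound to a quoted expression.
renamed = {
    "Space.BetweenLines": 2 * PT,
    "Space.BeforePara": 4 * PT,
    "Space.AfterPara": 2 * PT,
    "leading.between": lambda env: lookup(env, "Space.BetweenLines"),
    "leading.above": lambda env: lookup(env, "Space.BeforePara"),
    "leading.below": lambda env: lookup(env, "Space.AfterPara"),
}

# Case 3: different concepts; standard values are computed from Spacing.
concepts = {
    "Spacing.fontSize": 10,
    "Spacing.on": 14,
    "Spacing.leading": 1,
    "leading.between": lambda env: PT * (env["Spacing.on"] - env["Spacing.fontSize"]),
    "leading.above": lambda env: PT * (
        env["Spacing.leading"] + env["Spacing.on"] - env["Spacing.fontSize"]
    ),
    "leading.below": 0,
}

# Another editor reads only the standard names; values are printed in points.
for env in (renamed, concepts):
    print([lookup(env, "leading." + part) / PT for part in ("above", "between", "below")])
```

Because the quoted expressions are evaluated only at lookup time, the editor that produced the script never pays for them, and a later rebinding of an editor-specific value is reflected automatically in the standard name.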
In general, one can use the facilities of the base language to write essentially arbitrary programs that, by being quoted and bound to a standard identifier, cause the appropriate value to be computed from the editor-specific information put in the document by the editor that produced it. Moreover, the mappings provided by an editor can be overridden in any subtree of the document. An editor X that does not "understand" some subtree of a document produced by another editor Y can therefore simply leave that subtree intact when producing an edited version of the original script, except that it must ensure that the first expression in that subtree’s root node is an invocation of "Y", which will cause Y’s editor-specific mappings to obtain in that subtree.
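The rule for leaving a subtree intact can be sketched as a tree walk. Everything here is a hypothetical model rather than Interdoc itself: a `Node` holds a list of string items plus child nodes, an editor invocation is simply an editor name appearing as the first item, and the `understands` and `rewrite` callbacks stand in for whatever a real editor would do.

```python
from dataclasses import dataclass, field

EDITORS = {"X", "Y"}  # hypothetical editor names


@dataclass
class Node:
    items: list = field(default_factory=list)     # items[0] may be an editor invocation
    children: list = field(default_factory=list)


def reissue(node, me, in_owner, out_owner, understands, rewrite):
    """Return the edited script produced by editor `me`.

    in_owner:  editor whose terms were in force here in the original script
    out_owner: editor whose terms are in force here in the new script
    """
    explicit = bool(node.items) and node.items[0] in EDITORS
    owner = node.items[0] if explicit else in_owner
    if not understands(node):
        # Leave the subtree intact, but make sure its root invokes its owner.
        if explicit or owner == out_owner:
            return node
        return Node([owner, *node.items], node.children)
    body = node.items[1:] if explicit else node.items
    head = [] if out_owner == me else [me]       # invoke `me` once; children inherit it
    kids = [reissue(c, me, owner, me, understands, rewrite) for c in node.children]
    return Node([*head, *rewrite(body)], kids)


# A script produced by Y; editor X understands everything except the table node.
doc = Node(["Y", "title"], [Node(["para"]), Node(["table"])])
edited = reissue(
    doc, "X", None, None,
    understands=lambda n: "table" not in n.items,
    rewrite=lambda items: [s.upper() for s in items],
)
print(edited)
```

In this example the table node originally inherited Y from the root; once X rewrites the root in its own terms, the untouched table node gains an explicit invocation of Y as its first item.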
3.2. Standard External Environment
It will be important to provide for a standard external environment for rendering scripts so that standard definitions need not be carried along with every script that uses them. The external environment will contain definitions for units (inch, pt, etc.), various "styles" (para, figure, etc.), and useful abbreviations (italic, bold, etc.).
3.2.1. Units
The Interdoc standard assumes that distances are in meters and angles are in degrees. Using the language and the following constants defined in the standard external environment, a script can readily express distances and angles in other, possibly more convenient units:
meter=1.0                 -- IN TERMS OF METERS --
mica=1.E-5*meter          -- mica= 1.E-5 --
inch=.0254*meter          -- inch= .0254 --
pt=.013836*inch           -- pt= .00035143 --
pica=12*pt                -- pica= .00421721 --
tenPitch=inch/10          -- tenPitch= .00254 --
twelvePitch=inch/12       -- twelvePitch= .00211667 --

degree=1.0                -- ANGLES ARE IN DEGREES --
pi=3.14159265
radian=180*degree/pi      -- radian= 57.29577951 --
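As a quick check of these constants, the same definitions can be transcribed into Python and evaluated. This is a sketch, not part of the standard: the names simply mirror the identifiers above, `math.pi` stands in for the eight-decimal value of pi, and the page width and rotation at the end are made-up illustrations.

```python
import math

# Distances are expressed in meters.
meter = 1.0
mica = 1.0e-5 * meter
inch = 0.0254 * meter
pt = 0.013836 * inch
pica = 12 * pt
tenPitch = inch / 10
twelvePitch = inch / 12

# Angles are expressed in degrees.
degree = 1.0
radian = 180 * degree / math.pi

# Print each constant in standard units for comparison with the comments above.
for name in ("mica", "inch", "pt", "pica", "tenPitch", "twelvePitch", "radian"):
    print(f"{name:12s} {globals()[name]:.8f}")

# Hypothetical uses: a page width given in inches, a rotation given in radians.
page_width = 8.5 * inch          # about 0.2159 meters
quarter_turn = (math.pi / 2) * radian
```

Multiplying a number by a unit name converts it into the standard unit, so a script can mix inches, points, and picas freely while every stored value remains in meters or degrees.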
4. Pragmatics
Private encodings and private representations
Conversion efficiency
Implementation considerations
APPENDIX A

GLOSSARY
An italicized word in a definition is defined in this glossary.
abbreviation: An invocation used to shorten a script, rather than to indicate structure
attribute: A component of an environment, identified by its name, which is bound to a value
base language: The part of the Interdoc language that is independent of the semantics of particular properties and attributes
base semantics: The semantic rules that govern how scripts in the base language are elaborated to determine their contents, environments, and labels
binding: The operation of associating a value with a name to add an attribute to an environment; also the resulting association
binding mode: A value may be bound to an identifier as const, var, local, or persistent
Boolean: An enumerated primitive type (F, T) used to control selection and as primitive values
const binding: A binding of an attribute that prevents its being rebound in any contained scope
contents: The vector of values denoted by a node of a script
definition: Another name for a const binding
document: The rendition of a script in a representation suitable for some editor
dominant structure: The tree structure of a document corresponding to the node structure of its script
editor-specific name: A non-standard name used by a specific editor in scripts it generates; an editor may use editor-specific terms without interfering with the interchangeability of a script if it provides definitions of the standard names in terms of its editor-specific names
elaborate: (verb) To develop the semantics of a script or a node of a script according to the Interdoc semantic rules. This is a left-to-right, depth-first processing of the script
encoding: A particular representation of Interdoc scripts
environment: A value consisting of a set of attributes
expression: A syntactic form denoting a value
external environment: A standard environment relative to which an entire script is elaborated
fidelity: The extent to which a transcription or rendition preserves contents, form, and structure
hexInt: A component of an intSequence formed from a pair of letters in the set {A, B, ..., O, P}, representing an integer 0 .. 255
hierarchical name: A name containing at least one period, whose prefix unambiguously denotes the naming authority that assigned its meaning
identifier: A sequence of letters used to identify an attribute
integer: A mathematical integer in a limited range; one of the primitive types
interchange encoding: A standard encoding of Interdoc scripts
Interdoc: The current name of this basis for an editable document standard
intSequence: An abbreviated notation for sequences of small integers
invocation: The appearance of a name in an expression, except as the attribute of a binding
label: A tag, or a source, a target, or a link introduction placed in a node
link: The cross product of a source and a target; in general, a link is a set of (source, target) pairs; in the special case when there is exactly one source and one target, a link behaves like a directed arc between a pair of nodes
link introduction: The appearance of id@! in a node, where id is the main identifier of a link
literal: A representation of a value of a primitive type in a script
local binding: A binding of a value to a name, causing the current environment to be updated with the new attribute; any outer binding’s scope will resume at the end of the innermost containing node
name: A sequence of identifiers internally separated by periods; e.g., a.b.c
nested environment: The initial environment of a node contained in another node
NIL: A name for the empty value; it does not lengthen a vector or node in which it appears
node: Everything between a matched pair of {}s in a script; this generally represents a branch point in a document’s dominant structure
Null: Identifies the empty environment; the value it associates with any identifier is NIL
Outer: A standard attribute of every environment, whose value is the environment just prior to the start of the current node
Outermost: The standard outer environment for an entire script; the value of an identifier in Outermost is the universal consisting of the same letters in upper case
persistent binding: A kind of binding within the scope of a var binding that acquires the scope of the var binding, and hence may endure beyond the end of the innermost containing node
primitive type: Boolean, Integer, Real, String, or Universal
primitive value: A literal or a node, vector, or environment containing only primitive values
private encoding: One of a number of non-standard encodings of an Interdoc script
property: Each tag on a node labels it with a property; the properties of a node determine how it may be viewed and edited
quoted expression: A value which is an expression bracketed by single quotes ("’"); the expression is evaluated in each environment in which the identifier to which it is bound is invoked
real: A floating point number
rendition: The process of converting from a script to a document; also the result of that process
scope: The region of the script in which invocations of the attribute named in a binding yield its value; the scope starts textually at the end of the binding, and generally terminates at the end of the innermost containing node
script: An Interdoc program; the interchangeable result of transcribing a document
selection: A conditional form in a script that denotes one of two expressions, depending on the value of a Boolean expression in the current environment
source: The set of nodes labelled with link@
string: A literal which is a vector of characters bracketed by "<>", e.g., <This is a string!>
style: A quoted expression to be invoked in a node to modify the node’s environment, labels, or contents
Sub: A standard component of each environment, which is invoked to initialize nested environments
SUBSCRIPT: A function that can be used to extract a value from a vector, e.g. SUBSCRIPT[(a b <str>), 3] is the value <str>
tag: A universal name labelling a node using the syntax universal$; the properties of a node correspond to the set of tags labelling it
target: The set of nodes labelled with link!
transcription: The process of converting from a document to a script; also the result of that process
transparency: A characteristic of Interdoc scripts that allows an editor to identify the nodes of a script that it understands and thereby enables it to operate on those nodes without disturbing the ones that it doesn’t understand
Units: A set of definitions relating various typographical and scientific units to the Interdoc standard units, meters; e.g., inch=.0254 pt=.013836*inch
universal: A name whose first identifier is all uppercase; a universal name can be used at the top level in the external environment, e.g., XEROX.fonts.Helvetica
value: A primitive value, node, vector, environment, universal, or quoted expression
var binding: A binding that is intended to be superseded by persistent bindings within its scope; useful for maintaining such things as running figure numbers
vector: An ordered sequence of values that may be subscripted
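A few of the glossary entries (hexInt, NIL, and SUBSCRIPT) describe small mechanisms that can be made concrete. The sketch below is an illustration under assumptions the glossary does not state: that A stands for 0 and P for 15, that the first letter of a hexInt pair is the high-order digit, and that NIL can be modelled by Python's `None`. The function names are hypothetical.

```python
LETTERS = "ABCDEFGHIJKLMNOP"  # the sixteen letters A..P
NIL = None                    # assumed stand-in for the empty value


def hex_int(pair):
    """Decode a two-letter hexInt into an integer in 0..255.

    Assumes A=0 ... P=15 and that the first letter is the high-order digit.
    """
    if len(pair) != 2 or any(c not in LETTERS for c in pair):
        raise ValueError(f"not a hexInt: {pair!r}")
    high, low = (LETTERS.index(c) for c in pair)
    return 16 * high + low


def vector(*values):
    """Build a vector; NIL does not lengthen the vector in which it appears."""
    return tuple(v for v in values if v is not NIL)


def subscript(vec, index):
    """SUBSCRIPT counts from 1, so SUBSCRIPT[(a b <str>), 3] is <str>."""
    return vec[index - 1]


print(hex_int("AA"), hex_int("PP"))
print(subscript(vector("a", "b", "<str>"), 3))
print(len(vector(1, NIL, 2)))
```

Two letters from a sixteen-letter alphabet give exactly 256 combinations, which is why a pair suffices to cover the range 0 .. 255.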
APPENDIX B

ARBITRARY CHOICES
"One of the primary purposes of a standard
is to be definitive about otherwise arbitrary choices."
There are many places in this proposal where we have made an arbitrary choice for definiteness. It will be important that the ultimate standard make some choice on these points; it matters little whether it is the same as ours. To forestall profitless debate on these points, we have tried to list some of the choices that we believe can be easily changed at a later date:
Encoding choices:
The choice of representations for literals (we generally followed Interpress here).
The selection of particular characters for particular kinds of bracketing, and for particular operators.
The choice of infix and functional notation for the interchange encoding (as opposed, e.g., to Polish postfix).
The choice of particular identifiers for basic concepts.
Linguistic choices:
The choice of a particular set of basic operators for the language.
The particular set of primitive data types (we followed Interpress—its set seems about as small as will suffice).
The choice of particular syntactic sugars for common linguistic forms.
APPENDIX C

RELATION TO OTHER STANDARDS
APPENDIX D

HISTORY LOG
Edited by Mitchell, September 1, 1981 3:12 PM, added first version of glossary
Edited by Mitchell, September 7, 1981 2:11 PM, wrote parts of introduction
Edited by Mitchell, September 10, 1981 10:14 AM, added Tab def to Star property sheets
Edited by Mitchell, September 14, 1981 9:54 AM, renumbered chapters and did minor edits
Edited by Horning, May 4, 1982 5:16 PM, Fold in Truth Copy changes, add Appendix B