Towards an Interchange Standard for Editable Documents

by Jim Mitchell and Jim Horning

Version 1.3/August 31, 1982

The Interscript standard will define a digital representation of editable documents for exchange among different editing systems. A script is the representation of a document in the Interscript format; it can be transmitted from one editor to another over a network, or can be stored for later editing. A script is not limited to any particular editor: if a script contains editable information some of which is not understandable by a particular editor, it is still possible to edit the parts of the document understood by that editor without losing or invalidating the parts it does not understand.

This draft is a proposal for the technical content of the Interscript standard. It defines and explains the proposed standard, gives examples of its use, explains how to externalize documents from an editor's private format as scripts, and how to internalize scripts into an editor's private format. It also indicates a number of issues that must still be resolved to establish a practical standard.

The standard provides for documents with

    a dominant hierarchical structure (e.g., book/chapter/section/paragraph...) while also providing for documents needing more general structure than a single tree (e.g., for graphics, for certain kinds of document formatting, or for cross-references in a textual document),

    formatting information (e.g., margins, fonts, line widths, etc.),

    definitional structure (such as styles or property sheets), and

    intermixed kinds of editable information (e.g., text with embedded graphics).

This draft deals primarily with the contents of Layers 0 and 1 (the base language) of the proposed standard.

Contents

1. Introduction
2. The Language Basis: Syntax and Semantics
3. Higher-Level Issues
4. Pragmatics
Appendix A: Glossary

1. Introduction

Interscript provides a means of representing editable documents. This representation is independent of any particular editor and can therefore be used to interchange documents among editors.

The basis of Interscript is a language for expressing editable documents as scripts. Scripts are created by computer programs (usually an editor or associated program); scripts are "compiled" by programs to produce whatever private format a particular editor uses to represent documents.

1.1. Rationale for an interchange standard

As office systems proliferate, being able to interchange documents among different editing systems is becoming more and more important. Customers need document compatibility to avoid being trapped in evolutionary cul-de-sacs and having to pay the awful price of converting documents from one product's format to another's (even within one company's product line sometimes).

Now, an editing program typically uses a private, highly-encoded representation for documents in order to meet goals of performance and functionality. Generally, this means that different editors use different, incompatible private formats, and the user can conveniently edit a document only with the editor used to create it. This problem can be solved by providing programs to convert between one editor's private (or file) format and another's. However, a set of different editors with N different document representations requires N(N-1) conversion routines to be able to convert directly from each format to every other.

This N(N-1) problem can be reduced to 2(N-1) by noticing that we could write N-1 conversion routines to go from F1 (format for editor1) to F2, . . ., FN, and another N-1 routines to convert from F2, . . ., FN to F1. Except when converting from or to F1, this scheme requires two conversions to go from Fi to Fj (j ≠ i); this is a minor drawback.
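As a quick check on this arithmetic, the following Python sketch counts the conversion routines needed under each scheme and the number of conversion steps between two formats when F1 is the pivot. It is only an illustration: the function names are hypothetical and are not part of the proposal.

```python
def direct_converters(n: int) -> int:
    """Routines needed to convert directly between every ordered pair of n formats."""
    return n * (n - 1)


def pivot_converters(n: int) -> int:
    """Routines needed when every conversion goes to or from one pivot format F1."""
    return 2 * (n - 1)


def conversion_steps(i: int, j: int, pivot: int = 1) -> int:
    """Number of conversions needed to get from format Fi to format Fj via the pivot."""
    if i == j:
        return 0
    # One step if the pivot is an endpoint, otherwise Fi -> F1 -> Fj.
    return 1 if pivot in (i, j) else 2


for n in (2, 5, 10):
    print(n, direct_converters(n), pivot_converters(n))
```

For ten formats the direct scheme needs 90 routines and the pivot scheme 18, at the cost of a second conversion step whenever neither format is F1.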
Choosing which editor should be editor1 is a more critical issue, however, since the capabilities of that editor will determine how general a class of documents can be interchanged among the editors.

This presents a truly difficult problem in the case that there is no single functionally dominant editor. If the pivotal editor1 doesn't incorporate all of the structures, formats, and content types used by all of the others, then it will not be possible to faithfully convert documents containing them. Even if we had a single editor that was functionally dominant, it would place an upper bound on the functionality of all future compatible editors. Since there are no actual candidates for a totally dominant editor, we have chosen instead to examine in general what information editors need and how that information can be organized to represent general documents.

Since we are not proposing an editor, we do not need to design a private format for its documents; we only need an external representation that is capable of conveying the content, form, and structure of editable documents. That external representation has only one purpose: to enable the interchange of documents among different editors. It must be easy to convert between real editors' formats and this interchange encoding.

Using a standard interchange encoding has the additional advantage that many of the input and output conversion algorithms will be common to all conforming editors. For example, when a new version of an existing editor is released, the only differences in the new version's conversion routines will be in the areas in which its internal document format has changed from its previous form; this represents a significant saving of programming.

1.2. Properties that any interchange standard must have

An interchange encoding for editable documents must satisfy a number of constraints. Among these are the following:

1.2.1. Universal character set

Scripts must be encoded using the graphic (printable) subset of the ISO 646 printing character set. As well as the obvious rationale that these characters are guaranteed not to have control significance to any devices meeting the ISO standard, it has the additional advantage that a script is humanly readable.

1.2.2. Encoding efficiency

Since editable documents may be stored as scripts, may be transmitted over a network, and must certainly be processed to convert them to various editors' private formats, it is important that the encoding be reasonably space-efficient.

Similarly, the time cost of converting between interchange encoding and private formats must be reasonably low, since it will have a significant effect on how useful the interchange standard is. (If the overheads were small enough, an editor might not even use a private file format for document storage.)

1.2.3. Open-ended representation

Scripts must be capable of describing virtually all editable documents, including those containing formatted text, synthetic graphics, scanned images, etc., and mixtures of these various modes. Nor may the standard foreclose future options for documents that exploit additional media (e.g., audio) or require rich structures (e.g., VLSI circuit diagrams, database views). For the same reasons, the standard must not be tied to particular hardware or to a file format: documents will be stored and transmitted using a variety of media; it would be folly to tie the representation to any particular medium.

1.2.4. Document content and form

The complete description of a document component usually requires more than an enumeration of its explicit contents; e.g., paragraphs have margins, leading between lines, default fonts, etc.
Scripts must record the association between attributes (e.g., margins) and pieces of content.

Both the contents and attributes of typical documents require a rich value space containing scalar numbers, strings, vectors, and record-like constructs in order to describe items as varied as distances, text, coefficients of curves, graphical constraints, digital audio, scanned images, transistors, etc.

1.2.5. Document structure

Many documents have hierarchical structure; e.g., a book is made of chapters containing sections, each of which is a sequence of paragraphs; a figure is embedded in a frame on a page and in turn contains a textual caption and embedded graphics; and the description of an integrated circuit has levels corresponding to modular or repeated subcircuits. The standard should exploit such structure, without imposing any particular hierarchy on all documents.

Hierarchy is not sufficient, however. Parts of documents must often be related in other ways; e.g., graphics components must often be related geometrically, which may defy hierarchical structuring, and it must be possible to indicate a reference from some part of a document to a figure, footnote, or section in a way that cuts across the dominant hierarchy of the document (section 1.6.4).

Documents often contain structure in the form of indirection. For instance, a set of paragraphs may all have a common "style," which must be referred to indirectly so that changing the style alone is sufficient to change the characteristics of all the paragraphs using it. Or a document may be incorporated "by reference" as a part of more than one document and may need to "inherit" many of its properties from the document into which it is being incorporated at a given time.

1.2.6. Transcription fidelity

It must be possible to convert any document from any editor's private format to a script and reconvert it back to the same editor's private format with no observable effect on the document's content, form, or structure. This characteristic is called transcription fidelity, and is a sine qua non for an interchange encoding; if it is not possible to accomplish this, the interchange encoding or the conversion routines (or both) must be defective.

1.2.7. Script comprehension

Even complicated documents have simple pieces. A simple editor should be able to display parts of documents that it is capable of displaying, even in the presence of parts that it cannot. More precisely, an editor must, in the course of internalizing a script (converting it from a script to its private, editable format), be able to discover all the information necessary to recognize and to display the parts that it understands. This must work despite the fact that different editors may well use different data structures to represent the content, form, and structure of a document.

At a minimum, this requires that a script contain information by which an editor can easily determine whether or not it understands a component well enough to display or edit it, and that it be able to interpret the effect that components which it does not understand have on the ones it does. For example, if an editor does not understand figures, it should still be possible for it to display their embedded textual captions correctly, even though a figure might well dictate some of its caption's content or attributes such as margins, font, etc.

This constraint requires that an interchange encoding must have a simple syntax and semantics that can be interpreted readily, even by low-capability editors.
Along with the desire for open-endedness (section 1.2.3), this suggests a language with some form of "extension by definition" built around a small core.

1.2.8. Regeneration

Processing a script to internalize it correctly is only half the problem. It is equally important that an editor, in externalizing a script from its private document format, be able to regenerate the content, form, and structure carried by the script from which the document originally came. In particular, when regenerating a script from an edited document, it should be possible to retain the structure in parts of the original script that were not affected by editing operations. For example, an editor that understands text but not figures should be able to edit the text in a document (although editing a caption may be unsafe without understanding figures) while faithfully retaining and then regenerating the figures when externalizing it.

This problem is much less severe when an editor is transcribing a document that it "understands" completely, e.g., because the entire document was generated using that editor.

1.3. What the Interscript standard does not do

There are a number of issues that the Interscript standard specifically does not discuss. Each of these issues is important in its own right, but is separable from the design of an interchange representation.

1.3.1. Interscript is not a file format

The interchange encoding of a script is a sequence of ASCII/ISO 646 characters. The standard is not concerned with how that representation is held in files on various media (floppy disks, hard disks, tapes, etc.), or with how it is transmitted over communications media (Ethernet, telephone lines, etc.).

1.3.2. Interscript is not a standard for editing

A script is not intended as a directly editable representation. It is not part of its function to make editing of various constructs easier, more efficient, or more compact: those are the purview of editors and their associated private document formats.
A script is intended to be internalized before being edited. This might be done by the editor, by a utility program on the editing workstation, or by a completely separate service.

1.3.3. Combining documents is not an interchange function

This exclusion is really a corollary of the statement, "A script is not intended as a directly editable representation." In general, it is no easier to "glue" two arbitrary documents together than it is to edit them.

1.3.4. Interscript does not overlap with other standards

There are a number of standards issues that are closely related to the representation of editable documents, but which are not part of the Interscript standard because they are also closely related to other standards. For example, the issues of specifying encodings for characters in documents, how fonts should be named or described, or how the printing of documents should be specified (i.e., Interpress) are not part of this work.

1.4. Concepts and Guiding Principles

1.4.1. Layers

The Interscript standard is presented in layers:

    Layer 0 defines the syntax of scripts; parsing reveals the dominant structure of the documents they represent.

    Layer 1 defines the semantics of the base language, particularly the treatment of bindings and environments.

    Layer 2 defines the semantics of properties and attributes that are expected to have a uniform interpretation across all editors.

    Various Layer 3 extensions will define the semantics of properties and attributes that are expected to be shared by particular groups of editors.

The present document focusses almost exclusively on Layers 0 and 1, although some of the examples illustrate properties and attributes likely to be defined in Layer 2.

1.4.2. Externalization and Internalization

Transcription fidelity requires that any document prepared by any editor can be externalized as a script that will then be internalized by the editor without loss of information. Ease of internalization requires that the Interscript base language contain only relatively few (and simple) constructs. We resolve this apparent paradox by including within the base language a simple, yet powerful, mechanism for abbreviation and extension.

A script may be considered to be a "program" that could be "compiled" to convert the document to the private representation of a particular editor, ready for further editing. The Interscript language has been designed so that internalizing scripts into typical editors' representations can be performed in a single pass over the script by maintaining a few simple data structures.

1.4.3. Content, Form, Value, and Structure

Most editors deal with both the content of a document (or piece of a document), and its form. The former is thought of as "what" is in the document, the latter as "how" it is to be viewed; e.g., "ABC" has a sequence of character codes as its contents; its format may include font and position information. Interscript maintains this distinction.

The distinction between the value and the structure of both content and form within a document is also important. When viewing a document, only the value is of concern, but the structure that leads to that value may be essential to convenient editing. An example of structure in content is the grouping of text into paragraphs; in form, associating a named "style" with a paragraph.

Content: Text and graphics are common special cases. Interscript's treatment of these has been largely modelled on that of Interpress.
Other kinds of content may be represented by structures built from character strings, numbers, Booleans, and identifiers.

Form: Interscript provides for open-ended sets of properties and attributes. Properties are associated with content by means of tags. Attributes are bindings between names and values that apply over some scope (sections 1.4.4.2-3). The way the contents of a document are to be "understood" is determined by its properties; Interscript makes it straightforward to determine what these properties are without having to understand them.

Structure: Most editors structure the content of a document somehow (into words, sentences, paragraphs, sections, chapters; or lines, pages, signatures, for example). This assists in obtaining private efficiency, but, more importantly, provides a conceptual structure for the user.

Full transcription fidelity requires that the Interscript language be adequate to record any structure that is maintained by any editor for either form or content. Of course, some editors provide a number of different structures. A general structure, of which all the editors we know use special cases, is the labelled directed graph. Interscript provides this structure, without restricting the purposes for which it may be used. There are also two specializations of general graphs that occur so frequently that Interscript treats them specially:

    Sequences: The most important, and most frequent, relationship between values is logical adjacency (sequentiality), which is represented by simply putting them one after another in the script.

    Ordered trees: Most editors that structure contents have a "dominant" hierarchy that maps well into trees whose arcs are implicitly labelled by order. (Different editors use these trees to represent different hierarchies.) Interscript provides a simple linear notation for such trees, delimiting node values by braces ("{" and "}"). If an editor maintains multiple hierarchies, the dominant one is the one transcribed into the tree structure and used to control the inheritance of attributes.

Structure for content beyond that contained in the dominant hierarchy is represented by explicit links in the script; any node may be labelled as the source and/or the target of any number of links. A link whose target is a single node uniquely identifies that node; links with multiple targets may be used to represent sets of nodes.

Typical structures recorded for form are expressions (indicating intended relations among attribute values) and sharing (representable by indirection). Interscript allows expressions to be composed of literals, identifiers, operators, and function applications, and permits the use of identifiers to represent expressions.

1.4.4. Features of the Base Language

1.4.4.1 Values

Expressions in a script may denote

    Literal values of primitive types
        Booleans: F, T
        Integers: . . . -3, -2, -1, 0, 1, 2, 3, . . .
        Reals: 1.2E5, . . .
        Strings: enclosed in < and >
        Universal names: TEXT, XEROX, PARAGRAPH
    Structured values
        Nodes
        Vectors of values
        Environments
    Generic operations
        Invocations
        Applications
        Selections
    Operations specific to particular types
        Arithmetic
        Comparison
        Logical
        Subscript
        . . .
    Bindings
    Labels
        Tags
        Targets
        Sources
        Link introductions
    Expressions to be evaluated at the point of invocation

1.4.4.2 Environments and Attributes

Environments bind attribute identifiers to values (or expressions denoting values), in various modes:

    "_" denotes a local binding, which may be freely superseded,

    ":=" denotes a global binding, which creates or modifies an attribute in the outermost environment.

NULL denotes the "empty" environment, containing bindings for no attributes. The (implicit) outermost environment binds each identifier id to the corresponding universal name ID (written with all capital letters).

Each piece of content in a document has its own environment. Editors will use relevant attributes from that environment to control its form.

Attributes may also be used in scripts for two structuring purposes:

    abbreviation: an identifier may be bound to a quoted expression; within the scope of the binding, the use of the identifier is equivalent to the use of the full expression;

    indirection: reference through an identifier permits information (such as styles) to be defined in one place and shared throughout its scope; this is an example of structure (which must be preserved) in the form of a document.

1.4.4.3 Inheritance

The dominant hierarchy of a document is represented by grouping its pieces within nodes, which are the most obvious form of content structuring. They also control the scope of bindings.

The environment of a node is initially inherited from its containing node (except for the outermost node, which inherits it from the editor), and may be modified by bindings.
A binding takes effect at the point where it appears, and its scope extends to the end of the innermost node containing it, with two exceptions:

    any binding except a definition may be superseded by a (textually) later binding (if the later binding is in a nested node, the outer binding's scope will resume at the end of the inner node), and

    a global binding extends over all of the document lexically to the right of the binding.

Attributes are inherited only via environments following the dominant structure. Thus the choice of a dominant structure to represent scripts from a particular editor will be strongly influenced by expectations about inheritance.

Attributes are "relevant" to a node if they are assumed by any of its tags. In general, a node's environment will also contain bindings for many "latent" attributes that are either relevant to its ancestors (and inherited by default) or are potentially relevant to its descendants.

The interior of each node is implicitly prefixed by Sub, which will generally be bound in the containing environment to a quoted expression performing some bindings, applying some labels, and/or supplying some initial content.

1.4.4.4 Expressions

Expressions involving the four infix operators (+, -, *, /) are evaluated right-to-left (a la APL); since we expect expressions to be short, we have not imposed precedence rules. Parentheses are used to delimit vector values.
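To illustrate what right-to-left evaluation without precedence means in practice, here is a minimal Python sketch. It is not an Interscript evaluator: the flat token-list input and the name eval_right_to_left are assumptions made purely for illustration, and only numeric operands are handled.

```python
import operator

# The four infix operators named in the text.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def eval_right_to_left(tokens):
    """Evaluate [operand, op, operand, op, ...] right to left, with no precedence.

    Each operator takes the operand on its left and the value of
    everything to its right, as in APL.
    """
    value = tokens[-1]
    # Step leftwards over (operand, operator) pairs.
    for k in range(len(tokens) - 2, 0, -2):
        op, left = tokens[k], tokens[k - 1]
        value = OPS[op](left, value)
    return value


print(eval_right_to_left([2, "*", 3, "+", 4]))    # 2 * (3 + 4)
print(eval_right_to_left([10, "-", 4, "-", 3]))   # 10 - (4 - 3)
```

Under this rule 2*3+4 evaluates to 14 rather than the 10 that conventional precedence would give, and 10-4-3 evaluates to 9 rather than 3.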
Square brackets are used to delimit the argument list of an operator application and to denote environment constructors, which behave much like records.

The notation for selections (conditionals) follows Algol 68:

    ( test | true part | false part )

This is consistent with our principles of using balanced brackets for compound constructions and avoiding syntactically reserved words; the true part and false part may each contain an arbitrary number of items (including none).

1.4.4.5 Tags and Links

A tag is written as a universal name followed by "$". A tag, U, labels a node that contains it with its associated properties and also invokes the component of the outermost environment X with the name U. Tags are either present in a node or absent, whereas attributes have values that apply throughout a scope.

Layer 2 of the standard will be primarily concerned with the definition of a (small) set of standard properties that are expected to be shared among all conforming editors. For each standard property, it will describe

    the associated tag that denotes it,

    the assumptions it implies about the contents (values that must/may be present and their intended interpretation, invariant relations that are to be maintained, etc.),

    the assumptions it makes about the environment (attributes that must be present and their intended interpretation).

Links enable a script to model associations that cut across its dominant structure: a link set denotes a set of directed arcs from each of its source nodes to all its target nodes. There are several ways this facility can be used:

    (ST) A link set with a single source node and a single target node models a simple reference from one node in a document to another.

    (S*T) For a link set with a single target node and multiple source nodes, each source node can be viewed as "pointing to" that target node.

    (ST*) The symmetrical extreme case of a single source node and multiple target nodes corresponds closely to an entry in an index, which refers to all the places where some term is used (section 1.6 contains an example).

    (S*T*) Finally, multiple source and target nodes in a link set can be used for all the cross references within a document of the form "see sections 1.6, 1.7, 2.3".

To use links, a script must declare the "main" identifier of a link set ("LINKS" id) at the root of a subtree containing all its sources and targets, and textually preceding them. Once this main identifier has been introduced, nodes can be labelled as sources for subsets of this link set. For example, the label "id.a.b:" would make a node a target for source nodes containing references of the sort "^id", "^id.a", or "^id.a.b".

1.4.5. Script comprehension

The Interscript standard applies to interchange among editors with widely varying capabilities. It will be important to define some structure to the space of possible scripts, just as Interpress has for printable documents.
Dimensions in which we foresee reasonable variations in script comprehension are:

    Abbreviations: only editor-supplied vs. defined in document.
    Dominant structure: single-layer vs. arbitrary.
    Other structure: no links or indirections vs. links and indirections preserved.
    Bindings: local only vs. local and global (:=).
    Selection: no conditionals vs. conditionals.
    Numbers: integers only vs. floating point.

See section 2.4 for further details.

1.4.6. Internalizing a Script

The private representations of low-capability editors are not generally adequate to provide a full-fidelity internalization of every script produced by a high-capability editor. Thus, when internalizing a script, some information may not be viewable or editable. The Interscript language has been designed to simplify value-faithful internalization, even if structure is lost, and content-faithful internalization, even if form is lost, or the conversion of form to additional content to allow it to be examined (and perhaps even edited) by a low-capability editor. The standard provides some simple conditions under which a low-capability editor can safely modify parts of a document that it understands fully, without thereby destroying the value or structure of parts that it is not prepared to deal with.

A script may be internalized into an editor's (private or file) representation as follows:

    Parse the entire script from left to right.

    As each literal is encountered in the script, convert it to the editor's representation.

    As each abbreviation (free-standing invocation) is encountered in the script, replace it with the value to which it is bound in the environment.

    As each structure is recognized in the script, represent the corresponding structure in the editor's representation, if possible; if not, use the semantics of Interscript to compute the value to be internalized.

    Update the environment whenever a binding is encountered or a scope is exited, according to the semantics of Interscript.

    Transfer the values of all attributes relevant to each piece of content from the current environment to the editor's representation, if possible; if not, apply an invertible function to convert the attribute-value binding into additional content.

    Determine the properties of each node from its tags; this list will be complete at the end of the node. A node is viewable if any of its tags denotes a property in the set of those the editor is prepared to display; it is understood if they are all in the set of those the editor is prepared to edit.

    Record the sources and targets of all links; for any link, these lists will be complete at the end of the node in which its main identifier was introduced. Translate each link to the corresponding editor structure, according to the properties of the node that introduces it.

Of course, any process yielding an equivalent result is equally acceptable.

1.5. Introduction to the Interscript Base Language

This section is intended to lead the reader through a set of examples, to show what the language looks like and how it is used to represent a number of commonly occurring features of editable documents. The examples purposely use rather long identifiers and lots of white space to make them more readable. In actual use, programs, not people, will generate and read scripts; names will tend to be short; and logically unneeded spaces and carriage returns will tend to be omitted.

1.5.1. Simple text as a document

The following script defines a document consisting of the string "The text of the main node of example 1.5.1"; no font, paragraph structure, or formatting information is supplied. This example will gradually be expanded to represent accurately figure 1.5.1, below.
The numbers at the left margin do not form part of the script; they are used to refer to the various lines in the discussion below.

    0   Interscript/Interchange/1.0
    1   {<The text of the main node of example 1.5.1>}
    2   EndScript

Line 0 is the header denoting version 1.0 of the interchange encoding. Line 1 is the entire body of this script: it contains a single node enclosed in {} which in turn contains a single string value enclosed in <>. Line 2, with the keyword "EndScript", marks the end of the script.

    The text of the main node of
    example 1.5.1
        The text of the first subnode of example 1.5.1

    Example 1.5.1: A simple document

The next version of the example adds the tag TEXT$ to the node. The identifier TEXT is called a universal name (or atom), which is indicated by its being composed of all uppercase letters. Universal names have no definition within the base language (they are expected to be defined in Layers 2 and 3).

    0   Interscript/Interchange/1.0
    1   {TEXT$
    2   <The text of the main node of example 1.5.1>
    3   }
    4   EndScript

A tag is denoted by placing "$" after a universal name. A node's tags are strictly local (they are not inherited by other nodes in the script) and serve as "type information" about the node. The tag TEXT$ labels this node as one that can be viewed as textual data. Tags can also create implicit indirections; see section 1.6.5.

    0   Interscript/Interchange/1.0
    1   {PARAGRAPH$
    2   leftMargin_3.25*inch rightMargin_5.0*inch
    3   <The text of the main node of example 1.5.1>
    4   }
    5   EndScript

This example shows how auxiliary information, such as margins, may be associated with a node of a script. The binding leftMargin_3.25*inch adds the attribute leftMargin to the node's environment and binds the value of the expression 3.25*inch to it (inch is a value whose dimensions are inches/meters; meters are the standard Interscript units of distance). The bindings to leftMargin and rightMargin convey the fact that this node has margins for display.
To denote the change in character of the node, we have tagged it as PARAGRAPH instead of TEXT. Figure 1.5.1 uses these margins for its first line of text.

    0  Interscript/Interchange/1.0
    1  {PARAGRAPH$
    2  leftMargin_3.25*inch rightMargin_5.0*inch
    3  <The text of the main node of example 1.5.1>
    4  {PARAGRAPH$ leftMargin_+0.5*inch
    5  <The text of the first subnode of example 1.5.1>
    6  }
    7  }
    8  EndScript

We have further elaborated the example by nesting another text node in the primary one, with its text following the primary node's text and with an indented leftMargin. The binding leftMargin_+0.5*inch is a contraction of leftMargin_leftMargin+0.5*inch. The right side of the binding is evaluated, and since there is as yet no binding for leftMargin in the environment of the inner node (lines 4-6), it is looked up in the environment of the containing node (lines 1-3). The value of the right-hand side expression is thus 3.75*inch. This value is then bound to the identifier leftMargin in the inner node's environment. Since no value is bound to rightMargin in the inner node's environment, it will have the same rightMargin as its parent node.

    0  Interscript/Interchange/1.0
    1  p _ 'PARAGRAPH$ leftMargin_3.25*inch rightMargin_6.0*inch'
    2  {p rightMargin_5.0*inch
    3  <The text of the main node of example 1.5.1>
    4  {p leftMargin_+0.5*inch
    5  <The text of the first subnode of example 1.5.1>
    6  }
    7  }
    8  EndScript

One can also define an abbreviation by binding a sequence of unevaluated expressions to an identifier and subsequently using the identifier to cause those expressions to be evaluated at the point of invocation. This example binds the quoted expression 'PARAGRAPH$ leftMargin_3.25*inch rightMargin_6.0*inch' to the identifier p. When p is invoked in lines 2 and 4, the quoted expression replaces the invocation and is evaluated there. Invoking p places the tag PARAGRAPH$ on the node, sets the leftMargin to 3.25*inch and the rightMargin to 6.0*inch.
In line 2, the rightMargin is then rebound to 5.0*inch, overriding the default binding created by invoking p. Similarly, the binding for leftMargin in line 4 overrides the one resulting from invoking p, so that the inner node has a leftMargin of 3.75*inch and a rightMargin of 6.0*inch.

An identifier can also be bound to an environment value as a convenient record-like manner of naming a set of related bindings. For example, a font might be defined as follows (a more complete definition is given later in section 1.6.3):

    font _ [ | family_TIMES size_10*pt face_[ | weight_NORMAL style_ROMAN slant_NIL] ]

This defines font to be the environment formed by taking the empty or NULL environment and altering it according to the series of bindings following the initial "[ |". In this case font is an environment having bindings for three attributes: family, size, and face. face is itself bound to an environment (with attributes weight, style, and slant). The set of default bindings in font specifies a normal weight (non-bold), non-italic Times Roman 10-point font.

We can incorporate this font definition in the example and then use it to indicate that the word "first" in the subnode should be in italics:

    0   Interscript/Interchange/1.0
    1   p _ 'PARAGRAPH$ leftMargin_3.25*inch rightMargin_6.0*inch'
    2   font _ [ | family_TIMES size_10*pt face_[ | weight_NORMAL style_ROMAN slant_NIL] ]
    3   {p rightMargin_5.0*inch
    4   <The text of the main node of example 1.5.1>
    5   {p leftMargin_+.5*inch
    6   <The text of the >
    7   font.face.slant_ITALIC <first> font.face.slant_NIL
    8   < subnode of example 1.5.1>
    9   }
    10  }
    11  EndScript

Bindings affect node contents to their right: so, "first" will be italic, while " subnode of example 1.5.1" will be non-italic due to the binding immediately preceding it. If we expected to switch between italics and non-italics frequently, it might be profitable to introduce abbreviations to shorten what must appear.
For example, in the scope of the definition

    l _ [ | i _ 'font.face.slant_ITALIC' nI _ 'font.face.slant_NIL']

line 7 could be abbreviated

    l.i <first> l.nI;

HISTORY LOG

Edited by Mitchell, September 1, 1981 3:12 PM, added first version of glossary
Edited by Mitchell, September 7, 1981 2:11 PM, wrote parts of introduction
Edited by Mitchell, September 10, 1981 10:14 AM, added Tab def to Star property sheets
Edited by Mitchell, September 14, 1981 9:54 AM, renumbered chapters and did minor edits
Edited by Mitchell, September 16, 1981 8:42 AM, folding in comments from JJH's review and added sections on rendition and transcription fidelity
Edited by Mitchell, September 18, 1981 1:56 PM, folded in comments from JJH's review
Edited by Horning, May 3, 1982 6:02 PM, Folded in comments from Truth copy
Edited by Mitchell, May 10, 1982 3:28 PM, changed "Interdoc" to "Interscript", "rendering" to "internalizing", and "transcribing" to "externalizing" plus various edits necessitated by these substitutions.
Edited by Mitchell, August 23, 1982 2:45 PM, making final version of this report: eliminated const bindings, changed discussion of links; changed examples.

interscript-0.bravo, mitchell, August 31, 1982 4:27 PM
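Illustrative sketch (not part of the draft). The binding rules walked through in section 1.5.1 can be modeled in a few lines of Python, which may help when checking the margin values quoted there. Everything named below (Env, bind_relative, p, build_example, inches) is a hypothetical helper invented for this illustration; the draft defines no such API, and the sketch ignores strings, links, and everything else in a script. It assumes only what the text states: an identifier not bound in a node's environment is looked up in the containing node, name_+x is a contraction of name_name+x, tags are local to a node, and invoking an abbreviation evaluates its bindings at the point of invocation.

```python
from typing import Dict, List, Optional, Tuple

INCH = 0.0254  # meters per inch; the draft says meters are the standard unit


class Env:
    """Hypothetical model of one node's environment (illustration only)."""

    def __init__(self, parent: Optional["Env"] = None) -> None:
        self.parent = parent                   # environment of the containing node
        self.bindings: Dict[str, float] = {}   # attribute name -> value in meters
        self.tags: List[str] = []              # tags are local, never inherited

    def lookup(self, name: str) -> float:
        """Find a binding here, else in successively containing nodes."""
        env: Optional[Env] = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise KeyError(name)

    def bind(self, name: str, value: float) -> None:
        """Model of name_value: bind in this node's own environment."""
        self.bindings[name] = value

    def bind_relative(self, name: str, delta: float) -> None:
        """Model of name_+delta, a contraction of name_name+delta."""
        self.bind(name, self.lookup(name) + delta)


def p(env: Env) -> None:
    """Stands in for the quoted expression bound to p in the example script."""
    env.tags.append("PARAGRAPH")
    env.bind("leftMargin", 3.25 * INCH)
    env.bind("rightMargin", 6.0 * INCH)


def build_example() -> Tuple[Env, Env]:
    """Mirror the script that uses p: outer node, then the nested subnode."""
    outer = Env()
    p(outer)                                       # {p
    outer.bind("rightMargin", 5.0 * INCH)          #   rightMargin_5.0*inch
    inner = Env(parent=outer)
    p(inner)                                       # {p
    inner.bind_relative("leftMargin", 0.5 * INCH)  #   leftMargin_+0.5*inch
    return outer, inner


def inches(meters: float) -> float:
    """Convert a length in meters back to inches for display."""
    return meters / INCH


if __name__ == "__main__":
    outer, inner = build_example()
    print(round(inches(inner.lookup("leftMargin")), 2))
    print(round(inches(inner.lookup("rightMargin")), 2))
```

In this model the inner node ends up with the margins the text gives for the version that invokes p (3.75 and 6.0 inches). Creating the inner Env without calling p reproduces the earlier version of the example instead, where leftMargin is computed from the parent's value and rightMargin is simply inherited from the parent.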