LIMITED DISTRIBUTION: FOR XEROX INTERNAL USE

Towards an Interchange Standard for Editable Documents

by Jim Mitchell (Mitchell.PA) and Jim Horning (Horning.PA)

Version 1.3/August 31, 1982

The Interscript standard will define a digital representation of editable documents for exchange among different editing systems. A script is the representation of a document in the Interscript format; it can be transmitted from one editor to another over a network, or can be stored for later editing. A script is not limited to any particular editor: if a script contains editable information some of which is not understandable by a particular editor, it is still possible to edit the parts of the document understood by that editor without losing or invalidating the parts it does not understand.

This draft is a proposal for the technical content of the Interscript standard. It defines and explains the proposed standard, gives examples of its use, explains how to externalize documents from an editor's private format as scripts, and how to internalize scripts into an editor's private format. It also indicates a number of issues that must still be resolved to establish a practical standard.

Note: This draft is being circulated to interested parties within Xerox to report preliminary ideas. It should not be interpreted as a definitive proposal, and should not be distributed outside.

XEROX
PALO ALTO RESEARCH CENTER
COMPUTER SCIENCE LABORATORY
3333 Coyote Hill Road / Palo Alto / California 94304

The standard provides for documents with

    a dominant hierarchical structure (e.g., book/chapter/section/paragraph...) while also providing for documents needing more general structure than a single tree (e.g., for graphics, for certain kinds of document formatting, or for cross-references in a textual document),

    formatting information (e.g., margins, fonts, line widths, etc.),

    definitional structure (such as styles or property sheets), and

    intermixed kinds of editable information (e.g., text with embedded graphics).

This draft deals primarily with the contents of Layers 0 and 1 (the base language) of the proposed standard.

Contents

    1. Introduction
    2. The Language Basis: Syntax and Semantics
    3. Higher-Level Issues
    4. Pragmatics
    Appendix A: Glossary

1. Introduction

Interscript provides a means of representing editable documents.
This representation is independent of any particular editor and can therefore be used to interchange documents among editors.

The basis of Interscript is a language for expressing editable documents as scripts. Scripts are created by computer programs (usually an editor or associated program); scripts are "compiled" by programs to produce whatever private format a particular editor uses to represent documents.

1.1. Rationale for an interchange standard

As office systems proliferate, being able to interchange documents among different editing systems is becoming more and more important. Customers need document compatibility to avoid being trapped in evolutionary cul-de-sacs and having to pay the awful price of converting documents from one product's format to another's (even within one company's product line sometimes).

Now, an editing program typically uses a private, highly-encoded representation for documents in order to meet goals of performance and functionality. Generally, this means that different editors use different, incompatible private formats, and the user can conveniently edit a document only with the editor used to create it. This problem can be solved by providing programs to convert between one editor's private (or file) format and another's. However, a set of different editors with N different document representations requires N(N-1) conversion routines to be able to convert directly from each format to every other.

This N(N-1) problem can be reduced to 2(N-1) by noticing that we could write N-1 conversion routines to go from F1 (format for editor1) to F2, ..., FN, and another N-1 routines to convert from F2, ..., FN to F1. Except when converting from or to F1, this scheme requires two conversions to go from Fi to Fj (j ≠ i); this is a minor drawback.
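
The routine counts above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical and not part of the proposal, and format numbers are simply the subscripts of F1, ..., FN.

```python
def direct_converters(n: int) -> int:
    """Routines needed to convert directly between every ordered pair of n formats."""
    return n * (n - 1)


def pivot_converters(n: int) -> int:
    """Routines needed when every format converts only to and from a pivot format F1."""
    return 2 * (n - 1)


def steps_via_pivot(src: int, dst: int, pivot: int = 1) -> int:
    """Conversions applied to get from format src to format dst through the pivot."""
    if src == dst:
        return 0
    # One step if the pivot is an endpoint, otherwise src -> pivot -> dst.
    return 1 if pivot in (src, dst) else 2


# Compare the two schemes for a few values of N.
for n in (2, 5, 10):
    print(n, direct_converters(n), pivot_converters(n))
```

The two counts coincide at N = 2 and diverge quickly afterwards, which is the saving the text describes; the price is the second conversion step whenever neither format is the pivot.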
Choosing which editor should be editor1 is a more critical issue, however, since the capabilities of that editor will determine how general a class of documents can be interchanged among the editors.

This presents a truly difficult problem in the case that there is no single functionally dominant editor. If the pivotal editor1 doesn't incorporate all of the structures, formats, and content types used by all of the others, then it will not be possible to faithfully convert documents containing them. Even if we had a single editor that was functionally dominant, it would place an upper bound on the functionality of all future compatible editors. Since there are no actual candidates for a totally dominant editor, we have chosen instead to examine in general what information editors need and how that information can be organized to represent general documents.

Since we are not proposing an editor, we do not need to design a private format for its documents; we only need an external representation that is capable of conveying the content, form, and structure of editable documents. That external representation has only one purpose: to enable the interchange of documents among different editors. It must be easy to convert between real editors' formats and this interchange encoding.

Using a standard interchange encoding has the additional advantage that many of the input and output conversion algorithms will be common to all conforming editors. For example, when a new version of an existing editor is released, the only differences in the new version's conversion routines will be in the areas in which its internal document format has changed from its previous form; this represents a significant saving of programming.

1.2. Properties that any interchange standard must have

An interchange encoding for editable documents must satisfy a number of constraints. Among these are the following:

1.2.1. Universal character set

Scripts must be encoded using the graphic (printable) subset of the ISO 646 printing character set. As well as the obvious rationale that these characters are guaranteed not to have control significance to any devices meeting the ISO standard, it has the additional advantage that a script is humanly readable.

1.2.2. Encoding efficiency

Since editable documents may be stored as scripts, may be transmitted over a network, and must certainly be processed to convert them to various editors' private formats, it is important that the encoding be reasonably space-efficient.

Similarly, the time cost of converting between interchange encoding and private formats must be reasonably low, since it will have a significant effect on how useful the interchange standard is. (If the overheads were small enough, an editor might not even use a private file format for document storage.)

1.2.3. Open-ended representation

Scripts must be capable of describing virtually all editable documents, including those containing formatted text, synthetic graphics, scanned images, etc., and mixtures of these various modes. Nor may the standard foreclose future options for documents that exploit additional media (e.g., audio) or require rich structures (e.g., VLSI circuit diagrams, database views). For the same reasons, the standard must not be tied to particular hardware or to a file format: documents will be stored and transmitted using a variety of media; it would be folly to tie the representation to any particular medium.

1.2.4. Document content and form

The complete description of a document component usually requires more than an enumeration of its explicit contents; e.g., paragraphs have margins, leading between lines, default fonts, etc.
Scripts must record the association between attributes (e.g., margins) and pieces of content.

Both the contents and attributes of typical documents require a rich value space containing scalar numbers, strings, vectors, and record-like constructs in order to describe items as varied as distances, text, coefficients of curves, graphical constraints, digital audio, scanned images, transistors, etc.

1.2.5. Document structure

Many documents have hierarchical structure; e.g., a book is made of chapters containing sections, each of which is a sequence of paragraphs; a figure is embedded in a frame on a page and in turn contains a textual caption and embedded graphics; and the description of an integrated circuit has levels corresponding to modular or repeated subcircuits. The standard should exploit such structure, without imposing any particular hierarchy on all documents.

Hierarchy is not sufficient, however. Parts of documents must often be related in other ways; e.g., graphics components must often be related geometrically, which may defy hierarchical structuring, and it must be possible to indicate a reference from some part of a document to a figure, footnote, or section in a way that cuts across the dominant hierarchy of the document (section 1.6.4).

Documents often contain structure in the form of indirection. For instance, a set of paragraphs may all have a common "style," which must be referred to indirectly so that changing the style alone is sufficient to change the characteristics of all the paragraphs using it. Or a document may be incorporated "by reference" as a part of more than one document and may need to "inherit" many of its properties from the document into which it is being incorporated at a given time.

1.2.6. Transcription fidelity

It must be possible to convert any document from any editor's private format to a script and reconvert it back to the same editor's private format with no observable effect on the document's content, form, or structure. This characteristic is called transcription fidelity, and is a sine qua non for an interchange encoding; if it is not possible to accomplish this, the interchange encoding or the conversion routines (or both) must be defective.

1.2.7. Script comprehension

Even complicated documents have simple pieces. A simple editor should be able to display parts of documents that it is capable of displaying, even in the presence of parts that it cannot. More precisely, an editor must, in the course of internalizing a script (converting it from a script to its private, editable format), be able to discover all the information necessary to recognize and to display the parts that it understands. This must work despite the fact that different editors may well use different data structures to represent the content, form, and structure of a document.

At a minimum, this requires that a script contain information by which an editor can easily determine whether or not it understands a component well enough to display or edit it, and that it be able to interpret the effect that components which it does not understand have on the ones it does. For example, if an editor does not understand figures, it should still be possible for it to display their embedded textual captions correctly, even though a figure might well dictate some of its caption's content or attributes such as margins, font, etc.

This constraint requires that an interchange encoding must have a simple syntax and semantics that can be interpreted readily, even by low-capability editors.
Along with the desire for open-endedness (section 1.2.3), this suggests a language with some form of "extension by definition" built around a small core.

1.2.8. Regeneration

Processing a script to internalize it correctly is only half the problem. It is equally important that an editor, in externalizing a script from its private document format, be able to regenerate the content, form, and structure carried by the script from which the document originally came. In particular, when regenerating a script from an edited document, it should be possible to retain the structure in parts of the original script that were not affected by editing operations. For example, an editor that understands text but not figures should be able to edit the text in a document (although editing a caption may be unsafe without understanding figures) while faithfully retaining and then regenerating the figures when externalizing it.

This problem is much less severe when an editor is transcribing a document that it "understands" completely, e.g., because the entire document was generated using that editor.

1.3. What the Interscript standard does not do

There are a number of issues that the Interscript standard specifically does not discuss. Each of these issues is important in its own right, but is separable from the design of an interchange representation.

1.3.1. Interscript is not a file format

The interchange encoding of a script is a sequence of ASCII/ISO 646 characters. The standard is not concerned with how that representation is held in files on various media (floppy disks, hard disks, tapes, etc.), or with how it is transmitted over communications media (Ethernet, telephone lines, etc.).

1.3.2. Interscript is not a standard for editing

A script is not intended as a directly editable representation. It is not part of its function to make editing of various constructs easier, more efficient, or more compact: those are the purview of editors and their associated private document formats.
A script is intended to be internalized before being edited. This might be done by the editor, by a utility program on the editing workstation, or by a completely separate service.

1.3.3. Combining documents is not an interchange function

This exclusion is really a corollary of the statement, "A script is not intended as a directly editable representation." In general, it is no easier to "glue" two arbitrary documents together than it is to edit them.

1.3.4. Interscript does not overlap with other standards

There are a number of standards issues that are closely related to the representation of editable documents, but which are not part of the Interscript standard because they are also closely related to other standards. For example, the issues of specifying encodings for characters in documents, how fonts should be named or described, or how the printing of documents should be specified (i.e., Interpress) are not part of this work.

1.4. Concepts and Guiding Principles

1.4.1. Layers

The Interscript standard is presented in layers:

    Layer 0 defines the syntax of scripts; parsing reveals the dominant structure of the documents they represent.

    Layer 1 defines the semantics of the base language, particularly the treatment of bindings and environments.

    Layer 2 defines the semantics of properties and attributes that are expected to have a uniform interpretation across all editors.

    Various Layer 3 extensions will define the semantics of properties and attributes that are expected to be shared by particular groups of editors.

The present document focusses almost exclusively on Layers 0 and 1, although some of the examples illustrate properties and attributes likely to be defined in Layer 2.

1.4.2. Externalization and Internalization

Transcription fidelity requires that any document prepared by any editor can be externalized as a script that will then be internalized by the editor without loss of information. Ease of internalization requires that the Interscript base language contain only relatively few (and simple) constructs. We resolve this apparent paradox by including within the base language a simple, yet powerful, mechanism for abbreviation and extension.

A script may be considered to be a "program" that could be "compiled" to convert the document to the private representation of a particular editor, ready for further editing. The Interscript language has been designed so that internalizing scripts into typical editors' representations can be performed in a single pass over the script by maintaining a few simple data structures.

1.4.3. Content, Form, Value, and Structure

Most editors deal with both the content of a document (or piece of a document), and its form. The former is thought of as "what" is in the document, the latter as "how" it is to be viewed; e.g., "ABC" has a sequence of character codes as its contents; its format may include font and position information. Interscript maintains this distinction.

The distinction between the value and the structure of both content and form within a document is also important. When viewing a document, only the value is of concern, but the structure that leads to that value may be essential to convenient editing. An example of structure in content is the grouping of text into paragraphs; in form, associating a named "style" with a paragraph.

Content: Text and graphics are common special cases. Interscript's treatment of these has been largely modelled on that of Interpress.
Other kinds of content may be represented by structures built from character strings, numbers, Booleans, and identifiers.

Form: Interscript provides for open-ended sets of properties and attributes. Properties are associated with content by means of tags. Attributes are bindings between names and values that apply over some scope (sections 1.4.4.2-3). The way the contents of a document are to be "understood" is determined by its properties; Interscript makes it straightforward to determine what these properties are without having to understand them.

Structure: Most editors structure the content of a document somehow: into words, sentences, paragraphs, sections, chapters; or lines, pages, signatures, for example. This assists in obtaining private efficiency, but, more importantly, provides a conceptual structure for the user.

Full transcription fidelity requires that the Interscript language be adequate to record any structure that is maintained by any editor for either form or content. Of course, some editors provide a number of different structures. A general structure, of which all the editors we know use special cases, is the labelled directed graph. Interscript provides this structure, without restricting the purposes for which it may be used. There are also two specializations of general graphs that occur so frequently that Interscript treats them specially:

    Sequences: The most important, and most frequent, relationship between values is logical adjacency (sequentiality), which is represented by simply putting them one after another in the script.

    Ordered trees: Most editors that structure contents have a "dominant" hierarchy that maps well into trees whose arcs are implicitly labelled by order. (Different editors use these trees to represent different hierarchies.) Interscript provides a simple linear notation for such trees, delimiting node values by braces ("{" and "}").
    If an editor maintains multiple hierarchies, the dominant one is the one transcribed into the tree structure and used to control the inheritance of attributes.

Structure for content beyond that contained in the dominant hierarchy is represented by explicit links in the script; any node may be labelled as the source and/or the target of any number of links. A link whose target is a single node uniquely identifies that node; links with multiple targets may be used to represent sets of nodes.

Typical structures recorded for form are expressions (indicating intended relations among attribute values) and sharing (representable by indirection). Interscript allows expressions to be composed of literals, identifiers, operators, and function applications, and permits the use of identifiers to represent expressions.

1.4.4. Features of the Base Language

1.4.4.1 Values

Expressions in a script may denote

    Literal values of primitive types
        Booleans: F, T
        Integers: . . . -3, -2, -1, 0, 1, 2, 3, . . .
        Reals: 1.2E5, . . .
        Strings:
        Universal names: TEXT, XEROX, PARAGRAPH
    Structured values
        Nodes
        Vectors of values
        Environments
    Generic operations
        Invocations
        Applications
        Selections
    Operations specific to particular types
        Arithmetic
        Comparison
        Logical
        Subscript
        . . .
    Bindings
    Labels
        Tags
        Targets
        Sources
        Link introductions
    Expressions to be evaluated at the point of invocation

1.4.4.2 Environments and Attributes

Environments bind attribute identifiers to values (or expressions denoting values), in various modes:

    "_" denotes a local binding, which may be freely superseded,

    ":=" denotes a global binding, which creates or modifies an attribute in the outermost environment.

NULL denotes the "empty" environment, containing bindings for no attributes. The (implicit) outermost environment binds each identifier id to the corresponding universal name ID (written with all capital letters).

Each piece of content in a document has its own environment. Editors will use relevant attributes from that environment to control its form.

Attributes may also be used in scripts for two structuring purposes:

    abbreviation: an identifier may be bound to a quoted expression; within the scope of the binding, the use of the identifier is equivalent to the use of the full expression;

    indirection: reference through an identifier permits information (such as styles) to be defined in one place and shared throughout its scope; this is an example of structure (which must be preserved) in the form of a document.

1.4.4.3 Inheritance

The dominant hierarchy of a document is represented by grouping its pieces within nodes, which are the most obvious form of content structuring. They also control the scope of bindings.

The environment of a node is initially inherited from its containing node (except for the outermost node, which inherits it from the editor), and may be modified by bindings.
A binding takes effect at the point where it appears, and its scope extends to the end of the innermost node containing it, with two exceptions:

    any binding except a definition may be superseded by a (textually) later binding (if the later binding is in a nested node, the outer binding's scope will resume at the end of the inner node), and

    a global binding extends over all of the document lexically to the right of the binding.

Attributes are inherited only via environments following the dominant structure. Thus the choice of a dominant structure to represent scripts from a particular editor will be strongly influenced by expectations about inheritance.

Attributes are "relevant" to a node if they are assumed by any of its tags. In general, a node's environment will also contain bindings for many "latent" attributes that are either relevant to its ancestors (and inherited by default) or are potentially relevant to its descendants.

The interior of each node is implicitly prefixed by Sub, which will generally be bound in the containing environment to a quoted expression performing some bindings, applying some labels, and/or supplying some initial content.

1.4.4.4 Expressions

Expressions involving the four infix operators (+, -, *, /) are evaluated right-to-left (à la APL); since we expect expressions to be short, we have not imposed precedence rules.

Parentheses are used to delimit vector values.
Square brackets are used to delimit the argument list of an operator application and to denote environment constructors, which behave much like records.

The notation for selections (conditionals) follows Algol 68:

    (<test> | <true part> | <false part>)

This is consistent with our principles of using balanced brackets for compound constructions and avoiding syntactically reserved words; the true part and false part may each contain an arbitrary number of items (including none).

1.4.4.5 Tags and Links

A tag is written as a universal name followed by "$". A tag, U, labels a node that contains it with its associated properties and also invokes the component of the outermost environment X with the name U. Tags are either present in a node or absent, whereas attributes have values that apply throughout a scope.

Layer 2 of the standard will be primarily concerned with the definition of a (small) set of standard properties that are expected to be shared among all conforming editors. For each standard property, it will describe

    the associated tag that denotes it,

    the assumptions it implies about the contents (values that must/may be present and their intended interpretation, invariant relations that are to be maintained, etc.),

    the assumptions it makes about the environment (attributes that must be present and their intended interpretation).
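
The right-to-left rule of section 1.4.4.4 is easy to misread by anyone used to conventional precedence, so a small illustration may help. The Python sketch below is not part of the proposal: the names tokenize and evaluate_right_to_left are hypothetical, and it assumes only unsigned decimal literals and the four infix operators, leaving out identifiers, vectors, exponent notation, and the contracted binding form.

```python
import operator
import re
from typing import List

# The four infix operators named in section 1.4.4.4.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

# A token is an unsigned decimal number or one of the four operators.
TOKEN = re.compile(r"\s*(\d+\.\d+|\d+|[-+*/])")


def tokenize(text: str) -> List[str]:
    """Split an expression into number and operator tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate_right_to_left(text: str) -> float:
    """Evaluate 'a op b op c ...' as a op (b op (c ...)), with no precedence."""
    tokens = tokenize(text)
    if len(tokens) % 2 == 0:
        raise ValueError("expected: operand (operator operand)*")
    result = float(tokens[-1])
    # Walk leftwards, folding each operator with the operand to its left.
    for i in range(len(tokens) - 2, 0, -2):
        if tokens[i] not in OPS:
            raise ValueError(f"expected an operator, found {tokens[i]!r}")
        result = OPS[tokens[i]](float(tokens[i - 1]), result)
    return result
```

Under this rule 2*3+4 groups as 2*(3+4), and 10-2-3 groups as 10-(2-3); both differ from what conventional precedence and left-to-right association would give.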

Links enable a script to model associations that cut across its dominant structure: a link set denotes a set of directed arcs from each of its source nodes to all its target nodes. There are several ways this facility can be used:

    (ST) A link set with a single source node and a single target node models a simple reference from one node in a document to another.

    (S*T) For a link set with a single target node and multiple source nodes, each source node can be viewed as "pointing to" that target node.

    (ST*) The symmetrical extreme case of a single source node and multiple target nodes corresponds closely to an entry in an index, which refers to all the places where some term is used (section 1.6 contains an example).

    (S*T*) Finally, multiple source and target nodes in a link set can be used for all the cross references within a document of the form "see sections 1.6, 1.7, 2.3".

To use links, a script must declare the "main" identifier of a link set ("LINKS" id) at the root of a subtree containing all its sources and targets, and textually preceding them. Once this main identifier has been introduced, nodes can be labelled as sources for subsets of this link set. For example, the label "id.a.b:" would make a node a target for source nodes containing references of the sort "^id", "^id.a", or "^id.a.b".

1.4.5. Script comprehension

The Interscript standard applies to interchange among editors with widely varying capabilities. It will be important to define some structure to the space of possible scripts, just as Interpress has for printable documents.
Dimensions in which we foresee reasonable variations in script comprehension are:

    Abbreviations: only editor-supplied to defined in document.
    Dominant structure: single-layer to arbitrary.
    Other structure: no links or indirections to links and indirections preserved.
    Bindings: Local only and global (:=).
    Selection: no conditionals to conditionals.
    Numbers: integers only to floating point.

See section 2.4 for further details.

1.4.6. Internalizing a Script

The private representations of low-capability editors are not generally adequate to provide a full-fidelity internalization of every script produced by a high-capability editor. Thus, when internalizing a script, some information may not be viewable or editable. The Interscript language has been designed to simplify value-faithful internalization, even if structure is lost, and content-faithful internalization, even if form is lost, or the conversion of form to additional content to allow it to be examined (and perhaps even edited) by a low-capability editor. The standard provides some simple conditions under which a low-capability editor can safely modify parts of a document that it understands fully, without thereby destroying the value or structure of parts that it is not prepared to deal with.

A script may be internalized into an editor's (private or file) representation as follows:

    Parse the entire script from left to right.

    As each literal is encountered in the script, convert it to the editor's representation.

    As each abbreviation (free-standing invocation) is encountered in the script, replace it with the value to which it is bound in the environment.

    As each structure is recognized in the script, represent the corresponding structure in the editor's representation, if possible; if not, use the semantics of Interscript to compute the value to be internalized.

    Update the environment whenever a binding is encountered or a scope is exited, according to the semantics of Interscript.

    Transfer the values of all attributes relevant to each piece of content from the current environment to the editor's representation, if possible; if not, apply an invertible function to convert the attribute-value binding into additional content.

    Determine the properties of each node from its tags; this list will be complete at the end of the node. A node is viewable if any of its tags denotes a property in the set of those the editor is prepared to display; it is understood if they are all in the set of those the editor is prepared to edit.

    Record the sources and targets of all links; for any link, these lists will be complete at the end of the node in which its main identifier was introduced. Translate each link to the corresponding editor structure, according to the properties of the node that introduces it.

Of course, any process yielding an equivalent result is equally acceptable.

1.5. Introduction to the Interscript Base Language

This section is intended to lead the reader through a set of examples, to show what the language looks like and how it is used to represent a number of commonly occurring features of editable documents. The examples purposely use rather long identifiers and lots of white space to make them more readable. In actual use, programs, not people, will generate and read scripts; names will tend to be short; and logically unneeded spaces and carriage returns will tend to be omitted.

1.5.1. Simple text as a document

The following script defines a document consisting of the string "The text of the main node of example 1.5.1"; no font, paragraph structure, or formatting information is supplied. This example will gradually be expanded to represent accurately figure 1.5.1, below.
The numbers at the left margin do not form part of the script; they are used to refer to the various lines in the discussion below.

    0  Interscript/Interchange/1.0
    1  {<The text of the main node of example 1.5.1>}
    2  EndScript

Line 0 is the header denoting version 1.0 of the interchange encoding. Line 1 is the entire body of this script: it contains a single node enclosed in {} which in turn contains a single string value enclosed in <>. Line 2, with the keyword "EndScript", marks the end of the script.

            The text of the main node of
            example 1.5.1
              The text of the first subnode of example 1.5.1

    Example 1.5.1: A simple document

The next version of the example adds the tag TEXT$ to the node. The identifier TEXT is called a universal name (or atom), which is indicated by its being composed of all uppercase letters. Universal names have no definition within the base language (they are expected to be defined in Layers 2 and 3).

    0  Interscript/Interchange/1.0
    1  {TEXT$
    2  <The text of the main node of example 1.5.1>
    3  }
    4  EndScript

A tag is denoted by placing "$" after a universal name. A node's tags are strictly local (they are not inherited by other nodes in the script) and serve as "type information" about the node. The tag TEXT$ labels this node as one that can be viewed as textual data. Tags can also create implicit indirections; see section 1.6.5.

    0  Interscript/Interchange/1.0
    1  {PARAGRAPH$
    2  leftMargin_3.25*inch rightMargin_5.0*inch
    3  <The text of the main node of example 1.5.1>
    4  }
    5  EndScript

This example shows how auxiliary information, such as margins, may be associated with a node of a script. The binding leftMargin_3.25*inch adds the attribute leftMargin to the node's environment and binds the value of the expression 3.25*inch to it (inch is a value whose dimensions are inches/meters; meters are the standard Interscript units of distance). The bindings to leftMargin and rightMargin convey the fact that this node has margins for display.
To denote the change in character of the node, we have tagged it as PARAGRAPH instead of TEXT. Figure 1.5.1 uses these margins for its first line of text.

    0  Interscript/Interchange/1.0
    1  {PARAGRAPH$
    2  leftMargin_3.25*inch rightMargin_5.0*inch
    3  <The text of the main node of example 1.5.1>
    4  {PARAGRAPH$ leftMargin_+0.5*inch
    5  <The text of the first subnode of example 1.5.1>
    6  }
    7  }
    8  EndScript

We have further elaborated the example by nesting another text node in the primary one, with its text following the primary node's text and with an indented leftMargin. The binding leftMargin_+0.5*inch is a contraction of leftMargin_leftMargin+0.5*inch. The right side of the binding is evaluated, and since there is as yet no binding in the inner node's (lines 4-6) environment for leftMargin, it is looked up in the environment of the containing node (lines 1-3). The value of the right-hand side expression is thus 3.75*inch. This value is then bound to the identifier leftMargin in the inner node's environment. Since no value is bound to rightMargin in the inner node's environment, it will have the same rightMargin as its parent node.

    0  Interscript/Interchange/1.0
    1  p _ 'PARAGRAPH$ leftMargin_3.25*inch rightMargin_6.0*inch'
    2  {p rightMargin_5.0*inch
    3  <The text of the main node of example 1.5.1>
    4  {p leftMargin_+0.5*inch
    5  <The text of the first subnode of example 1.5.1>
    6  }
    7  }
    8  EndScript

One can also define an abbreviation by binding a sequence of unevaluated expressions to an identifier and subsequently using the identifier to cause those expressions to be evaluated at the point of invocation. This example binds the quoted expression 'PARAGRAPH$ leftMargin_3.25*inch rightMargin_6.0*inch' to the identifier p. When p is invoked in lines 2 and 4, the quoted expression replaces the invocation and is evaluated there. Invoking p places the tag PARAGRAPH$ on the node, sets the leftMargin to 3.25*inch and the rightMargin to 6.0*inch.
In line 2, the rightMargin is then rebound to 5.0*inch, overriding the default binding created by invoking p. Similarly, the binding for leftMargin in line 4 overrides the one resulting from invoking p, resulting in its leftMargin being 3.75*inch and its rightMargin being 6.0*inch.

An identifier can also be bound to an environment value as a convenient record-like manner of naming a set of related bindings. For example, a font might be defined as follows (a more complete definition is given later in section 1.6.3):

    font _ [ | family_TIMES size_10*pt face_[ | weight_NORMAL style_ROMAN slant_NIL] ]

This defines font to be the environment formed by taking the empty or NULL environment and altering it according to the series of bindings following the initial "[ |". In this case font is an environment having bindings for three attributes: family, size, and face. face is itself bound to an environment (with attributes weight, style, and slant). The set of default bindings in font specifies a normal weight (non-bold), non-italic Times Roman 10-point font.

We can incorporate this font definition in the example and then use it to indicate that the word "first" in the subnode should be in italics:

    0  Interscript/Interchange/1.0
    1  p _ 'PARAGRAPH$ leftMargin_3.25*inch rightMargin_6.0*inch'
    2  font _ [ | family_Times size_10*pt face_[ | weight_NORMAL style_ROMAN slant_NIL] ]
    3  {p rightMargin_5.0*inch
    4  <The text of the main node of example 1.5.1>
    5  {p leftMargin_+.5*inch
    6  <The text of the >
    7  font.face.slant_ITALIC <first> font.face.slant_NIL
    8  < subnode of example 1.5.1>
    9  }
    10 }
    11 EndScript

Bindings affect node contents to their right: so, "first" will be italic, while "subnode of example 1.5.1" will be non-italic due to the binding immediately preceding it. If we expected to switch between italics and non-italics frequently, it might be profitable to introduce abbreviations to shorten what must appear.
For example, in the scope of the definition

    l _ [ | i _ 'font.face.slant_ITALIC' nI _ 'font.face.slant_NIL']

line 7 could be abbreviated

    l.i <first> l.nI

1.6. Further Examples

This section gives some more realistic examples of the use of the Interscript language and explores the issues of making sets of standard definitions for use in scripts.

1.6.1. A Laurel Message

Here is a possible Interscript transcription of a Laurel message:

    0  Interscript/Interchange/1.0  -- standard heading --
    1  {LAURELMSG$  -- tag for a Laurel document --
    2  Sub _ 'PARAGRAPH$ leftMargin_1.0*inch rightMargin_7.5*inch'  -- standard node prelude for nodes below --
    3  justified_F
    4  font.family_TIMES font.size_10
    5  leading.x_1
    6  leading.y_1  -- overridable default leadings --
    7  LINKS heading  -- declare main identifier of link set --
    8  laurelInfo _  -- Laurel information for easy access --
    9  (^Heading.time ^Heading.from ^Heading.subject ^Heading.to ^Heading.cc)
    10 { {Heading.time: <18 June 1981 9:18 am PDT (Thursday)>}
    11  {Heading.from: AUTHENTICATED$}
    12  {Heading.subject: }
    13  {Heading.to: }
    14  {Heading.cc: }}
    15 leading.y_6  -- override outer y leading --
    16 {}  -- node which is a paragraph --
    17 {}
    18 {}
    19 } EndScript

Line 1 tags this document (by tagging its root node) as a Laurel message, and line 2 tags its subnodes (starting on lines 10, 16, 17, and 18) as paragraphs with default margins. Lines 3-6 bind some other attributes, likely to be relevant to paragraphs. Line 7 declares the main link identifier heading, and lines 8-9 bind to laurelInfo a vector of source links whose targets are the parts of the document of interest for mail transport. Lines 10-14 have similar structures: each consists of a string followed by a node containing a target link for the label heading and text for that Laurel "field."
Line 11 is additionally tagged as AUTHENTICATED. Lines 16-18 contain paragraphs constituting the body of the message.

Alternatively, the external environment might well contain a definition of laurel60 that establishes a suitable environment for a Laurel 6.0 document:

    1  laurel60 _ '
    2  LINKS time LINKS from LINKS subject LINKS to LINKS bodyNodes LINKS cc
    3  LAURELMSG$
    4  cr _ <#13#> tab _ <#9#>
    5  p _ 'PARAGRAPH$ leftMargin_1.0*inch rightMargin_7.5*inch'
    6  justified_F
    7  font.family _ TIMES font.size _ 10
    8  margins.left_2540 margins.right_19050
    9  leading.x_1 leading.y_1  -- overridable default leadings --
    10 printForm _
    11  '{p ^time tab
    12  ^from cr
    13  ^subject cr
    14  ^to
    15  leading.y_6
    16  ^bodyNodes
    17  ^cc
    18  }'
    19 heading _ 'LAURELHEADING$ Sub_'TEXT$ LAURELFIELD$' '
    20 body _ 'Sub_'p bodyNodes:' '
    21 '

One advantage of using source labels for the "bodies" of the To:, From:, etc. fields (lines 11-14, 17) is that they can represent sets of nodes as well as single nodes.

Now the Laurel document would be described by the following script:

    22 Interscript/Interchange/1.0  -- standard heading --
    23 {laurel60%  -- invoke Laurel 6.0 definitions
    24  {heading%  -- invoke heading style --
    25  {time: <18 June 1981 9:18 am PDT (Thursday)>}
    26  {from: AUTHENTICATED$ }
    27  {subject: }
    28  {to: }
    29  {cc: }
    30  }
    31  {body%  -- Invoke body style --
    32  {}
    33  {}
    34  {}
    35  }
    36 } EndScript

Invoking laurel60 in line 23 introduces the quoted expressions heading and body into the root node's environment, tags it as LAURELMSG and declares the labels time, from, etc. It also acquires a definition for a print form, which could be used to format the message for sending to a printer. The "%" (indirection) operator indicates that this is intentional structure, to be preserved by each internalization, rather than merely an abbreviation.
Thus the message heading and body should "see" the effects of any future changes made to laurel60 by editing its definition. By contrast, p is used as an abbreviation; when the script is rendered, its value may safely be copied at each use.

Look at the definition of heading (line 19): the right side is a quoted expression sequence. The first expression of the sequence produces the tag LAURELHEADING$ and the second binds the quoted expression 'TEXT$ LAURELFIELD$' to Sub. As a result, each subnode of the one beginning on line 24 will be initialized by invoking Sub implicitly from its containing node, which gives each the tags TEXT$ and LAURELFIELD$.

Similarly, the definition of body (line 20) defines Sub, and the nodes on lines 32-34 will be initialized by invoking p and having the target link bodyNodes placed on each. Labelling the set of body nodes this way means that the source link, ^bodyNodes, in printForm (line 16) denotes the entire sequence of body nodes, in left-to-right depth-first tree order.

1.6.2. A page of a Star document

This example is taken from page 71 of the Star Functional Specification and shows one page of a paginated document with a diagram and a footnote (we recommend that you have that page in front of you when analyzing this transcription):

    -- pages 1 .. 6 supposedly precede this one --
    {pg.a7:
    Sub_'PARAGRAPH$'
    {{fn.n1:  -- just a unique label: fn: introduced somewhere earlier --
    FOOTNOTE$}
    < which has shown our techniques to be valid.
    Other data can be collected by future changes to your accounting and billing packages, which will allow us to perform even better analyses and lead to better problem discovery and correction.>}
    {}
    Sub_'FRAME$'  -- change to subnode tag FRAME --
    {Alignment.horizonally_FlushLeft Alignment.vertically_Floating
     height_2.8*inch width_3.67*inch
     edges.expandingRightEdge_T
     border_dots1
     -- change to default subnode environment Rectangle with solid, double width outline --
     Sub_'RECTANGLE$ lineType.width_2 lineType.style_solid Sub_'Title''
     LINKS rect  -- declare label class to be used below --
     {rect.a1: UpperLeft_(.0254 .07) shading_7 height_.01 width_.027 {} }
     {rect.a2: UpperLeft_(.073 .015) height_.01 width_.018 {} }
     height_.013  -- attribute value shared by following subnodes
     {rect.a3: UpperLeft_(.02 .03) width_.025 {} }
     {rect.a4: UpperLeft_(.02 .03) width_.028 {} }
     {rect.a5: UpperLeft_(.042 .055) width_.016 {} }
     {rect.a6: UpperLeft_(.067 .055) width_.016 {} }
     -- default subnode environment is LINE with solid, double width outline --
     Sub_'LINE lineType.width_2 lineType.style_solid'
     LINKS ln
     {ln.out1: ^rect.a1 ^ln.in34}
     {ln.out2: ^rect.a2 ^ln.out1}
     {ln.in3: ^ln.in34 ^rect.a3}
     {ln.in4: ^ln.in34 ^rect.a4}
     {ln.in34: ^ln.in3 ^ln.in4}
     {ln.out4: ^rect.a4 ^ln.in56}
     {ln.in56: ^ln.in5 ^ln.in6}
     {ln.in5: ^ln.in56 ^rect.a5}
     {ln.in6: ^ln.in56 ^rect.a6}
    }  -- end of Frame1 --
    Sub_'PARAGRAPH$'  -- restore default subnode initialization to PARAGRAPH --
    {}
    {}
    }  -- end of page --

1.6.3. Some Star property sheets

Here are a few of the definitions invoked in the above example (these were derived from page 148 of the Star Functional Specification).
Some of them simply give default values for various attributes; some, like default.font, define a collection of related attributes as an environment; and most are quoted expression sequences for providing abbreviations or "decorating" nodes with tags and their environments with relevant attributes.

1.6.3.1. Font-related defaults and definitions

    baseline_0  -- the base line for characters --
    underlined_F  -- whether or not text in node is to be underlined --
    strikeOut_F  -- whether or not text in node is to have strike-out line through it --
    -- there is no rhyme and little reason behind the names of type fonts. The following definition is intended to provide enough
       choice, using standard "terms" to name any existing font in an arbitrary font catalog (of course, it doesn't, but perhaps it is
       close enough) --
    default.font _ [ |  -- Definition --
      family_Times  -- a font family name --
      face_[ |  -- Definition --
        weight_NORMAL  -- In (EXTRALIGHT, LIGHT, BOOK, NORMAL, MEDIUM,
                          DEMIBOLD, SEMIBOLD, BOLD, EXTRABOLD, ULTRABOLD,
                          HEAVY, EXTRAHEAVY, BLACK, GROTESQUE) --
        lineType_SOLID  -- In (SOLID, INLINE, OPEN, OUTLINE, DISPLAY, SHADED) --
        proportions_NORMAL  -- In (NORMAL, CONDENSED, EXPANDED, EXTENDED,
                               WIDE, BROAD, ELONGATED) --
        style_ROMAN  -- In (ROMAN, GOTHIC, EGYPTIAN, CURSIVE, SCRIPT) --
        slant_NIL  -- In (NIL, ITALIC, OBLIQUE) --
        swash_F  -- T => use swash capitals --
        lowercase_T  -- T => use lowercase letters --
        uppercase_T  -- T => use uppercase letters --
        smallCaps_F  -- T => use small capitals --
        ]
      size_10*pt  -- distance --
      ]
    -- some useful font shorthands: --
    Helvetica _ 'font _ [default.font% | family_HELVETICA]'
    Italic _ 'font.face.slant_ITALIC'
    Bold _ 'font.face.weight_BOLD'
    Helvetica10BI _ 'Helvetica font.size_10*pt Bold Italic'

1.6.3.2. Footnote-related definitions

    fnCount:=0  -- global variable for counting footnotes
    FOOTNOTE _ 'fnCount:=+1 font.size_8*pt FootnoteRef%'
    FootnoteRef _ '{FOOTREF$ baseline_+5*pt fnCount}'  -- raise 5 pts --

1.6.3.3. Paragraph-related definitions

    Tab _ [ |
      position_0
      type_LEFT  -- In (LEFT, CENTERED, RIGHT, DECIMAL) --
      ]
    MakeTabs _ 'n_0 tabs_(RecursiveMakeTab[Value])'
    RecursiveMakeTab _ '(EQ[Value 0] | NIL | n_+.25*inch [Tab | position_n ] RecursiveMakeTab[Value-1])'
    Default.PARAGRAPH _ '
      Indent _ [ | Left_0.0 Right_0.0]  -- distance --
      Alignment_FLUSHLEFT  -- In (FLUSHLEFT, FLUSHRIGHT, BOTH, CENTERED) --
      Justified_F
      leading_[leading | between_1*pt above_12*pt below_0]
      charStyle_[|
        Normal_'font_default.font'
        Emphasis1_'font_default.font Italic'
        Emphasis2_'font_default.font Bold'
        ]
      Hyphenation_F
      KeepOn_NIL  -- In (NIL, SamePageAsNextParagraph) --
      MakeTabs[8]  -- binds tabs to a sequence of 8 tabs (0, .25 inch, .50 inch, . . .) --
      charStyle.Normal  -- initializes to normal style

1.6.3.4. Frame, rectangle, and line definitions

    Def.UpperLeft _ 'UpperLeft_(0.0 0.0)'  -- Def is just a convenient place to put useful auxiliary definitions --
    Def.lineType _ 'lineType_[ |
      Visible_T
      Width_1
      Style_SOLID]  -- IN (SOLID, DOT, DASH, DOTDASH, DOUBLE, . . .) --
      '
    Def.Shading _ 'Shading_0'
    Def.Box _ 'Def.UpperLeft Def.lineType Def.Shading'
    Frame _ 'FRAME$ Def.Box'
    Rectangle _ 'RECTANGLE$ Def.Box
      Constraint_MagnifyOnly  -- IN (NIL MagnifyOnly) --
      '
    Def.LineEnd _ 'LineEnd_(LeftUpper_Flush RightLower_Flush)  -- IN (Flush Round Square arrow1 arrow2 arrow3) --
      '
    Line _ 'LINE$ constraint_FixedAngle Def.lineType Def.LineEnd'
    Title _ 'CAPTION$ Paragraph'

1.6.4. Using links

Links are intended to provide the means for associating nodes in non-hierarchical ways. They can be used for referring to figures, examples, tables, etc., for describing tables of contents, for denoting index items, keeping lists, etc.

1.6.4.1. References to figures

The following outlines how the labelling facilities and global bindings can be used to generate references to (source links for) a figure whose number may not be known at the point of reference. The identifier n5 is assumed to have been generated by the program that produced the script and is assumed to be unique over the target labels with naming prefix "figures." in the script.

    LINKS figures figCount:= 0  -- should appear in a script's root node --
    makeFigureNum _ 'HIDDEN$ figCount:=+1 figCount'
    {. . . ^figures.n5 . . .}  -- ref to node with label figures.n5: --
    { . . . {figures.n5: makeFigureNum} . . .}  -- a hidden node holding the figure number --

The node in which the figure number for figure n5 is defined contains a tag, HIDDEN$, which means that the node is not to be considered a part of the dominant structure for display purposes even though it is part of it. The node's sole content is the value of figCount after it has been incremented by 1. Because figCount is bound with ":=", the scope of the binding is global.

1.6.4.2. Collections of index items

Assume that the word "diarchy" is to be considered an index item in certain places where it occurs in a document. The link class Indexable should be introduced at the root of the document, and each to-be-indexed occurrence of "diarchy" in a string, e.g., , should be replaced by the sequence diarchy% < is established, it . . .>.

Somewhere in the script within the scope of the declaration of Indexable, at the root of a subtree containing all the uses of diarchy, should be the following definition:

    diarchy _ '{HIDDEN$ indexable.diarchy: pageNumber} '

Invoking diarchy results in the appearance of a hidden node containing the current page number (assumed to be held in the attribute pageNumber) and labelled as being in the set of target links indexable and indexable.diarchy. The index for the document might then contain the following entry for "diarchy":

    {INDEXENTRY$ ^indexable.diarchy}

This entry contains the minimal information needed to generate the sequence of page numbers corresponding to indexable occurrences of diarchy. If some occurrences are considered primary and some secondary, then these mechanisms can be generalized to have diarchy defined as

    diarchy _ [ | primary _ '{HIDDEN$ indexable.diarchy.primary: pageNum} '
                  secondary _ '{HIDDEN$ indexable.diarchy.secondary: pageNum} ']

Primary references are denoted in the script as diarchy.primary% and secondary ones as diarchy.secondary%. Similarly, the index entry takes the form:

    {INDEXENTRY$ ^indexable.diarchy.primary ^indexable.diarchy.secondary}

1.6.5. Using indirections

Indirections provide a way to centralize (and delay) the binding of information within a document. They can be used to share information that is intended to be consistent.

1.6.5.1. Styles and style sheets

Documents generally follow stylistic conventions for presenting different kinds of content. E.g., major headings may be in bold face with twelve points of extra leading, minor headings in italic with six points of extra leading.
If this information is explicitly bound for each piece of content, then a stylistic change may require locating and changing all the relevant bindings (note that italic is likely to be also used for other purposes, such as emphasis). If, however, the binding is done indirectly, through a style, a single change will be effective for all places where the style is referenced. Note that each occurrence of a tag implicitly establishes an indirection through the same identifier; this is convenient in associating styles with semantically meaningful tags. For example:

    MajorHeading _ 'PARAGRAPH$ Bold leading_+12'
    MinorHeading _ 'PARAGRAPH$ Italic leading_+6'

2. The Language Basis: Syntax and Semantics

2.1. Grammar

Our notation is basically BNF with terminals quoted and augmented by the following conventions:

    a sequence enclosed in [ ] brackets may occur zero or one times;
    a construct followed by * may occur zero or more times;
    parentheses ( ) are used purely for grouping.

    script      ::= header node trailer
    header      ::= "Interscript/Interchange/1.0 "
    trailer     ::= "EndScript"
    item        ::= content | binding | label
    content     ::= term | node
    term        ::= primary | primary op term
    op          ::= "+" | "-" | "*" | "/"
    primary     ::= literal | invocation | indirection | application | selection | vector
    literal     ::= Boolean | integer | real | string | universal
    invocation  ::= name
    name        ::= id ( "." id )*
    indirection ::= name "%"
    application ::= ( name | universal ) "[" item* "]"
    universal   ::= ucID
    selection   ::= "(" term "|" item* "|" item* ")"
    vector      ::= "(" item* ")"
    node        ::= "{" item* "}"
    binding     ::= localBind | globalBind
    localBind   ::= name "_" rhs
    globalBind  ::= ( name | universal ) ":=" rhs
    rhs         ::= content | op term | "'" item* "'" | "[" item* "|" binding* "]"
    label       ::= tag | link
    tag         ::= universal "$"
    link        ::= "LINKS" id | "^" name | name ":"

2.2. Discussion of Features

[Note that we have a formal semantic definition for this language that is every bit as precise as the grammar above. However, we have not yet figured out how to present it in a form that humans find equally palatable, so we have placed it in Appendix D.]

    primary ::= literal
    literal ::= Boolean | integer | real | string

The primitive elements by which the value of a document is represented.

    term ::= primary op term
    op   ::= "+" | "-" | "*" | "/"

Both the primary and the term must reduce to numbers; the arithmetic operators are evaluated right-to-left (a la APL, without precedence) and bind less tightly than function application. The result is a real if either operand is.

    invocation ::= id

Id is looked up in the current environment; depending on its current binding, this may produce contents, bindings, and/or labels; if the rhs bound to id was quoted, that expression is evaluated in the current environment. In the (implicit) outermost environment, every id is bound to the corresponding universal (ID).

    invocation ::= name "." id

Qualified names represent lookup in "nested" environments; name must have been bound to an environment, in which id is looked up.

    indirection ::= name "%"

This indicates an intentional indirection through name, which should be preserved as part of the structure; replacing the indirection by its value in the current environment is a value-preserving loss of structural fidelity. (An invocation that is simply a name is an abbreviation that need not be preserved.)

    universal ::= ucID

Universals are identifiers that are written entirely in upper case letters. They are presumed to be defined externally, so they are not looked up in the environment (with one exception; see the discussion of tags below).

    application ::= ( name | universal ) "[" item* "]"

If the application involves a universal (either explicitly, or because the name is bound to a universal), the corresponding function is applied to the argument list that results from evaluating item*. Part of the definition of Layer 2 will involve the specification of a small set of standard functions, which may be expanded in various Layer 3 extensions.

If name is not bound to a universal, the current environment is temporarily augmented with a binding of the value of item* to the identifier value, and the value of the application is the result of evaluating name in that environment; this allows function definition within the language.

Neither form of application changes the environment of succeeding expressions because item* is evaluated in a free-standing environment that is thrown away.

    selection ::= "(" term "|" item1* "|" item2* ")"

This is a standard conditional item sequence, using syntax borrowed from Algol 68.
The value and effect are those of item1* if the term evaluates to "T" in the current environment, those of item2* if it evaluates to "F".

    vector ::= "(" item* ")"

Parentheses group a sequence of items as a single vector; bindings affect the environment of items to the right in the containing node, but labels have no meaning.

    node ::= "{" item* "}"

Nodes have nested environments, and affect the containing environment only through global (:=) bindings to ids. Item* is implicitly prefixed by an invocation of Sub, which may be bound to any sequence of items intended to be common to all subnodes of a node.

    item* ::= ""

The empty sequence of items has no value and no effect; this is the basis for the following recursive definition.

    item* ::= item1 item*

In general, the value of a sequence of items is just the sequence of item values; binding items change the environment of items to their right in the sequence.

    localBind ::= name "_" rhs

This adds a single binding to the current scope (i.e., to its associated environment); bindings have no other "side effects" and no value (i.e., they do not change the length of a containing vector or node value).

    globalBind ::= ( name | universal ) ":=" rhs

This adds a single binding to the outermost environment X.
It makes sense to bind something to a universal only if the universal is a tag name (see tag below).

    binding ::= name mode op term

"name mode op term" is just a convenient piece of syntactic shorthand for "name mode name op term".

    mode ::= "_" | ":="

A value can be bound to a name either locally ("_") in the environment of the node in which the binding appears, or globally (":=") in the environment of the root node of a script.

    rhs ::= "'" item* "'"

A quoted rhs is evaluated in the environment of invocation, rather than the environment current at the point of binding.

    rhs ::= "[|" binding* "]"

This creates a new environment value that may be used much like a record.

    rhs ::= "[" item* "|" binding* "]"

This creates a new environment value that is an extension of the environment that is the value of item*.

    tag ::= universal "$"

This gives the containing node the property denoted by the universal. It also looks for a binding to the universal in X, the outermost environment; if one exists, it is invoked in the context of the current environment. This gives an easy way to attach a tag to a node and provide a set of defaults associated with the tag.

    link ::= "LINKS" id

This introduces the link set whose main component is id, and defines its scope.

    link ::= "^" name

This identifies the immediately containing node as a source of the link name (like a reference to the set of nodes which are link targets).

    link ::= name ":"

This identifies the immediately containing node as a target of each of the links that is a prefix of name. For example, the link target "id1.id2...idn:" would make the node containing it a target in the link sets for id1, id1.id2, ..., id1.id2...idn.

2.3. Safety Rules for Low-capability Editors

Interscript claims to make it possible for editors to manipulate the parts of documents they understand without harming parts they do not. This section develops a set of conservative rules for editor treatment of script nodes created by other editors.

We first need to define some terms. The implementor of an editor is said to understand a tag, T, if

    (1) she knows the set of attributes and contents that are relevant to T, and
    (2) she knows all the invariants among attributes that must be maintained for a node with tag T.

An editing system is said to understand a tag, T, if

    (1) it is able to provide some rendition (display) of a node with tag T; and
    (2) it allows insertion or deletion of direct subnodes of that node.

An editing system is said to implement a tag if

    (1) it understands T; and
    (2) it is able to alter a node with tag T.

Finally, an editing system is said to fully implement a tag if it is capable of changing any attribute relevant to T or any contents of a node with tag T.
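The levels of tag support defined for an editing system can be kept as a small per-editor lookup table. The Python sketch below is illustrative only: the names Support and Editor are hypothetical and do not appear in this proposal, and it assumes the levels nest (an editor that fully implements a tag also implements and understands it), which the definitions suggest but do not state outright for the last level.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict


class Support(IntEnum):
    """Hypothetical ordering of the support levels for a tag (assumed to nest)."""
    NONE = 0              # the editor knows nothing about the tag
    UNDERSTANDS = 1       # can render the node and insert/delete direct subnodes
    IMPLEMENTS = 2        # understands the tag and can alter a node carrying it
    FULLY_IMPLEMENTS = 3  # can change any relevant attribute or any contents


@dataclass
class Editor:
    """An editing system described only by its level of support for each tag."""
    support: Dict[str, Support] = field(default_factory=dict)

    def level(self, tag: str) -> Support:
        # Tags the editor has never heard of default to no support.
        return self.support.get(tag, Support.NONE)

    def understands(self, tag: str) -> bool:
        return self.level(tag) >= Support.UNDERSTANDS

    def implements(self, tag: str) -> bool:
        return self.level(tag) >= Support.IMPLEMENTS

    def fully_implements(self, tag: str) -> bool:
        return self.level(tag) >= Support.FULLY_IMPLEMENTS


# Example: a simple text editor that can edit TEXT nodes but only display paragraphs.
simple_editor = Editor({"TEXT": Support.IMPLEMENTS, "PARAGRAPH": Support.UNDERSTANDS})
```

With such a table, questions like "does this editor implement every tag on a node?" reduce to a loop over the node's tags.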
With these definitions, we can now give some conservative rules for editors in treating parts of documents corresponding to nodes in a script:

    It's OK for an editor to display a node if
        it understands at least one of its tags.

    It's OK for an editor to edit within a node if
        it implements all of its tags, and either
            (a) doesn't remove any of them, or
            (b) also understands all tags of its parent.

    It's OK for an editor to copy a node if
        it understands all the tags of the node's new parent,
        no labels are moved outside their scope, and
        the two environments have the same bindings for all attributes that the editor either
            doesn't understand, or
            knows can't be relevant anywhere in the node or its subnodes.

    It's OK for an editor to delete a node if
        it understands all the tags of its parent.

[Less stringent rules will suffice if the document is merely to be viewed, rather than edited, using the original editor.]

2.4. Encodings

[Any resemblance between the following material and the corresponding section of the Interpress standard is purely an intentional consequence of plagiarism.]

The script for a document can be encoded in many different ways. This section gives the rules for designing encodings. The purpose of these rules is to ensure that information is not lost or added by conversions from one encoding to another. There are two types of encodings: a single interchange encoding and many possible private encodings.

The interchange encoding is used to transmit a script from one site to another when the two sites must be assumed to be arbitrarily different.
A private encoding is used to transmit scripts from one site to another when the two sites share the private encoding conventions. For example, a line of document-preparation products made by the same manufacturer might share a private encoding, which can be used to transmit documents from one editor in the product line to another; presumably this encoding is designed to make these transfers simpler or more efficient. However, when one of these editors transmits a document to an unknown editor, the interchange encoding must be used. The interchange encoding is designed to allow easy generation, transmission, and interpretation by many different editors, possibly at the expense of compactness and speed of encoding and decoding.

2.4.1. The interchange encoding

The interchange encoding is designed to simplify creation, communication and interpretation of scripts for the widest possible range of editors and systems. For this reason, a script in the interchange encoding is represented as a sequence of graphic (printable) characters taken from the ASCII set; the subset of ASCII used is also a subset of ISO 646. Communication of a script in the interchange encoding requires only the ability to communicate a sequence of ASCII characters; Interscript does not specify how the characters are encoded. In effect, we define a text representation of the commands to be executed.

The choice of a text format for the interchange encoding leads to rather lengthy scripts in some cases. The bulk of an interchange script presents no great problem for document storage, since a document need not be stored in this form.
Rather, as it is transmitted, the sending editor can translate its own private encoding into the interchange encoding. Similarly, the receiving editor can translate the interchange encoding into its own, usually different, private encoding for storage. However, a bulky interchange script may be more expensive to transmit. If a document consists mostly of text, the interchange encoding is quite efficient: very few characters are required in addition to those appearing in the document itself.

Character set. The character set used in the interchange encoding is described by the ISO 646 7-bit Coded Character Set For Information Processing Interchange. The interchange encoding interprets the 94 characters of the G1 set defined in the International Reference Version (ISO 646, Table 2) and the space character (2/0). This set of 95 characters is called the interchange set. Note that except for the concise "string" encoding of vectors described below, the interchange encoding has nothing to do with the integers corresponding to the characters, but depends only on the character set itself.

It is extremely important to understand that the choice of the ISO standard for the interchange format has nothing to do with character mappings in Interscript fonts. Although these mappings must adhere to a character set standard that is shared by interchanging editors, that standard is not part of Interscript. It is expected that Xerox will develop a separate corporate standard in this area.

If the underlying encoding of the ISO character set can also encode other characters (e.g., the control characters (0/0 through 1/15) and del (7/15), or another group of 128 characters if eight bits are being used to encode each character), these are ignored in interpreting an interchange script. This does not mean that these characters are converted to spaces, but that they are treated as if they were not present.
There are several reasons for this choice:

    Control characters may be inserted freely by software that generates the interchange encoding. For example, carriage returns (0/13), line feeds (0/10), and form feeds (0/12) may be inserted at will to conform to limitations that may be imposed by an operating system. Restrictions on line length or the use of fixed-length records thus become straightforward.

    Control characters may be removed or inserted freely by software that receives the interchange encoding. In this way, the receiving software can adhere to any restrictions imposed by its operating system.

    The absence of control characters allows certain kinds of "non-transparent" data communication methods (such as binary synchronous communication) to be used freely.

A minor disadvantage of these conventions is that if a script is typed in, care must be taken not to omit a significant space at the end of a line. Since scripts are normally generated by programs, this is not important. A system for manually generating (and perhaps interactively debugging) Interscript should provide for various convenience features on input, and for prettyprinting the script on output.

Any number of space characters may also be added after any token without changing the meaning. Throughout the following, a delimiter is a space or comma, which may be omitted if the next character is not an alphanumeric, "-" or ".".

VersionId. The first characters of an interchange script conforming to this version of the Interscript standard must be "Interscript/Interchange/1.0 ". Note that the VersionId is of variable length, and ends with a space.
These conventions simplify the design of systems that must deal with more than one kind of encoding.

    If a privately encoded script can be interpreted as a sequence of characters, its first characters must be "Interscript/private/i.j", where private is replaced by an appropriately chosen hierarchical name that identifies the encoding, e.g., "Xerox/860", and i.j is replaced by an appropriate version identification, e.g., "2.4"; the resulting header would be "Interscript/Xerox/860/2.4".

    A private encoding that cannot be interpreted as a sequence of characters (e.g., a binary, word-oriented encoding on a 36-bit machine which packs five 7-bit characters into a word) should use any available convention to make its scripts self-identifying.

Following the VersionId is a node constituting the body of the script, which is in turn followed by the trailer of a script, "ENDSCRIPT". The body of the script contains values encoded as follows.

Integer. An integer is represented in radix 10 notation using the characters "0" through "9" as digits, followed by a delimiter. A negative integer is preceded by a minus sign "-". Thus the decimal number 1234 is encoded as "1234", and -1234 is encoded as "-1234". The trailing delimiter may be empty if the following character is a letter.

A sequence of integer literals in the range 0..255 can be represented in radix 16 notation using the characters "A" through "P" as digits ("A" corresponds to 0, "P" to 15). The entire sequence is enclosed in "#" brackets. For example, the integer 93 is represented as "#FN#", and the sequence of integers 93, 94, 95, 96 as "#FNFOFPGA#". These sequences require only two characters for each integer (plus two characters of overhead). Note that there is no delimiter between the integers in this encoding.
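The radix 16 sequence notation can be illustrated with a short Python sketch. It is not part of the proposed standard: the function names encode_hex_sequence and decode_hex_sequence are hypothetical, and the sketch assumes only the rules stated above (digits "A" through "P" stand for 0 through 15, each integer in 0..255 takes exactly two digits, and the whole sequence is enclosed in "#" brackets).

```python
# Digits of the radix 16 notation: "A" is 0, "P" is 15.
HEX_DIGITS = "ABCDEFGHIJKLMNOP"


def encode_hex_sequence(values):
    """Encode integers in 0..255 as a "#"-bracketed sequence, two digits each."""
    pairs = []
    for value in values:
        if not 0 <= value <= 255:
            raise ValueError(f"{value} is outside the range 0..255")
        # High-order digit first, then low-order digit.
        pairs.append(HEX_DIGITS[value >> 4] + HEX_DIGITS[value & 15])
    return "#" + "".join(pairs) + "#"


def decode_hex_sequence(text):
    """Decode a "#"-bracketed sequence back into a list of integers."""
    if len(text) < 2 or text[0] != "#" or text[-1] != "#":
        raise ValueError("sequence must be enclosed in '#' brackets")
    body = text[1:-1]
    if len(body) % 2 != 0:
        raise ValueError("each integer needs exactly two digits")
    values = []
    for i in range(0, len(body), 2):
        # str.index raises ValueError for any character outside "A".."P".
        high = HEX_DIGITS.index(body[i])
        low = HEX_DIGITS.index(body[i + 1])
        values.append(high * 16 + low)
    return values


print(encode_hex_sequence([93]))              # the single integer 93
print(encode_hex_sequence([93, 94, 95, 96]))  # four integers, no delimiters
print(decode_hex_sequence("#FNFOFPGA#"))
```

Under these assumptions the sketch reproduces the two examples in the text: 93 becomes "#FN#", and 93, 94, 95, 96 becomes "#FNFOFPGA#".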
Booleans are represented by the characters "F" and "T", followed by a delimiter.

Real. A real is represented using Fortran E or F notation, with a trailing delimiter. Thus "12.34" is the same as "1.234E1". Minus signs may precede the mantissa or the exponent: "-12.34E-3 ".

Identifier. An identifier is encoded by its characters (which are limited to letters and digits), followed by a delimiter: "x", "arg1". The first character of an identifier must be a letter, and must be written in lower case to distinguish identifiers from universals. Other letters may be written in either case for readability, since case is not significant in distinguishing identifiers.

Vector. A vector is encoded by surrounding a sequence of values with parentheses, "(" and ")".

String. A text vector usually contains integers that are interpreted as character codes. Often these codes lie in the range 32 to 126 inclusive, which are the numbers assigned to the characters of the interchange set by ISO 646. It is convenient to encode an element of such a vector by the character whose ISO code is the desired value. Such a string can be encoded by surrounding the characters with "<" and ">", thus "". If the string contains elements outside the allowed range (i.e., if the value is less than 32 or greater than 126) or the value 62 or 35 (the ISO codes for the characters ">" and "#"), those elements must be represented as integers inside "#" brackets, as described above. The two-character encoding of small integers is designed to make escape sequences compact. Thus "", "", and "" are all equivalent.

Universal names.
A universal is encoded by giving a name that begins with an uppercase letter followed by zero or more uppercase letters or digits, followed by a delimiter. E.g., "TEXT", "XEROX860 ".

Node. A node is encoded by a "{", followed by a sequence of items, followed by a "}".

Comment. The beginning and end of a comment are both marked by a double minus sign: the sequence "--" commentString "--" is a comment and may occur between any two tokens. Comments are ignored in rendering the script.

The tokens of the interchange encoding are defined by the following BNF grammar, together with rules about delimiters:

    The delimiter that terminates an identifier or universal may only be empty if the next character is not an alphanumeric or "-".
    The delimiter that terminates an integer may only be empty if the next character is not a digit, "E", "F", "-", or ".".
    Extra delimiters may be inserted after any token.

token         ::= literal | id | ucID | op | bracket | punctuation | comment
literal       ::= Boolean | integer | real | string
Boolean       ::= ( "F" | "T" ) delimiter
delimiter     ::= " " | "," | empty
empty         ::= ""
integer       ::= [ "-" ] digit digit* delimiter
digit         ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
real          ::= [ "-" ] digit digit* "."
                  digit* [ "E" integer ] delimiter
string        ::= "<" stringElem* ">"
stringChar    ::= any character but "#" or ">"
stringElem    ::= stringChar | hexSequence
hexSequence   ::= "#" hex* "#"
hex           ::= hexChar hexChar
id            ::= lowerCase idChar* delimiter
idChar        ::= letter | digit
letter        ::= lowerCase | upperCase
lowerCase     ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" |
                  "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
upperCase     ::= hexChar | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
hexChar       ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J" | "K" | "L" | "M" |
                  "N" | "O" | "P"
ucID          ::= upperCase ucIDchar* delimiter
ucIDchar      ::= upperCase | digit
op            ::= "+" | "-" | "*" | "/"
bracket       ::= "(" | ")" | "{" | "}" | "<" | ">" | "[" | "]" | "'"
punctuation   ::= "." | ";" | ":" | "=" | "_" | "!" | "%" | "|"
comment       ::= "--" commentString "--"
commentString ::= any sequence of characters not containing "--"

A simple listing of an interchange script can just print the character sequence, with line breaks every n characters, or perhaps at the nearest convenient delimiter. Such a listing is reasonably easy to read, so that problems can be tracked down simply by studying it. Additional help in reading the file can be furnished by utility programs which format the file for more pleasant reading.

2.4.2. Normalization

Every encoding must define a normalization function N, which maps a script in the encoding into another script in the encoding which generates the same output. N must be idempotent (i.e., N^2 = N); it must not change the fidelity level of the script (see 2.4.3).
If a script violates the definition of Interscript, a normalization function may report this fact instead of producing a normalized result. In other words, normalization need not be defined on erroneous scripts.

The purpose of this function is to make possible a precise description of the rules for private encodings in section 2.4.4. The idea is that when an encoding provides several ways of saying the same thing (typically a basic way, and some more concise ways which work in common special cases), the normalized script will uniformly choose one way of saying it. Note that the normalized script is not intended for any purpose other than precisely defining a notion of equivalent script; it is neither especially compact nor especially readable.

The normalization function for the interchange encoding is defined as follows:

    Comments are omitted.
    Delimiters are replaced by empty if possible, otherwise with ",".
    Leading zeros are dropped from the digit encoding of an integer.
    Reals are uniformly encoded in E format with a single non-zero digit to the left of the "." and no trailing zeros; 0 is encoded by "0.0".
    An upper case letter in an identifier is replaced by the corresponding lower case letter.
    Each direct invocation (abbreviation) is replaced by its binding.

2.4.3. Level restriction

For each internalization fidelity level L of Interscript, there is an (idempotent) level restriction function R_IL which converts an arbitrary interchange script into an interchange script of level L. An interchange script is of level L if R_IL applied to it is the identity. A restriction function replaces an excluded structure with its value according to the semantics of Interscript, converts excluded form information into additional content with a special property, and removes excluded tags.

2.4.4. Private encodings

A private encoding may use any scheme for expressing the content of a script.
Certain requirements are imposed on private Interscript encodings to ensure that they can express the entire content of a script at a given level, and no more. Since no general statements can be made about the bits, characters or other low level constituents of a private encoding, these constraints are stated in terms of the existence of certain functions that convert private encodings to interchange encodings and vice versa. An encoding for which these functions do not exist is not an Interscript encoding. The recommended way of demonstrating that the functions exist is to exhibit them as executable programs. This makes it easy to run test cases.

A particular private encoding has a fixed fidelity level. Informally, this means that it can encode any script of that level.

For any private Interscript encoding P of fidelity level L, the following functions must exist:

    N_P, the normalization function for P; see 2.4.2.
    C_PI, a conversion function from a script in P to an interchange script of level L.
    C_IP, a conversion function from an interchange script of level L to a script in P.

If a script violates the definition of Interscript, a conversion function may report this fact instead of producing a converted result. In other words, conversion need not be defined on erroneous scripts.

Given these functions, we can define functions which convert normalized private scripts to normalized interchange scripts of level L and conversely:

    N_PI = N_I ∘ C_PI
    N_IP = N_P ∘ C_IP

In other words, first convert to the other encoding, and then normalize. These functions must be inverses of each other.
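The composition of these functions can be sketched in Python with two deliberately trivial toy encodings. Everything in the sketch is an assumption made for illustration, and none of it is Interscript: a "script" is just a sequence of non-negative integers, the toy interchange form writes them in decimal, and the toy private form is a tuple of hexadecimal strings. The names n_i, n_p, c_pi, c_ip, n_pi and n_ip are hypothetical stand-ins for the normalization, conversion and composed functions defined above.

```python
def n_i(script: str) -> str:
    """Toy interchange normalization: no leading zeros, "," as the only delimiter."""
    tokens = script.replace(",", " ").split()
    return ",".join(str(int(tok)) for tok in tokens)


def n_p(script: tuple) -> tuple:
    """Toy private normalization: lower-case hex digits, no leading zeros."""
    return tuple(format(int(item, 16), "x") for item in script)


def c_pi(script: tuple) -> str:
    """Convert a toy private script to a toy interchange script."""
    return " ".join(str(int(item, 16)) for item in script)


def c_ip(script: str) -> tuple:
    """Convert a toy interchange script to a toy private script."""
    tokens = script.replace(",", " ").split()
    return tuple(format(int(tok), "X") for tok in tokens)


def n_pi(script: tuple) -> str:
    """Composed function: convert to interchange form, then normalize."""
    return n_i(c_pi(script))


def n_ip(script: str) -> tuple:
    """Composed function: convert to private form, then normalize."""
    return n_p(c_ip(script))


private_script = ("0A", "ff")
interchange_script = "007 12"

# On normalized scripts the two composed functions undo each other.
assert n_ip(n_pi(private_script)) == n_p(private_script)
assert n_pi(n_ip(interchange_script)) == n_i(interchange_script)
```

The two assertions exercise, for one sample script in each toy encoding, the requirement that the composed functions be inverses of each other on normalized scripts.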
    This means that after normalization (which does not change the output), a private script can be converted to an interchange script and then back to the same private script, and vice versa. Hence it seems reasonable to say that the private encoding can express exactly the same information.

Many tricks are available for designing private encodings with desirable properties. With some knowledge of the statistics of actual scripts, encodings can minimize the number of bits required to represent the average script, by Huffman or conditional coding of the primitives. For example, if strings consist primarily of ordinary written English text, an encoding with five bits per character might be attractive: lower case letters except "q", "x", and "z" (23), space, comma space, semicolon space, colon space, dot space space one upper case character, escape to upper case, one upper case character, escape to digits, one digit character (32 total). The upper case and digits sets would be analogous. A more complex, but perhaps even more compact encoding would take account of the letter frequencies in English text. Similarly, the most common labels can be encoded compactly.

There are other useful ideas for private encodings. The bracketing constructs may be replaced by constructs with explicit length fields; these can be shorter, it is easy for the decoder to skip the bracketed constructs, and if the script is damaged it is easier to recover than from the loss of a closing bracket. Hints can be associated with nodes that will speed translation to a particular editor's representation.

In designing a private encoding, it is advisable to handle all the constructs of Interscript reasonably compactly, rather than allowing some "unpopular" ones to be encoded very clumsily.
Otherwise scripts originally generated in another encoding may cause terrible performance.

3. Higher-Level Issues

3.1. Standard and Editor-Specific Transcriptions:

We need a two-level structure for documents expressed in the base language to both (a) be interchangeable among different editors, and (b) retain information of special significance to a specific editor. We call (a) the interchange standard information, or standard information, and (b) editor-specific information.

Basically, an editor X is free to couch properties in its own terms, which can make it easy for it to consume a script produced by itself, but it must provide a set of mappings which will transform properties into the interchange standard. The recommended method for doing this is to invoke its name as the very first item in the root node of any X-specific subtree. The rules for inheritance of properties mean that often only the root node of a document will need to have this property, but there is nothing wrong with nodes being in different editor-specific terms provided they invoke the appropriate editor properties.

Now, to be a valid standard script, the document must have the definition of the name X placed in the script itself. (There is nothing wrong with keeping editor-specific-to-standard mappings in a library of some sort to avoid having copies of them in each script.)

When X parses an X-specific script, it will use its X-specific attributes and never invoke the mappings from X-specific information to standard terms; i.e., it can use a null definition for the name X.
However, when such a document is interpreted by some other editor Y, any time it tries to access a standard name, the mapping from that name to the corresponding expression in terms of the X-specific values in the script will have been provided by the definition of X. What guarantee is there that this can always be done?

It is worth noting first that we are speaking here of a script being internalized by an editor, Y, rather than being externalized. Consequently, it is never necessary to access standard names in left-hand contexts; i.e., to do bindings that are not part of the script in order to interpret it. Y may, however, need to access components of environments in order to internalize the script for itself. These are always values in right-hand side contexts, and must be computed in terms of the X-specific information that X put in the script. We can examine this issue on a case-by-case basis. Below is a list of examples of possible editor-specific uses of the base language and the mappings that would allow another editor to treat the document in standard terms:

Symbolic values used instead of numbers:
    supply standard values for the symbolic values:
        Standard:         lineLeading _ 1*pt        -- some numeric value --
        Editor-specific:  lineLeading _ single
        mapping:          single = 2*pt

Different names used for standard names:
    supply a binding to the standard name from the editor-specific name using a quoted expression so that it is only evaluated when needed in a right-hand context:
        Standard:         lineLeading _ 2*pt
        Editor-specific:  lineSpace _ single
        mapping:          lineLeading _ 'lineSpace'

Different concepts used for standard ones:
    supply a binding to the standard attribute names from the editor-specific concepts using quoted expressions so that they are only evaluated when
    needed in right-hand contexts:
        Standard:         lineLeading _ 2*pt
        Editor-specific:  lineSpacing _ [fontSize_10 on_14 leading_1]        -- lineSpacing units assumed to be pts --
        mapping:          lineLeading _ 'pt*Spacing.onSpacing.fontSize'        -- compute result in standard units --

In general, one can use the facilities of the base language to write essentially arbitrary programs that can be bound as quoted expressions to a standard identifier to cause the appropriate value to be computed based on editor-specific information put in the document by the editor that externalized it. Moreover, since the mappings provided by editor X can be overridden in any subtree of the document, an editor that does not "understand" some subtree of a document produced by another editor Y can simply leave that subtree intact when producing an edited version of the original script, except to ensure that that subtree's root node's first expression is an invocation of "Y", which will cause Y's editor-specific mappings to obtain in that subtree.

3.2. Standard External Environment

It is important to provide for a standard external environment for rendering scripts so that standard definitions need not be carried along with every script that uses them. The external environment contains definitions for units (inch, pt, etc.), various "styles" (para, figure, etc.), and useful abbreviations (italic, bold, etc.).

3.2.1. Units

The Interscript standard assumes that distances are in meters and angles are in degrees. Using the language and the following constants defined in the standard external environment, a script can readily express distances and angles in other, possibly more convenient units:

    meter       = 1.0                -- IN TERMS OF METERS --
    mica        = 1.E-5*meter        -- mica = 1.E-5 --
    inch        = 2540*mica          -- inch = 2540 --
    pt          = .013836*inch       -- pt = 35.143 --
    pica        = 12*pt              -- pica = 421.752 --
    tenPitch    = inch/10            -- tenPitch = 254 --
    twelvePitch = inch/12            -- twelvePitch = 211.667 --
    degree      = 1.0                -- ANGLES ARE IN DEGREES --
    pi          = 3.14159265
    radian      = 180*degree/pi      -- = 57.29577951 --

Appendices: Glossary, Arbitrary Choices, Formal Semantics

APPENDIX A
GLOSSARY

Italics indicate words defined in this glossary.

abbreviation
    An invocation used to shorten a script, rather than to indicate structure
attribute
    A component of an environment, identified by its name, which is bound to a value
base language
    The part of the Interscript language that is independent of the semantics of particular properties and attributes
base semantics
    The semantic rules that govern how scripts in the base language are elaborated to determine their contents, environments, and labels
binding
    The operation of associating a value with a name to add an attribute to an environment; also the resulting association
binding mode
    A value may be bound to an identifier as local, const or global
Boolean
    An enumerated primitive type (F, T) used to control selection and as primitive values
const binding
    A binding of an attribute that prevents its being rebound in any contained scope
contents
    The vector of values denoted by a node of a script
definition
    Another name for a const binding
document
    The internalization of a script in a representation suitable for some editor
dominant structure
    The tree structure of a document corresponding to
    the node structure of its script
editor-specific name
    A non-standard name used by a specific editor in scripts it generates; an editor may use editor-specific terms without interfering with the interchangeability of a script if it provides definitions of the standard names in terms of its editor-specific names
elaborate
    (verb) To develop the semantics of a script or a node of a script according to the Interscript semantic rules. This is a left-to-right, depth-first processing of the script
encoding
    A particular representation of scripts
environment
    A value consisting of a set of attributes. An environment may be either free-standing or nodal. A free-standing environment is a structured value much like a record, with the components being the attributes of the environment. A nodal environment is associated with a node of a script and represents the attributes bound in that node.
expression
    A syntactic form denoting a value
external environment
    A standard environment relative to which an entire script is elaborated
externalization
    The process of converting from a document to a script; also the result of that process
fidelity
    The extent to which an externalization or internalization preserves contents, form, and structure
hexInt
    A component of a hexSequence formed from a pair of letters in the set {A,B,...,O,P}, and representing an integer in the range [0..256)
hexSequence
    A sequence of hexInt pairs enclosed between "#" pairs and used to encode characters in string literals, e.g., #ENCODE#
hierarchical name
    A name containing at least one period, whose prefix unambiguously denotes the naming authority that assigned its meaning
identifier
    A sequence of letters used to identify an attribute
integer
    A mathematical integer in a limited range; one of the primitive types
interchange encoding
    The standard encoding for scripts
internalization
    The process of converting from a script to a document; also the result of that process
Interscript
    The current name of this basis for an editable document standard
invocation
    The appearance of a name in an expression, except as the attribute of a binding
label
    A tag, or a source, a target, or a link introduction placed in a node
link
    The cross product of a source and a target; in general, a link is a set of (source, target) pairs; in the special case when there is exactly one source and one target, a link behaves like a directed arc between a pair of nodes
link introduction
    The appearance of LINKS id in a node, where id is the main identifier of a link
literal
    A representation of a value of a primitive type in a script
local binding
    A binding of a value to a name, causing the current environment to be updated with the new attribute; any outer binding's scope will resume at the end of the innermost containing node
name
    A sequence of identifiers internally separated by periods; e.g., a.b.c
nested environment
    The initial environment of a node contained in another node
NIL
    A name for the empty value; it does not lengthen a vector or node in which it appears
node
    Everything between a matched pair of {}s in a script; this generally represents a branch point in a document's dominant structure
NULL
    Identifies the empty environment; the value it associates with any identifier is NIL
OUTER
    A standard attribute of every environment:
        For a free-standing environment (i.e., a record-like, structured value), OUTER=NULL
        For a nodal environment, OUTER's value is the environment of the current node's parent just prior to the start of the current node.
        For the root node of a document, OUTER=X.
        For X, OUTER=NULL
global binding
    A kind of binding (indicated by ":=") that modifies the environment of the root node of a document only, and hence may endure beyond the end of the current node and may be
    seen by nodes to the right of the current node, even those not hierarchically descended from the current node.
primitive type
    Boolean, Integer, Real, String, or Universal
primitive value
    A literal or a node, vector, or environment containing only primitive values
private encoding
    One of a number of non-standard encodings of a script
property
    Each tag on a node labels it with a property; the properties of a node determine how it may be viewed and edited
quoted expression
    A value which is an expression bracketed by single quotes ("'"); the expression is evaluated in each environment in which the identifier to which it is bound is invoked
real
    A floating point number
scope
    The region of the script in which invocations of the attribute named in a binding yield its value; the scope starts textually at the end of the binding, and generally terminates at the end of the innermost containing node
script
    An Interscript program; the interchangeable result of externalizing a document
selection
    A conditional form in a script that denotes one of two expressions, depending on the value of a Boolean expression in the current environment
source
    The set of nodes with REF link, which thereby refer to the set of target links.
string
    A literal which is a vector of characters bracketed by "<>", e.g.,
style
    A quoted expression to be invoked in a node to modify the node's environment, labels, or contents
Sub
    A standard component of each environment, which is implicitly invoked to initialize nested environments
SUBSCRIPT
    A function that can be used to extract a value from a vector, e.g. SUBSCRIPT[(a b ), 3] is the value
tag
    A universal name labelling a node using the syntax universal$; the properties of a node correspond to the set of tags labelling it
target
    The set of nodes labelled with link:
transparency
    A characteristic of scripts that allows an editor to identify the nodes of a script that it understands and thereby enables it to operate on those nodes without disturbing the ones that it doesn't understand
Units
    A set of definitions relating various typographical and scientific units to the Interscript standard units, meters; e.g., inch=2.54E-2*meter, pt=.013836*inch
universal
    An identifier formed entirely of uppercase letters and digits
value
    A primitive value, node, vector, environment, universal, or quoted expression
vector
    An ordered sequence of values that may be subscripted
X
    The standard outer environment for an entire script; the value of an unbound identifier in X is the universal consisting of the same letters in upper case

APPENDIX B
ARBITRARY CHOICES

"One of the primary purposes of a standard is to be definitive about otherwise arbitrary choices."

There are many places in this proposal where we have made an arbitrary choice for definiteness. It will be important that the ultimate standard make some choice on these points; it matters little whether it is the same as ours.
To forestall profitless debate on these points, we have tried to list some of the choices that we believe can be easily changed at a later date:

Encoding choices:
    The choice of representations for literals (we generally followed Interpress here).
    The selection of particular characters for particular kinds of bracketing, and for particular operators.
    The choice of infix and functional notation for the interchange encoding (as opposed, e.g., to Polish postfix).
    The choice of particular identifiers for basic concepts.

Linguistic choices:
    The choice of a particular set of basic operators for the language.
    The particular set of primitive data types (we followed Interpress; its set seems about as small as will suffice).
    The choice of particular syntactic sugars for common linguistic forms.

APPENDIX C
FORMAL SEMANTICS

C.1. Grammar

Our notation is basically BNF with terminals quoted and augmented by the following conventions:
    a sequence enclosed in [ ] brackets may occur zero or one times;
    a construct followed by * may occur zero or more times;
    parentheses ( ) are used purely for grouping.

script      ::= header node trailer
header      ::= "Interscript/Interchange/1.0 "
trailer     ::= "EndScript"
item        ::= content | binding | label
content     ::= term | node
term        ::= primary | primary op term
op          ::= "+" | "-" | "*" | "/"
primary     ::= literal | invocation | indirection | application | selection | vector
literal     ::= Boolean | integer | real | string | universal
invocation  ::= name
name        ::= id ( "."
id )*indirection::=name "%"application::=( name | universal ) "[" item* "]"universal::=ucIDselection::="(" term "|" item* "|" item* ")"vector::="(" item* ")"node::="{" item* "}"binding::=localBind | globalBindlocalBind::=name "_" rhsglobalBind::=( name | universal ) ":=" rhsrhs::=content | op term | "'" item* "'" | "[" item* "|" binding* "]"label::=tag | linktag::=universal "$"link::="LINKS" id | "^" name | name ":"C.2. Notation for environmentsEnvironments bind identifiers to expressions, in various modes ("=", ":=", "_"):NULL denotes the "empty" environment[E | id _ e] means "E with id bound to e"locVal(id, E) denotes the value locally bound to id in ElocVal(id, NULL) = NIL = ""locVal(id, [E | id' m e]) = if id=id' then e else locVal(id, E)fqX: f':](prp$Yrpr gTpqpQqqC gO ]Mrxq&]K>xq]I xq gExqx gCqxyx gA g?qx g=qx g;qx g:qx g8qxE g6(qx- g4: qx g2Lqx g0_ qx g.q qx" g,qx g*qx qxqx g(qxqx g&qxqx g$ g#G g!  gqxqx" gqx gqx g qxvx gpX gqP]ftq]sqsqsqsqsq sq] ^sqsqsq%sqsS qsqtqtqS Vsqsqsqsqsqsqsqsqsqsqsq ;]mTowards an Interchange Standard for Editable Documents39C.3. Semantic functionsR: expression, environment --> expression-- ReductionR is used for evaluating right-hand sides: identifiers, expressions, etc.C: expression --> expression-- ContentsC is basically used to indicate which evaluated expressions become part of thecontent of a node B: expression, environment --> environment-- BindingsB indicates the effect a binding has on an environment. B and R are mutuallyrecursive functions (e.g., the evaluation of an expression may cause some bindingsto occur as well)The following four semantic functions occur less frequently in any substantive way in thesemantics below. 
You might wish to skip them until they occur in a nontrivial manner in the semantics.

T: expression --> expression    -- Tags
    T indicates when an identifier is to be included in the tag set for a node
L: expression --> expression    -- Links
    L indicates link declarations
Ls: expression --> expression    -- Link sources
    Ls indicates a link to the set of nodes having associated target links
Lt: expression --> expression    -- Link targets
    Lt indicates that the node is to be included in the target set of all the names which are prefixes of the name to which the expression should evaluate

C.4. Presentation by feature

[E is used to represent the value of the environment in which the feature occurs.]

script ::= header node trailer
header ::= "Interscript/Interchange/1.0 "
trailer ::= "EndScript"

The semantics of the root node of a script are equivalent to the following general semantics for a node with the initial environment being the outermost, external environment X instead of E:

node ::= "{" item* "}"
    R = C = "{" R<"Sub" item*>([NULL | "OUTER" "=" E]) "}"
    B = locVal("OUTER", (B<"Sub" item*>([NULL | "OUTER" "=" E])))
    T = L = Ls = Lt = NIL

Nodes have nested environments, and can have more global effects only through global (:=) bindings.
The items of a node are implicitly prefixed with the identifier Sub, which may be bound to any information intended to be common to all subnodes in a scope.

item* ::= ""
    R = C = T = L = Ls = Lt = NIL
    B = E

The empty sequence of items has no value and no effect; this is the basis for the following recursive definition.

item* ::= item1 item*
    R = R<item1>(E) R<item*>(B<item1>(E))
    B = B<item*>(B<item1>(E))
    For F in {C, T, L, Ls, Lt}:
    F = F<item1> F<item*>

In general, the value of a sequence of items is just the sequence of item values; binding items affect the environment of items to their right; NIL does not change the length of a result sequence.

term ::= primary op term
op ::= "+" | "-" | "*" | "/"
    R = C = R<primary>(E) op R<term>(E)
    B = E
    T = L = Ls = Lt = NIL

Both the primary and the term must reduce to numbers; the arithmetic operators are evaluated right-to-left (a la APL, without precedence) and bind less tightly than application.

primary ::= literal
literal ::= Boolean | integer | hexint | real | string
    R = C = literal
    B = E
    T = L = Ls = Lt = NIL

The basic contents of a document.

invocation ::= id
    R = R<valOf(id, E)>(E)
    B = B<valOf(id, E)>(E)
    where
    valOf(id, E) = locVal(id, whereBound(id, E))    -- Gets innermost value
    whereBound(id, E) = CASE    -- Gets innermost binding
        locBinding(id, E) ~= NONE      => E
        locBinding("OUTER", E) ~= NONE => whereBound(id, locVal("OUTER", E))
        True                           => NULL

Both attributes and definitions are looked up in the current environment; depending on the current binding of id, this may produce values and/or bindings; if the binding's rhs was quoted, the expression is evaluated at the point of invocation.

When an id is referred to and locBinding(id, E)=NONE, then the value is sought recursively in locVal("OUTER"). The outermost environment, X, binds each id to the "universal" name which is the uppercase equivalent of id.
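The lookup rule for a simple invocation (valOf, whereBound, locVal) can be illustrated with a small Python sketch. This is only one possible reading of the semantics, not part of the proposal: the Env class, the is_outermost flag used to model X, and the identifiers leftMargin and font are all hypothetical names chosen for the illustration.

```python
NIL = None  # stands in for the empty value NIL


class Env:
    """One environment: local bindings plus an optional OUTER link (assumed model)."""

    def __init__(self, outer=None, is_outermost=False):
        self.bindings = {}
        self.outer = outer                # plays the role of the OUTER binding
        self.is_outermost = is_outermost  # True only for the sketch's stand-in for X


def loc_val(name, env):
    """locVal: the value bound to name directly in env, else NIL."""
    if env is None:                       # locVal(id, NULL) = NIL
        return NIL
    if name in env.bindings:
        return env.bindings[name]
    if env.is_outermost:
        return name.upper()               # X maps an unbound id to its universal
    return NIL


def where_bound(name, env):
    """whereBound: the innermost environment that holds a binding for name."""
    while env is not None:
        if name in env.bindings or env.is_outermost:
            return env
        env = env.outer                   # follow the OUTER path
    return None                           # NULL


def val_of(name, env):
    """valOf: the innermost value of name as seen from env."""
    return loc_val(name, where_bound(name, env))


# Hypothetical three-level nesting: X, a node, and a subnode.
X = Env(is_outermost=True)
X.bindings["leftMargin"] = 100
node_env = Env(outer=X)
inner_env = Env(outer=node_env)
inner_env.bindings["leftMargin"] = 120

print(val_of("leftMargin", inner_env))  # innermost binding wins: 120
print(val_of("leftMargin", node_env))   # found by following OUTER: 100
print(val_of("font", inner_env))        # unbound everywhere: FONT
```

The sketch ignores quoted right-hand sides, which the semantics would re-evaluate at the point of invocation; it only shows where a binding is found.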

invocation ::= name "." id
    R = R<valOf(id, R<name>(E))>(E)
    B = B<valOf(id, R<name>(E))>(E)

Qualified names are treated as "nested" environments.

universal ::= ucID
    R = C = ucID
    B = E
    T = L = Ls = Lt = NIL

Uppercase-only identifiers are presumed to be directly meaningful and are not looked up in the environment.

application ::= invocation "[" item* "]"
    R = apply(invocation, R<item*>(E), E)
    B = E
    where
    apply(invocation, value*, E) = CASE R<invocation>(E) OF
        "EQUAL"     => value1 = value2
        "GREATER"   => value1 > value2
        . . .
        "SUBSCRIPT" => value1[value2]    -- value1: sequence, value2: int
        "CONTENTS"  => "(" C ")"
        "TAGS"      => "(" T ")"
        "LINKS"     => "(" L ")"
        "SOURCES"   => "(" Ls ")"
        "TARGETS"   => "(" Lt ")"
        ELSE        => R<invocation>([[NULL | "OUTER" "=" E] | "Value" "=" value*])
    inner("{" value* "}") = value*

If the invocation does not evaluate to one of the standard external function names, the current environment is augmented with a binding of the value of the argument list to the identifier Value, and the value is the result of the invocation in that environment; this allows function definition within the language.

selection ::= "(" term "|" item1* "|" item2* ")"
    R = if R<term>(E) then R<item1*>(E) else R<item2*>(E)
    B = if R<term>(E) then B<item1*>(E) else B<item2*>(E)

The notation for selections (conditionals) is borrowed from Algol 68:
    ( | | )
This is consistent with our principles of using balanced brackets for compound constructions and avoiding syntactically reserved words; the true part and false part may each contain an arbitrary number of items (including none).
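The selection rule can be sketched in Python as follows. This is a simplified illustration rather than a definitive implementation: environments are modelled as plain dictionaries, a binding item is written as a hypothetical ("bind", name, value) tuple, and the names reduce_items, select, and margin are assumptions made for the example.

```python
def reduce_items(items, env):
    """Reduce a sequence of items left to right.

    A tuple ("bind", name, value) is treated as a binding item: it yields no
    value and extends the environment seen by the items to its right.
    Anything else is a content item and is appended to the result sequence.
    """
    values = []
    env = dict(env)  # copy so the caller's environment is not mutated
    for item in items:
        if isinstance(item, tuple) and len(item) == 3 and item[0] == "bind":
            _, name, value = item
            env[name] = value
        else:
            values.append(item)
    return values, env


def select(test, true_items, false_items, env):
    """Selection: only the chosen part contributes values (R) and bindings (B)."""
    chosen = true_items if test else false_items
    return reduce_items(chosen, env)


outer_env = {"margin": 100}

# True part chosen: one content item and one binding.
values, env = select(True, ["draft", ("bind", "margin", 72)], [], outer_env)
print(values, env)    # ['draft'] {'margin': 72}

# False part chosen and empty: no values, environment unchanged.
values2, env2 = select(False, ["draft", ("bind", "margin", 72)], [], outer_env)
print(values2, env2)  # [] {'margin': 100}
```

The second call shows the case the text allows explicitly: a part with no items contributes neither values nor bindings.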

sequence ::= "(" item* ")"
    R = C = "(" R<item*>(E) ")"
    B = B<item*>(E)
    T = L = Ls = Lt = NIL

Parentheses group a sequence of items as a single value; bindings in the sequence affect the environment of items to the right in the containing node, but labels are disallowed.

Parentheses may also be used to override the right-to-left evaluation of arithmetic operators; an operand sequence must reduce to a single numeric value.

binding ::= name "_" rhs
    R = NIL
    B = localBind(name, R<rhs>(E), E)
    where
    localBind(id, value, E) = [E | id _ value]
    localBind(id "." name, value, E) = [E | id _ localBind(name, value, valOf(id, E))]

This adds a single binding to E; bindings have no other "side effects" and no value.

binding ::= universal ":=" rhs
binding ::= name ":=" rhs
    R = NIL
    B = globalBind(name, R<rhs>(E), E)
    where
    globalBind(name, value, E) =
        if locVal("OUTER", E)=NIL then localBind(name, value, E)
        else [E | "OUTER" _ globalBind(name, value, locVal("OUTER", E))]

Each environment, E, initially contains only its "inherited" environment (bound to OUTER). Most bindings take place directly in E.
To allow for "global" bindings, the value of a globalBind(name, R<rhs>(E), E) will change E by rebinding id in the outermost environment X (reached in the semantics by following the OUTER path from E until the outermost one is reached; if we started in a nodal environment, this will be X).

Note that a global binding to some variable b does not guarantee that using b in a rhs context will result in accessing the global b because a local binding to b may intervene.

Note that in a context such as [ | a := 7], the effect of the above semantics is the same as [ | a _ 7].

binding ::= name mode op term =

This is just a convenient piece of syntactic sugar for the common case of updating a binding.

rhs ::= "'" item* "'"
    R = item*

If the rhs of a binding is surrounded by single quotes, it will be evaluated in the environments where the name is invoked, rather than the environment in which the binding is made.

rhs ::= "[|" binding* "]"
    R = [B<binding*>([NULL | "OUTER" "=" E]) | "OUTER" "=" NULL]

This creates a new environment value that may be used much like a record.

rhs ::= "[" invocation "|" binding* "]"
    R = [B<binding*>([R<invocation>(E) | "OUTER" "=" E]) | "OUTER" "=" NULL]

This creates a new environment value that is an extension of an existing one.

tag ::= universal "$"
    R = R<locVal(universal, X)>(E)
    B = B<locVal(universal, X)>(E)
    T = universal
    C = L = Ls = Lt = NIL

This gives the containing node the property denoted by the universal and also invokes the universal in the outermost environment (if it is not bound there, NIL will be produced, which contributes nothing to R).

link ::= "LINKS" id
    R = "LINKS" id
    L = id
    B = E
    C = T = Ls = Lt = NIL

This defines the scope of the set of links whose "main" component is id.

A label N: on a node makes that node a "target" of the link N (and its prefixes); a reference ^N makes it a "source." The "main" identifier of a link must be declared (using LINKS id) at the root of a subtree containing all its sources and targets. The link represents a set of directed arcs, one from each of its sources to each of its targets. Multiple target labels make a node the target of multiple links. A target label that appears only on a single node places it in a singleton set, i.e., identifies it uniquely.

link ::= "^" name
    R = "^" name
    Ls = name
    B = E
    C = T = L = Lt = NIL

This identifies the containing node as a "source" of the link name.

link ::= name ":"
    R = name ":"
    Lt = prefixes(name)
    B = E
    C = T = L = Ls = NIL
    where
    prefixes(id) = id
    prefixes(name "." id) = name "." id prefixes(name)

This identifies the containing node as a "target" of each of the links that is a prefix of name.

C.5. Discussion

Each script is evaluated in the context of an initial environment, X, which can contain attributes global to all scripts, attributes that specify values for system-specific identifiers, and in which all global bindings are made.

Each environment, E, initially contains only its "inherited" environment (bound to OUTER). Most bindings take place directly in E. To allow for more persistent bindings, the value of a bind(id, ":=", val, E) will change E by rebinding id in X. For the root node of a script, OUTER = X.

If the right-hand side of a binding is surrounded by single quotes, it will be evaluated in the environments where the name is invoked, rather than the environment in which the binding is made.

When an id is referred to and locBinding(id, E)=NONE, then the value is sought recursively in locVal("OUTER").
The X environment binds each id to the "universal" name which is its uppercase equivalent (e.g., the universal for iDentiFieR is IDENTIFIER).

Nodes are delimited by brackets. The contents of each node are implicitly prefixed by Sub, which will generally be bound in the containing environment to a quoted expression performing some bindings, and perhaps supplying some labels (tags and links).

Parentheses are used to delimit sequence values. Square brackets are used to delimit the argument list of an operator application and to denote environment constructors, which behave much like records.

Expressions involving the four infix ops (+, -, *, /) are evaluated right-to-left (a la APL); since we expect expressions to be short, we have not imposed precedence rules.

The notation for selections (conditionals) is borrowed from Algol 68:
    ( | | )
This is consistent with our principles of using balanced brackets for compound constructions and avoiding syntactically reserved words; the true part and false part may each contain an arbitrary number of items (including none).

A label N: on a node makes that node a "target" of the link N (and its prefixes); a reference ^N makes it a "source." The "main" identifier of a link must be declared (using LINKS id) at the root of a subtree containing all its sources and targets. The link represents a set of directed arcs, one from each of its sources to each of its targets. Multiple target labels make a node the target of multiple links. A target label that appears only on a single node places it in a singleton set, i.e., identifies it uniquely.

C.6. Grammatical feature X Semantic function matrix

LEGEND:
    -   Semantic function produces NIL or E or does not apply.
    +   Non-trivial semantic equation.
    =   For R: passes value unchanged; for C: value same as R.

FEATURES:                                         FUNCTIONS:
                                                  R   C   B   T   L   Ls  Lt
term ::= primary op term                          +   =   -   -   -   -   -
primary ::= literal                               =   =   -   -   -   -   -
invocation ::= id                                 +   -   +   -   -   -   -
invocation ::= name "." id                        +   -   +   -   -   -   -
universal ::= name "$"                            =   =   -   -   -   -   -
application ::= invocation "[" item* "]"          +   -   -   -   -   -   -
selection ::= "(" term "|" item1* "|" item2* ")"  +   -   +   -   -   -   -
node ::= "{" item* "}"                            +   =   +   -   -   -   -
sequence ::= "(" ( value | binding )* ")"         +   =   +   -   -   -   -
item* ::= item1 item*                             +   +   +   +   +   +   +
binding ::= name mode rhs                         -   -   +   -   -   -   -
rhs ::= "'" item* "'"                             +   -   -   -   -   -   -
rhs ::= "[|" binding* "]"                         +   -   -   -   -   -   -
rhs ::= "[" invocation "|" binding* "]"           +   -   -   -   -   -   -
tag ::= invocation "%"                            +   -   -   +   -   -   -
link ::= "LINKS" id                               =   -   -   -   +   -   -
link ::= "^" name                                 =   -   -   -   -   +   -
link ::= name ":"                                 =   -   -   -   -   -   +
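The right-to-left evaluation of the four infix operators, described in the term rule and restated in the discussion (C.5), can be made concrete with a short Python sketch. It assumes the expression has already been split into a flat list of numbers and operator strings; the name eval_term and the use of true division for "/" are assumptions of the sketch, since the proposal does not say how division of integers behaves.

```python
import operator

# The four infix ops of the grammar; "/" as true division is an assumption.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_term(tokens):
    """Evaluate term ::= primary | primary op term, right to left.

    tokens is a flat list such as [2, "*", 3, "+", 4]. There is no operator
    precedence: the term to the right of each operator is reduced first.
    """
    if len(tokens) == 1:
        return tokens[0]
    primary, op, rest = tokens[0], tokens[1], tokens[2:]
    return OPS[op](primary, eval_term(rest))


print(eval_term([2, "*", 3, "+", 4]))   # 2 * (3 + 4) = 14
print(eval_term([10, "-", 2, "-", 3]))  # 10 - (2 - 3) = 11
```

Under conventional precedence the first expression would give 10 and the second 5, which is why the semantics also lets parentheses override the right-to-left order.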