Note:  authors' notes and open questions are given in the form <<description of the issue>>.
Filed as    [Indigo]<Interscript>DraftStd>Interscript-Standard.tioga, .press
Last edited    By Mitchell on January 3, 1984 4:07 pm
Remove This Cover Sheet
Interscript

A Proposal for a
Standard for the Interchange of Editable Documents

January 1984


XEROX            Robert M. Ayers
                Xerox Corporation
                Office Systems Division
                Systems Development Department
                3450 Hillview Drive
                Palo Alto, California 94304
               
                James J. Horning
                Butler W. Lampson
                James G. Mitchell
                Xerox Palo Alto Research Center
                3333 Coyote Hill Road
                Palo Alto, California 94304

Status of Document: Private Contribution

Purpose and Overview:
    Interscript defines a digital representation of editable documents for interchange among editing systems. A script is the 
    representation of a document in the Interscript format;  it can be transmitted from one editor to another over a network, or 
    can be stored for later editing. A script is not limited to any particular editor: if a script contains editable information 
    some of which is not understandable by a particular editor, it is still possible to edit the parts of the document understood 
    by that editor without losing or invalidating the parts it does not understand.
    Because this report is intended as a nascent standard, it necessarily has a great deal of precision, and we have employed various 
    formal notations to achieve this.  Such formal notation requires careful in-depth reading, but it has the advantage of being 
    able to give precise answers to questions about the standard.  To help with reading it, commentary has been included in a 
    number of places in fine print, like this.  However, the small print should be viewed as hints about the standard: in any 
    conflict between the small print and the normally sized text, the normally sized text is to be considered the correct version.  
      
    The more casual reader may find the Introduction and Glossary helpful in understanding some of the concepts of Interscript.

Contents
1. Introduction
2. The Language Basis: Syntax and Semantics
3. Tag Declaration, Node Invariants, and Safety Rules for Editors
4. Standard Document Constructs
5. Layout
6. Other Constructs
Appendix A:  Glossary
Appendix B:  Example of Script Evaluation
Appendix P: Paragraph% ReducesTo
Appendix Q: Line% ReducesTo

Interscript 83 Xerox System Integration Standard
                                                                                                                                    
      
1
                                                                                                                           
Introduction
                                                                                                                                    
   
1
1.  Introduction

    Interscript provides a means of representing editable documents.  This representation is independent of any particular editor and 
    can therefore be used to interchange documents among editors.
1.1. Rationale for an interchange standard
    As office systems proliferate, being able to interchange documents among different editing systems is becoming more and more 
    important.  Customers need document compatibility to avoid being trapped in evolutionary cul-de-sacs and having to pay the 
    awful price of converting documents from one product's format to another's.
    Typically, an editing program uses a private, highly-encoded representation when operating on a document to enable it to provide 
    good performance. Generally, this means that different editors use different, incompatible private formats, and a user can 
    conveniently edit a document only with the editor used to create it. This problem can be solved by providing programs to 
    convert between one editor's private (or file) format and another's. However, a set of different editors with N different 
    document representations requires N(N-1) conversion routines to be able to convert directly from each format to every other.
    This N(N-1) problem can be reduced to 2(N-1) by noticing that we could write N-1 conversion routines to go from F1 (format for 
    editor1) to F2,. . .,FN, and another N-1 routines to convert from F2,. . .,FN to F1. Except when converting from or to F1, this 
    scheme requires two conversions to go from Fi to Fj (j    is the critical issue, however, since the capabilities of that editor will determine how general a class of documents can be 
    interchanged among the editors.
    This presents a truly difficult problem in the case that there is no single functionally dominant editor1 in the set. If the 
    pivotal editor1 doesn't incorporate all of the structures, formats, and content types used by editor2,. . .,editorN, then it 
    will not be possible to faithfully convert documents containing them. Even if there were a single, functionally dominant 
    editor, it would place an upper bound on the functionality of all future compatible editors.
    Since there are no actual candidates for a totally dominant editor, this standard has been developed by examining, in general, what 
    information editors need and how that information can be organized to represent general documents.  It provides an external 
    representation that is capable of conveying the content, form, and structure of editable documents. That external 
    representation has only one purpose: to enable the interchange of documents among different editors.  It must be easy to 
    convert between real editors' formats and this interchange encoding.
    When represented by this interchange encoding, we call a document a script and reserve the term document for the representation 
    that an editing system uses to enable editing it.
    Using a standard interchange encoding has the additional advantage that much of the input and output conversion algorithms will be 
    common to all conforming editors. For example, when a new version of an existing editor is released, the only differences in 
    the new version's conversion routines will be in the areas in which its internal document format has changed from its previous 
    form; this represents a significant saving of programming.
1.2. Properties that any interchange standard must have
    An interchange encoding for editable documents must satisfy a number of constraints.  Among these are the following:
1.2.1. Encoding efficiency
    Since editable documents may be stored as scripts, may be transmitted over a network, and must certainly be processed to convert 
    them to various editors' private formats, it is important that the encoding be space-efficient.
    Similarly, the cost in time of converting between interchange encoding and private formats must be reasonably low, since it will 
    have a significant effect on how useful the interchange standard is.
1.2.2. Open-ended representation
    Scripts must be capable of describing virtually all editable documents, including those containing formatted text, synthetic 
    graphics, scanned images, animated images, etc., and mixtures of these various modes. Nor may the standard foreclose future 
    options for documents that exploit additional media (e.g., audio) or require rich structures (e.g., VLSI circuit diagrams, 
    database views). Thus, a standard must be capable of incremental extension and any extension must have the same guarantees and 
    be able to employ the same mechanisms as the most basic parts of the standard.
    For the same reasons, the standard must not be tied to particular hardware or to a file format since documents will be stored and 
    transmitted using a variety of media.
1.2.3. Document content and form
    The complete description of a component of a document usually requires more than a list of its explicit contents; e.g., paragraphs 
    have margins, leading between lines, default fonts, etc.  Scripts must record the association between attributes (e.g., 
    margins) and pieces of content.
    Both the contents and attributes of typical documents require a rich value space containing scalar numbers, strings, vectors, and 
    record-like constructs in order to describe items as varied as distances, text, coefficients of curves, graphical constraints, 
    digital audio, scanned images, transistors, etc. 
1.2.4. Document structure
    Many documents have hierarchical structure; e.g., a book is made of chapters containing sections, each of which is a sequence of 
    paragraphs; a figure is embedded in a frame on a page and in turn contains a textual caption and imbedded graphics; and the 
    description of an integrated circuit has levels corresponding to modular or repeated subcircuits. This standard exploits such 
    structure, without imposing any particular hierarchy on all documents. 
    Hierarchy is not sufficient, however. Parts of documents must often be related in other ways; e.g., graphics components must often 
    be related geometrically, which may defy hierarchical structuring, and it must be possible to indicate a reference from some 
    part of a document to a figure, footnote, or section in way a that cuts across the dominant hierarchy of the document.
    Documents often contain structure in the form of indirection. For instance, a set of paragraphs may all have a common "style," 
    which must be referred to indirectly so that changing the style alone is sufficient to change the characteristics of all the 
    paragraphs using it. Or a document may be incorporated "by reference" as a part of more than one document and may need to 
    "inherit" many of its properties from the document into which it is being incorporated at a given time. 
1.2.5. Transcription fidelity
    It must be possible to convert any document from any editor's private format to a script and reconvert it back to the same editor's 
    private format with no observable effect on the document's content, form, or structure. This characteristic is called 
    transcription fidelity, and is a sine qua non for an interchange encoding; if it is not possible to accomplish this, the 
    interchange encoding or the conversion routines (or both) must be defective.  It must, of course, also be possible to test that 
    an editor does transcribe scripts faithfully, which in turn requires that it be possible to test if two scripts are equivalent 
    (section 2.3.4).
1.2.6. Script comprehension
    Even complicated documents have simple pieces. A simple editor should be able to display parts of documents that it is capable of 
    displaying, even in the presence of parts that it cannot.  More precisely, an editor must, in the course of internalizing a 
    script (converting it from a script to its private, editable format), be able to discover all the information necessary to 
    recognize and to display the parts that it understands. This must work despite the fact that different editors may well use 
    different data structures to represent the content, form, and structure of a document.
    At a minimum, this requires that a script contain information by which an editor can easily determine whether or not it understands 
    a component well enough to edit it, and that it be able to interpret the effect that components which it does not understand 
    have on the ones it does. For example, if an editor does not understand figures, it might still be possible for it to display 
    their embedded textual captions correctly, even though a figure might well dictate some of its caption's content or attributes 
    such as margins, font, etc. 
    This constraint requires that an interchange encoding must have a simple syntax and semantics that can be interpreted readily, even 
    by low-capability editors.
1.2.7. Regeneration
    Processing a script to internalize it correctly is only half the problem. It is equally important that an editor, in externalizing 
    a script from its private internal format be able to regenerate the content, form, and structure carried by the script from 
    which the document originally came.  In particular, when regenerating a script from an edited document, it should be possible 
    to retain the structure in parts of the original script that were not affected by editing operations. For example, an editor 
    that understands text but not figures should be able to edit the text in a document (although editing a caption may be unsafe 
    without understanding figures) while faithfully retaining and then regenerating the figures when externalizing it.
    This problem is much less severe when an editor is transcribing a document that it "understands" completely, e.g., because the 
    entire document was generated using that same editor.
1.3. What the Interscript standard does not do
    There are a number of issues that the Interscript standard specifically does not discuss. Each of these issues is important in its 
    own right, but is separable from the design of an interchange representation
1.3.1. Interscript is not a file format
    This standard is not concerned with how scripts are held in files on various media (floppy disks, hard disks, tapes, etc.), or with 
    how they are transmitted over communications media (local area network, telephone lines, etc.). 
1.3.2. Interscript is not a standard for editing
    A script is not intended as a directly editable representation.  It is not part of its function to make editing of various 
    constructs easier, more efficient, or more compact: that is the purview of editors and their associated private document 
    formats. A script is intended to be internalized before being edited. This might be done by the editor, by a utility program on 
    the editing workstation, or by a completely separate service.
1.3.3. Combining documents is not an interchange function
    This exclusion is really a corollary of the statement, "A script is not intended as a directly editable representation." In 
    general, it is no easier to "glue" two arbitrary documents together than it is to edit them.
1.3.4. Interscript does not overlap with other standards
    There are a number of standards issues that are closely related to the representation of editable documents, but which are not part 
    of the Interscript standard because they are also closely related to other standards. For example, the issues of specifying 
    encodings for characters in documents, or how fonts should be named or described are not part of this work.
1.4. Concepts and Guiding Principles
1.4.1. Layers
    The Interscript standard is presented in layers:
    Layer 0 defines the syntax of the base language for scripts; parsing reveals the dominant structure of the documents they represent 
    (sections 2.1-2.2).
    Layer 1 defines the semantics of the base language, particularly the treatment of bindings and environments (section 2.3, chapter 
    3).
    Layer 2 defines the semantics of properties and attributes that are expected to have a uniform interpretation across all editors 
    (chapters 4-5).
1.4.2. Externalization and Internalization
    A script represents a document in the Interscript format.  Its sole purpose is to enable the interchange of documents among editors 
    in a manner that is independent of any one editor.
    A script is not the editable form of a document.  The editable form is created by an editor by internalizing a script according to 
    the rules (semantics) of Interscript.  The reverse operation of converting a document in an editor's internal, editable format 
    to a valid script is called externalization.
    It is important that any document prepared by any editor can be externalized as a script that will then be (re)internalized by the 
    editor without "loss of information". Ease of internalization requires that the Interscript base language contain only 
    relatively few (and simple) constructs. This apparent paradox has been resolved by including within the base language a simple, 
    yet powerful, mechanism for abbreviation and extension.
    A script may be considered to be a "program" that is "compiled" to convert the script to the private representation of a particular 
    editor, ready for further editing. The Interscript language has been designed so that internalizing scripts into typical 
    editors' representations can be performed in a single pass over the script by maintaining a few simple data structures.
1.4.3.  Content, Form, Value, and Structure
    Most editors deal with both the content of a document (or piece of a document), and its form. The former is thought of as "what" is 
    in the document, the latter as "how" it is to be viewed; e.g., "ABC" has a sequence of character codes as its contents; its 
    format may include font and position information.  Interscript maintains this distinction.
    The distinction between the value and the structure of both content and form within a document is also important.  When viewing a 
    document, only the value is of concern, but the structure that leads to that value may be essential to convenient editing.  An 
    example of structure in content is the grouping of text into paragraphs.  An example of structure in form is associating a 
    named "style" with a paragraph.
    Content: may be represented by structures built from character strings, numbers, Booleans, identifiers, and nodes, which are 
    structured objects containing simpler ones.
    Form: Interscript provides for open-ended sets of properties and attributes. Properties are associated with content by means of 
    tags.  Attributes are bindings between names and values that apply over some scope.  The way the contents of a document are to 
    be "understood" is determined by its properties; Interscript makes it straightforward to determine what these properties are 
    without having to understand them.
    Structure: Most editors structure the content of a document somehowinto paragraphs, sections, chapters; or lines, pages, 
    signatures, for example. This assists in obtaining private efficiency, but, more importantly, provides a conceptual structure 
    for the user.
    The most important, and most frequent, structuring mechanism between values is logical adjacency (sequentiality), which is 
    represented by simply putting them one after another in the script.
    Most editors that structure contents have a "dominant" hierarchy that maps well into trees whose arcs are implicitly labelled by 
    order. (Different editors use these trees to represent different hierarchies). Interscript provides a simple linear notation 
    for such trees, delimiting node values by braces ("{" and "}"). If an editor maintains multiple hierarchies, the dominant one 
    is the one transcribed into the primary tree structure and used to control the inheritance of attributes.
    Structures recorded for form use explicit indirection by means of names.  Interscript allows expressions composed of literals, 
    identifiers, and operators, and permits the use of identifiers to represent expressions.

2. The Language Basis: Syntax and Semantics
2
                                                               The Language Basis: Syntax and Semantics
                                                                                                                                    
   
2
    This chapter defines the Interscript Base Language.  Two versions of its syntax, a concrete one and an abstract one, are supplied.  
    The concrete grammar defines the publication encoding for Interscript, which is solely intended for communicating Interscript 
    concepts among people and is used for all examples of Interscript in this standard (chapter 7 defines the actual encoding to be 
    used for representing scripts for editing systems rather than humans).
    Section 2.3 defines the semantics of the base language.  These semantics have two primary functions: (1) to provide a rigorous 
    definition for equivalence of scripts, and (2) to show conceptually what information in any script an Interscript 
    implementation must represent in order to be able to internalize and externalize a script correctly.
2.1. Grammar
Our notation is basically BNF with terminals quoted and parentheses used for grouping.
    script    ::=   header node trailer
    header    ::=   "INTERSCRIPT/INTERCHANGE/1.0 "
    trailer    ::=   "ENDSCRIPT"
    node    ::=   "{" items "}"
    items    ::=   empty | items item
    item    ::=   tag | indirection | binding | sBinding | term | openedNode | scope
    tag    ::=   primary "$"
    openedNode    ::=   (term | name "%") "|"
    scope    ::=   "[" items "]"
    binding    ::=   name "_" term
    sBinding    ::=   name "%_" ( term | indirection |  "'" term "'" )
    term    ::=   primary | term op primary
    op    ::=   "+" | "" | "*" | "/" | "!" | "LT" | "EQ"
    primary    ::=   literal | invocation | node | "(" term ")"
    literal    ::=   name | number | string
    name    ::=   id | name "." id
    invocation    ::=   primary  "^"
    indirection    ::=   name "%"
    This small grammar defines a representation language for scripts that is small enough to admit of a precise denotational semantics 
    (section 2.3), yet powerful enough to include facilities for strong typing and data-type extensibility.  
    This surface syntax is intended only for use as a means of communications about Interscript.  It is a publication encoding and is 
    not necessarily intended as the actual encoding to be used for interchanging documents among editing systems.  Any such 
    encoding should pay attention both to the structure of this language and to the requirements for an easily decoded, robust, and 
    space-efficient representation.
2.2. Abstract Grammar for Interscript
    To associate a precise semantics with this language, it is easier to work from an abstract syntax for it, ignoring surface 
    punctuation and treating the parsed form of some external encoding of a script.  The following grammar serves this purpose as 
    well as indicating the reason for each of the parts of the language (in small print, like this).  However, the small print 
    should be viewed as hints about the semantics; in any conflict between the small print and the actual semantics, the semantics 
    is to be considered the correct version:
    script    ::=   node
    An entire script is represented as a single high level node with subnodes to carry the various parts of the logical and layout 
    structures.
    node    ::=   items
    items    ::=   empty | items item
    item    ::=   tag | binding | sBinding | term | indirection | openedNode | scope
    A node is the primary structured data type of the language and is used to represent both logical and layout structures.  The items 
    that make up a node may be tags, values, bindings, and expressions giving rise to tags, values, and bindings.  A node with 
    values but no bindings behaves like a vector; a node with bindings but no values behaves like a record; values and bindings may 
    be intermixed to capture both the content of a piece of a document and the bindings associated with that content (such as what 
    font a character is to be displayed in).
    tag    ::=   primary
    A node can have zero or more tags.  Each tag is associated with an invariant that all nodes with that tag must satisfy.  This 
    invariant includes what attributes (bindings) are relevant to the node, what types of contents are allowed for the node, and 
    any relations that must hold among the attributes of the node.  For example, a TEXT node is only allowed to have character 
    strings or PSEUDOCHAR nodes as contents, and font is a relevant attribute of a TEXT node.  An editor can tell whether or not it 
    has code to deal explicitly with a node simply by looking at the node's tags.   
    binding    ::=   name term
    A binding associates a value (the result of elaborating the term) with a name.  As bindings are encountered during internalization, 
    they are appended to the current environment.  Whenever the value associated with a name is needed, the environment is scanned 
    from most-recent to least-recent for a binding to that name (of course, this is a logical description, not necessarily how one 
    would implement environments).
    sBinding    ::=   name sRhs
    A structured binding is just like a normal binding except that the binding must be kept with the node in which it occurs, even if 
    it is not a relevant attribute of (a tag of) that node.  Only names that are structurally bound can be used in an indirection 
    (see below) or can be opened (see openedNode below).  An intended use of sBindings is for associating style information with a 
    name so that it can be applied to nodes while still maintaining the inherent indirection that would allow the style to be 
    changed and have the effect of the change propagate to all the nodes using it.
    sRhs    ::=   term | indirection | quoted
    A structured binding can associate a single value, another name, or a quoted expression with a name.  to get the effect of 
    structurally binding a set of items to a name, one can bind a node value to the name and then access the items using the 
    openedNode mechanism; e.g., list %_ {1 1 2 3 5 8}  list%| will result in the sequence 1 1 2 3 5 8 becoming items at the same 
    level as the list%|. 
    quoted    ::=   term
    A quoted expression must be a syntactically correct expression (this is not a macro facility!).  It will be evaluated in the 
    environment that obtains when used in an indirection (see below). 
    term    ::=   primary | term op primary
    op    ::=   ADD | SUB | MUL | DIV | SUBSCRIPT | LT | EQ
    primary    ::=   literal | invocation | node | term
    Interscript expressions are primitive with a minimal spanning set of operators.  Since scripts are expected to be written and read 
    by editing systems rather than by people, this is adequate.  An invocation replaces an occurrence of a name by its value.  A 
    node is a value like any other.
    literal    ::=   name | number | string
    A name standing by itself is atomic: it simply stands for itself, nothing more (an invocation is used if the name is to be 
    evaluated to produce a value).  Numbers may be either Fortran-like reals or integers.  Strings are characters in the ISO 646 
    subset bracketted by the double quote character (").
    name    ::=   id | name id
    A name is either a simple identifier (e.g., a, b, simpleId) or it is a qualified name (e.g., a.position.x, font.face.style).  If 
    the name is to be evaluated (for an invocation, an indirection, or an openedNode), then it is evaluated from left to right.  
    All the identifiers except the last in a name must evaluate to node values, which will then be treated as environments in which 
    to continue the name evaluation process.  For example, in a.position.x, a must be bound to a node value and must contain a 
    binding for position, which must also be a node value containing in turn a binding for x.
    invocation    ::=   primary
    When a name is invoked, it is evaluated as described under the name rule above, and the resultant value is used in place of the 
    invocation.
    indirection    ::=   name
    An indirection behaves just like an invocation for the purposes of evaluation.  In addition, the fact that the resulting value came 
    from an indirection must be remembered in case of changes to the binding for the name or for correct externalization of the 
    document as a script.
    openedNode    ::=   primary | name
    A node value can be bound to a name and then later "opened" in some context to place the items of the node into the environment in 
    which it is opened.  This can be used to introduce default values for tags, values, or bindings anywhere in a script.  If the 
    form name%| is used instead of name^|, then the semantics demand that the name of the node opened be maintained as well as the 
    result of opening it.  Such a structured open can be used to associate styles with parts of a document while guaranteeing that 
    any change to the style can be reflected in all the places that the style is used.
    scope    ::=   items
    A list of items can be put in scope brackets "[ ... ]" to ensure that any computation in the scope cannot affect the surrounding 
    environment except by maintaining any bindings that are structural (see sBinding).  Any contents that result from evaluating 
    the scope are considered part of the surrounding environment (just as if they had not been surrounded by scope brackets).
2.3. Formal Semantics for Interscript
    The semantics for the base language are defined by a function, S, whose domain is the set of parse trees corresponding to the 
    abstract grammar of section 2.2.  S is written in a variant of the Mesa programming language [Mesa] which has had added to it 
    the ability to define and manipulate lists of values (provided all the elements of a given list have the same type in the Mesa 
    sense).
    A type describing a list of items of type T can be defined as "LIST OF T" (in these semantics, T will always take the form "REF T'" 
    for some type T').  A list element can be created by the operation
    CONS: PROC[first, rest: LIST OF T] RETURNS[LIST OF T]
where T may be any type.  Given a list L, the first element can be accessed as L.first, and the tail of the list as L.rest.
2.3.1.  Semantics organized according to the abstract grammar
The semantics of the base language are described by the function S: 
S: PROC[
where a NonTerminal represents a node of an abstract parse tree (not defined here).
Vs: TYPE=LIST OF V
    V: TYPE = REF VRec
Conceptually, a VRec is a triple consisting of a mark, m, some contents, c, and sometimes extra information, x, which is primarily 
for recording a name (e.g., for a binding).  To get the maximum benefit of type checking the definition of S without encumbering 
the definition with a lot of extraneous type coercions, a Vrec is defined to contain a set of fields, ccontent (cIn any given Vrec at most one of the c fields and at most one of the x fields will actually be used; e.g., a tuple representing the 
number 3.14 has the form VRec[m: num, cNum: 3.14], and a tuple representing the name a.b.c as an atom has the form VRec[m: atom, 
xName: CONS[c, CONS[b, CONS[a, NIL]]] ].  Here are the complete definitions of VRec and Mark:
VRec: TYPE = RECORD[m: Mark,
    cVs: Vs_NIL, cV: V_NIL, cNum: Number_0.0, cStr: STRING_NIL, cId: Id_NIL,
    cTerm: NonTerminal_NIL,
    xId: Id_NIL, xName: Name_NIL, xEnv: Env_NIL];
Mark: TYPE= {atom, num, string, node,    -- base values
tag, bind, scope, bindStruc, evalStruc, onodeStruc, quotedTerm, vOfQ, evalSentinel};
S is defined in terms of various subsidiary semantic functions, which divide naturally into two groups: those that require an 
environment as a parameter and those that do not.  The type signatures of the functions in the two groups are
(1) functions with an environment parameter:
    MkNode: PROC[
    MkBinding: PROC[
    Lookup: PROC[
    EvalName: PROC[
    RelBindings: PROC[
(2) functions independent of environment (dealing only with tuples or lists of tuples):
    Apply: PROC[op: Op, v1: V, v2: V] RETURNS [result: V]
    Bindings: PROC[v: V] RETURNS [Env]
    Contents: PROC[vs: Vs] RETURNS [Vs]
    Tags: PROC[vs: Vs] RETURNS [Vs --tag items only--]
    Items: PROC[v: V] RETURNS [Vs]
    MkScope: PROC[vs: Vs] RETURNS [Vs]
    S is written in a "distributed" manner in the table below.  It associates the parts of S with non-terminals of the abstract 
    grammar.  If S were written as a single function, it would look like a large case statement with one arm for each non-terminal 
    of the abstract grammar and the lines under the column heading S[
        to a node.  Some of the semantic rules pass a modified     environment can be augmented by one or more bindings, bind*, denoted by ConcatVs[bind*, 
    Only the semantics for "interesting" alternatives of the grammar are given below.  For any alternative lhs ::= rhs whose semantics 
    are not presented, its value semantics are S[
    When S maps a nonterminal of the abstract grammar into its tuple form, it is said to be elaborated. 

Lefthand Side  Alternative        S[
script    ::=    node        S[X, node]
The environment for the root node of an abstracted script is the external environment X.
node    ::=    items        {t: Vs=S[            MkNode[ConcatVs[Bindings[t], 
Make a node from the list of items using the environment obtained by appending bindings in the node with the current environment.

items    ::=    empty        empty
             | items item    {t: VS=S[                     ConcatVs[t, S[ConcatVs[Bindings[t], 
Recursively elaborate items and then elaborate the last item using the environment formed by appending the bindings in the 
preceding items to the current environment.  Thus bindings affect things to their right in a list of items. 
tag    ::=    primary        MkTag[
Elaborate the primary.  That should produce a (possibly qualified) name to use as the tag.
scope    ::=    items        MkScope[S[
openedNode    ::=    term        Items[S[             | name        MkV[m: onodeStruc,
                     cVs: Items[EvalName[
If a term (non-structural openedNode), then use the Items function to pull all the necessary elaborated items from the node value 
that the term must elaborate to.  If a structural openedNode, embed the elaborated items of the named node in a tuple marked as an 
onodeStruc in order to remember the name from which the items were obtained.
binding    ::=    name term    MkBinding[
Elaborate the term in the current environment. and make a (nonstructural) binding between the name and the resultant value.  See 
MkBinding for details, especially for how values are bound to qualified names. 
sBinding    ::=    name sRhs    MkBinding[
Elaborate the term in the current environment. and make a structural binding between the name and the resultant value.  See 
MkBinding for details, especially for how values are bound to qualified names.
quoted    ::=    term        MkV[m: quotedTerm, cTerm: term]
Simply make a quotedTerm tuple to hold the term for later elaboration.
term    ::=    term op primary    apply[op, S[
Elaborate the term, then the primary, and then apply the op to the result.
literal    ::=    name        MkV[m: atom, xName: S[NIL, name]]
             | number    MkV[m: num, cNum: number]
             | string        MkV[m: string, cStr: string]
A name as a literal is just made into an atom tuple.  A numeric literal is stored as a num tuple.  And, a string literal is stored 
as a string tuple.
name    ::=    id        MkAtom[id]
             | name id    CONS[MkAtom[id], S[NIL, name]]
A qualified name is mapped into a list with the most highly qualified identifier first and the head, unqualified identifier last.
invocation    ::=    primary        EvalName[
Elaborate the primary to obtain a name to invoke (stored as the xName component of the resultant tuple).  Then use EvalName to 
lookup and evaluate the name.
indirection    ::=    name        MkV[m: evalStruc, cV: EvalName[            xName: S[NIL, name]]
Use EvalName to lookup and evaluate the name and embed the resultant value in an evalStruc tuple along with the name from which it 
came.  
<<2.3.2. Definitions of semantics as a Mesa function>>
<<Groundrule: Once constructed, no values or lists of values are ever modified (except in Lookup).  Therefore, they can be handed 
around without regard for being changed in situ later.>>
S: PROC[
NonTerminal: TYPE = REF --ParseNode--;
Error: ERROR[ec: {BoundsFault, InvalidTag, NotSingleton, UnboundId, WrongType}] = CODE; 
<<*******VALUES*******>>
V: TYPE = REF VRec;
Number: TYPE = REAL;
VRec: TYPE = RECORD[m: Mark,
    cVs: Vs_NIL, cV: V_NIL, cNum: Number_0.0, cStr: STRING_NIL, cId: Id_NIL, -- Contents
    cTerm: NonTerminal_NIL, -- used for representing quoted terms
    xId: Id_NIL, xName: Name_NIL, xEnv: Env_NIL];    -- eXtra information
Mark: TYPE= {atom, num, string, node,    -- base values
tag, bind, scope, bindStruc, evalStruc, onodeStruc, quotedTerm, vOfQ, evalSentinel};
MarkAsAtom: ARRAY Mark OF ATOM = [$atom, $num, $string, $node, $tag, $bind, $scope, $bindStruc, $evalStruc, $onodeStruc, 
$quotedTerm, $vOfQ, $evalSentinel];
Vs, Env: TYPE=LIST OF V;    -- for an Env, all items have mark bind or bindStruc
X: Env _ NIL;    -- stand-in for the eXternal environment
MkV: PROC[m: Mark, cVs: Vs_NIL, cV: V_NIL, cNum: Number_0.0, cStr: STRING_NIL,
cId: Id_NIL, cTerm: NonTerminal_NIL, xId: Id_NIL, xName: Name_NIL, xEnv: Env_NIL] RETURNS[V] = {    -- The basic function for 
making a tuple
    RETURN[NEW[VRec _ [m: m, cVs: cVs, cV: cV, cNum: cNum, cStr: cStr, cId: cId, cTerm: cTerm, xId: xId, xName: xName, xEnv: xEnv]]]};
True: V=MkV[m: num, cNum: 1.0];  False: V=MkV[m: num, cNum: 0.0];
TRUE and FALSE as tuples.
MkVs: PROC[v1, v2, v3: V_NIL] RETURNS [vs: Vs] = {
Make a list of Vs from individual Vs.
    vs_NIL; IF v3#NIL THEN vs_CONS[v3, vs];
    IF v2#NIL THEN vs_CONS[v2, vs]; IF v1#NIL THEN vs_CONS[v1, vs] };
ConcatVs: PROC[v1, v2, v3: Vs_NIL] RETURNS [vs: Vs] = {
Concatenate a number of individual lists into one list (in order v1, v2, v3).
AppendToT: PROC[list: Vs] = {
UNTIL list=NIL DO  t.rest _ CONS[list.first, NIL]; t _ t.rest;  list _ list.rest  ENDLOOP};
t: Vs _ (vs_CONS[NIL, NIL]);
AppendToT[v1]; AppendToT[v2]; AppendToT[v3]};
NthItem: PROC[vs: Vs, n: INT] RETURNS [V] = {    -- indexing into a list
    IF vs=NIL OR n<0 THEN Error[BoundsFault];
    RETURN[ IF n=0 THEN vs.first ELSE NthItem[vs.rest, n-1] ]};
Name, Ids: TYPE=LIST OF Id;  Id: TYPE = ATOM;
<<A Name is never empty; it always comes from the parser>>
MkName: PROC[a: ATOM, hd: Name_NIL] RETURNS[Name] = {RETURN[CONS[a, hd]]};
EqName: PROC[n1, n2: Name] RETURNS[BOOL] = {
    UNTIL n1=NIL OR n2=NIL DO
        IF n1.first#n2.first THEN RETURN[FALSE];
        n1_n1.rest; n2_n2.rest;
        ENDLOOP;
    RETURN[n1=n2]};
<<*******REAL WORK*******>>
MkScope: PROC[vs: Vs] RETURNS [Vs] = {RETURN[
Unless a scope contains some structural items, its contents simply become contents of the surrounding node.
    (IF HasStruc[vs] THEN MkVs[MkV[m: scope, cVs: Contents[vs]]] ELSE Contents[vs])]};
HasStruc: PROC[vs: Vs] RETURNS [BOOL] = {
Tests if a list of Vs has any structural items (binds, indirections, or structurally opened node).
    One: PROC[v: V] RETURNS [BOOL] = {
        RETURN[v.m=bindStruc OR v.m=evalStruc OR v.m=onodeStruc OR v.m=quotedTerm]};
    RETURN[ IF vs=NIL THEN FALSE ELSE (One[vs.first] OR HasStruc[vs.rest]) ]};
MkTag: PROC[
For every valid tag in a script there must be a tag definition (section 3.1.1).  That definition is given as a (structural) binding 
of a TAG$ node to the tag's name.  capture that definition in the tuple representing the tag along with its name (remember these 
are value semantics, not an implementation: a real implementation would avoid copying the definition and point to it somehow).
    tagDef: --node--V = EvalName[
    IF NOT HasTag[tagDef.cVs, $TAG] THEN Error[InvalidTag];
    RETURN[MkV[m: tag, cV: tagDef, xName: n]]};
MkNode: PROC[
The canonical form for a node is a list of all its tags followed by its contents, followed by a sequence of the relevant bindings 
for each of the tags.
    RETURN[MkV[node, ConcatVs[Tags[vs], Contents[vs], RelBindings[
The semantic procedures MkBinding, Lookup, EvalName, and Eval are a closely related set and provide the central definition of how 
names are used in Interscript and how bindings are handled.
MkBinding: PROC[RETURNS [--bind|bindStruc-- V] = {
The only difference between a _ and a %_ is that the tuple is marked as such.
    IF n.rest=NIL THEN RETURN[MkV[m: kind, cV: v, xId: n.first]]
    A binding to a simple id results in a simple tuple being made.
    ELSE {t: V=EvalName[
        RETURN[MkBinding[
    A binding to a qualified name is handled by copying the binding corresponding to the prefix of the name (it will be bound to a node 
    value) with a new binding tuple added to the end of the node value.  Any future use of EvalName for the name will find that 
    binding first.
    };
Lookup: PROC[
looking up a simple identifier in an environment is a simple process of marching down the environment list until a binding to that 
identifier is reached or the end of the list is encountered. 
    One: PROC[
        b: V _ IF 
        IF 
        see note in Eval to explain why the above statement hangs bindings off sentinels put into environments
        RETURN[b.cV]};
Looking up a qualified name requires that we first evaluate the prefix of the name without its last identifier using EvalName 
(which must yield a node value), then make an environment from the bindings of that prefix in which to look up the suffix 
identifier.
    RETURN[ IF n=NIL THEN NIL ELSE One[Bindings[EvalName[
EvalName: PROC[
Lookup the name and evaluate it (it might be a quoted term)
    RETURN[Eval[
Eval: PROC[
    IF v.m=quotedTerm THEN {
        ePlus: Env = CONS[MkV[m: evalSentinel, xId: NIL], 
        We push an evalSentinel tuple on the top of the environment so that lookup can collect all the bindings needed to evaluate this 
        quotedTerm.  The environment that ends up hanging off the evalSentinel tuple is then tucked away in the vOfQ tuple that 
        results from evaluating the quoted term.  This information is needed to correctly externalize an indirection that caused a 
        quoted term to be evaluated.  If the bindings in the xEnv component of the vOfQ tuple match those extant when the document 
        is externalized, then the editor can simply output the indirection that gave rise to the vOfQ tuple safe in the assurance 
        that its value is unchanged.
        RETURN MkV[m: vOfQ, cV: S[ePlus, v.cTerm].first, xEnv: ePlus.first.xEnv]}
    ELSE RETURN[BaseVal[v]]};
BaseVal: PROC[v: V] RETURNS [V] = { RETURN[
    (IF v.m=evalStruc THEN BaseVal[v.cV.cV -- it must be a vOfQ--] ELSE v)]};
Op: TYPE={ADD, SUB, MUL, DIV, LT, EQ, SUBSCRIPT};
IsInteger: PROC[v: V] RETURNS[BOOL] = {    -- check that a number is actually an integer
    IF v.m#num THEN Error[WrongType];
    RETURN[Real.Float[Real.Fix[v.cNum]]=v.cNum]};
Apply: PROC[op: Op, v1: V, v2: V] RETURNS [result: V] = {
apply the operator to the base values for v1 and v2 (i.e., evaluate any quoted terms that you encounter)
    vv1: V = BaseVal[v1]; vv2: V = BaseVal[v2];
    SELECT op FROM
        ADD, SUB, MUL, DIV, LT => {IF vv1.m#num OR vv2.m#num THEN Error[WrongType];
            RETURN[(SELECT op FROM
                ADD => MkV[m: num, cNum: (vv1.cNum+vv2.cNum)],
                SUB  => MkV[m: num, cNum: (vv1.cNum-vv2.cNum)],
                MUL => MkV[m: num, cNum: (vv1.cNum*vv2.cNum)],
                DIV  => MkV[m: num, cNum: (vv1.cNum/vv2.cNum)],
                LT   => (IF vv1.cNum<vv2.cNum THEN True ELSE False),
                ENDCASE => NIL)]};
        EQ   => {IF v1.m#v2.m THEN RETURN[False];
        Tuple marks have to be the same if they are to be equal
            SELECT vv1.m FROM
                node => RETURN[False];    -- two separate node values are never equal
                atom => RETURN[IF (vv1.cId=vv2.cId) THEN True ELSE False];
                num => RETURN[IF (vv1.cNum=vv2.cNum) THEN True ELSE False];
                string => RETURN[IF String.EqualString[vv1.cStr, vv2.cStr] THEN True ELSE False];
                ENDCASE => Error[WrongType]};
        SUBSCRIPT =>    -- use v2 as an index into the node v1
            IF NOT IsInteger[vv2] OR vv1.m#node THEN Error[WrongType]
            ELSE RETURN[NthItem[Contents[Items[vv1]], Real.Fix[vv2.cNum]]];
        ENDCASE};
Tags: PROC[vs: Vs] RETURNS [Vs --tag items only--] = {RETURN[(Sort[GetTags[vs]])]};
GetTags: PROC[vs: Vs] RETURNS [Vs --tag items only--] = {
    One: PROC[v: V] RETURNS[Vs] = {RETURN[(IF v.m=tag THEN MkVs[v] ELSE NIL)]};
    RETURN[ IF vs=NIL THEN NIL ELSE ConcatVs[One[vs.first], GetTags[vs.rest]] ]};
Sort: PROC[vs: Vs --tag items only--] RETURNS [sVs: Vs --tag items only--] = {--[note 2]--};
Items: PROC[v: V] RETURNS [Vs] = {
    IF v.m=node THEN RETURN[RawItems[v.cVs]] ELSE Error[WrongType]};
RawItems: PROC[vs: Vs] RETURNS[Vs] = {
Pull all the basic items out of a list, burrowing into evalStruc, onodeStruc, or scope tuples to extract their contained items
    One: PROC[v: V] RETURNS[Vs] = {
    Local procedure to look at a single tuple and pull items out of it if it is an evalStruc, onodeStruc, or a scope. Otherwise the 
    value is returned.
        RETURN[(SELECT v.m FROM
            evalStruc, onodeStruc => v.cVs,
            scope => RawItems[v.cVs],
            ENDCASE => MkVs[v])]};
    RETURN[IF vs=NIL THEN NIL ELSE ConcatVs[One[vs.first], RawItems[vs.rest]]]};
Bindings: PROC[v: V] RETURNS [Env] = {
Make an environment from the bindings in the node v.
    IF v.m=node THEN RETURN[GetBindings[Items[v]]] ELSE Error[WrongType]};
GetBindings: PROC[vs: Vs] RETURNS[
    One: PROC[v: V] RETURNS[
        RETURN[IF v.m=bind OR v.m=bindStruc THEN MkVs[v] ELSE NIL]};
    RETURN[IF vs=NIL THEN NIL ELSE ConcatVs[GetBindings[vs.rest], One[vs.first]]]};
RelBindings: PROC[
Find all the relevant bindings for the tags in vs and make an environment from them.
    One: PROC[
        IF v.m#tag THEN Error[WrongType];
        RETURN[RefineBindings[        EvalName[
    RETURN[IF vs=NIL THEN NIL ELSE ConcatVs[One[
RefineBindings: PROC[
For every identifier bound non-structurally in basis, place a binding in the result Vs.  If there is a binding in the environment 
for the identifier use that; if not, use the one for it in basis (which provides a default value).
    One: PROC[
        IF b.m#bind AND b.m#bindStruc THEN Error[WrongType];
        RETURN[IF b.m=bindStruc THEN NIL
            ELSE MkVs[MkV[
            m: bind, cV: Lookup[ConcatVs[
    RETURN[IF basis=NIL THEN NIL
        ELSE ConcatVs[RefineBindings[
PullDefaults: PROC[
For every binding in the environment return a binding that gives a default value for that identifier.  Every identifier in the 
incoming environment is bound to a TYPE$ node (because relBindings reaches into the attributes attribute for a TAG$ node to get 
these type definitions).  The result Vs is not an environment, but is in the same left-to-right order as the bindings occuring in 
the attributes attribute for the TAG$ node from which they came. 
    One: PROC[b: V] RETURNS [Env] = {
        IF b.cV.m=node AND HasTag[Items[b.cV], $TYPE] THEN
            RETURN[MkVs[MkV
            [m: bind, cV: Lookup[Bindings[b.cV], MkName[$default]], xId: b.xId]]]
        ELSE Error[WrongType]};
    RETURN[IF 
HasTag: PROC[vs: Vs, id: Id] RETURNS[BOOL] = {
Checks whether a given id occurs in the list of tags.
    RETURN[ IF vs.first.m=tag AND vs.first.cId=id THEN TRUE ELSE HasTag[vs.rest, id] ]};
Contents: PROC[vs: Vs] RETURNS [Vs] = {
The contents of a node are all the tuples in it except tags and non-structural bindings (it thus includes all other values, 
structural bindings, structured openedNodes, and indirections).
    One: PROC[v: V] RETURNS [Vs] = {
        RETURN[(IF v.m=tag OR v.m=bind THEN NIL ELSE MkVs[v])]};
    RETURN[ IF vs=NIL THEN NIL ELSE ConcatVs[One[vs.first], Contents[vs.rest]] ]};
2.3.3.  Notes for semantics
The semantics deal almost exclusively in immutable values, i.e., values which, once created are never changed.  As a result, it 
copies values prodigiously.  Any implementation of Interscript semantics should attempt to use pointers wherever possible to avoid 
extraneous copies and copying of values.
[1]:  All the relevant attributes for each tag are listed at the end of a node, even if the node's tags' relevant attribute sets 
are not disjoint.
To place some bindings in a node without regard for using a specific node type indicated by tags, each binding must be a structural 
binding; e.g.,
abbrev _ {a%_5 b%_7}
...
{ . . . abbrev^| . . .}
[2]: the only requirements on SORT are that it sort names correctly and eliminate any duplicates.  More precisely, if N* is a list 
of names to be sorted and S* is the result of sorting N* and removing duplicates, then
(
2.3.4  Equivalence of scripts
    A script s1 is equivalent to a script s2 if and only if S[X,s1]=S[X,s2].
One way to test Interscript-conforming editors might be to develop a set of test scripts, and, for each script in the set, check it 
for equivalence with a version of the script obtained by internalizing and then externalizing the script using the editor in 
question.

3. Tags, Node Invariants, and Safe Editing
3
                                                                        Tags, Node Invariants, and Safe Editing
                                                                                                                                    
     
3
    Interscript makes it possible for editors to manipulate the parts of documents they understand without altering parts they do not.  
    This section describes the facilities that make this possible and a set of safety rules that will enable editors to preserve 
    this property.
    The first part of this section gives the details of how node tags are "defined" in the standard.  The second part describes the 
    invariant associated with a node and how it is specified. And the third part shows how a tag's definition can be used by an 
    editor to treat nodes with that tag safely even if the editor does not specifically implement the tag.
3.1.  Tags, types, and node invariants
    This section uses the Interscript base language syntax and semantics to define precisely the properties of tags, which is 
    accomplished by binding a TAG node to the tag's name.  To make a tag definition precise, we will introduce the notion of type, 
    which is not embedded in the base language, but rather is defined using it.  Then it will actually be possible to define 
    (recursively) the properties of the tag TAG, including the notion of invariants for nodes, which can be used by an editor to 
    test whether an edit is legal, i.e., invariant-preserving. 
3.1.1.  Defining tags
    For every tag T used in a script there must be a corresponding definition of the important properties of that tag.  This is 
    accomplished by (structurally) binding it to a TAG node, which is used to capture some of T's properties so as to enable 
    editors to operate on T nodes without necessarily having knowledge of T built into the editor.  A TAG node defining T will have 
    the following relevant attributes:
    attributes: a TYPE node (see definition of TYPE below) with each relevant attribute of T bound to a TYPE node containing the 
    default value for the relevant attribute, and its type, which is a predicate on the attribute's value in a given node.  If no 
    attributes are specified, the default is that the tag has no relevant attributes.
    contentType: a TYPE node value specifying the type(s) of the contents allowed in a T node.  The default type is Any (see definition 
    of TYPE below).
    nodeInvariant: a predicate expressible as an Interscript (quoted) expression that must be true of any T node.  Its default value is 
    True.
    hasMoreInv: a Boolean indicating whether there is more invariant for T than can be expressed in T's TAG node; if hasMoreInv is 
    False then T's invariant is completely captured by the information in its TAG definition.  Its default value is True.
    requiredTags: an untagged node listing as atoms all the tags other than T that are required to be on a T node (this is a way of 
    coupling T's invariant to the invariants for other tags).  The default is that no other tags are required.
    reducesTo:  either a quoted expression of the form '{...}' that can be used to reduce a tag to an equivalent (but more basic) tag 
    by treating the appearance of T$ as T.reducesTo%|, or NIL if T does not reduce to any other tag.  The default is NIL. 
    tagOnly: a Boolean indicating that a node with this tag need only be viewed by its parent as a node consisting of just its tags for 
    the purpose of checking the parent's invariant.  The default is True. 
3.1.2  Defining types
In order to define the attributes and contentType attributes of a TAG definition, we must be able to define types.  This is done by 
having a standard tag, TYPE, which can be used to construct TYPE nodes to define types.  The definition of the tag TYPE can be 
"described" in Interscript (even though such a definition would actually be circular since TAG uses it):
TYPE%_{TAG$
attributes _{
code %_node
tags %_AtomList^
union %_ NIL    -- or a node with a list of TYPE$ nodes as contents
predicate %_{Predicate^ | default%_'1'}
default %_{Any^ | default_NIL --Minimal automatic default--} }
contentType _None^}
    That is, a TYPE node has the relevant attributes code, tags, union, predicate, and default, with the following interpretations:
    code must be an atom and one of atom, evalStruc, node, num, quotedTerm, string, corresponding to a subset of the marks on VRecs in 
    section 2.3.1.  Or, code can have the value Any, meaning that the type being declared may correspond to any of the marks above.
    tags is a list defining what tags a node of this type must possess.
    union, if not NIL, is a node value containing as contents a list of TYPE nodes; in this case, the type being defined is viewed as 
    having to satisfy one or more of the type definitions given in the union.
    predicate gives an Interscript quoted expression which a value must satisfy in order to be a valid instance of the type.  The 
    default value ('1') always yields True.
    default provides a convenient way of specifying a legal default value for a node of the type being defined.  
    A TYPE node has no contents (since it is basically a record with its relevant attributes as components), so its contentType has the 
    type None (see below).
    Note that the above definition of the tag TYPE is circular in that it "uses itself" to define the types of the relevant attributes 
    code, tags, union, predicate, and default.
    Here are a set of standard TYPE definitions for the basic Interscript data types and for types needed for defining tags and types:
    Bool% _ {TYPE$ code_num predicate%_'(A^ EQ 0) + (A^ EQ 1)'}
    Number% _ {TYPE$ code_num}
    String% _ {TYPE$ code_string}

    Atom% _ {TYPE$ code_atom default_0 --no default--}
    AtomList% _ {TYPE$ tags_{ATOMLIST} default_{}}    -- no default
    ATOMLIST% _ {TAG$ contentType_Atom^}

    Any% _ {TYPE$ code_Any}
    None% _ {TYPE$ predicate%_'0'}    --ever False, forbids any content
    Predicate% _ {TYPE$ code_quotedTerm}
    Type% _ {TYPE$ tags_{TYPE} default_Any^}

    To test if a given value v can be considered to satisfy a type with definition T, one can simply invoke the following HasType 
    procedure with the V representations of v and T.  HasType requires that
    (1)  either T.code is Any or T.code matches the mark of the value v; and
    (2)  if T.code is node, then v's tags must be a subset of the tags given in T.tags; and
    (3)  if T is a union type (T.union#NIL), then HasType[v, subtype] must be true for at least one of the subtypes given in the list 
    T.union; and
    (4)  v must satisfy T.predicate.
To help in understanding how HasType works, try testing the value of MkV[m: num, cNum: 2] against the type definition Bool above.  
HasType should return FALSE.
HasType: PROC[v: V, T: --node--V] RETURNS[r: BOOL _ FALSE] = {
    bt: Env = Bindings[T];
    IF NOT (EqName[Lookup[bt, MkName[$code]].xName, MkName[$Any]] OR EqName[Lookup[bt, MkName[$code]].xName, MkName[MarkAsAtom[v.m]]])
        THEN RETURN[FALSE];
    IF EqName[Lookup[bt, MkName[$code]].xName, MkName[$node]] THEN
        IF NOT Subset[Tags[v.cVs], Lookup[bt, MkName[$tags]].cVs] THEN RETURN[FALSE];
    FOR tt: Vs _ Lookup[bt, MkName[$union]].cVs, tt.rest UNTIL tt=NIL DO
        IF HasType[v, tt.first] THEN EXIT;  -- can stop checking unions as soon as one works
        REPEAT
            FINISHED => RETURN[FALSE];     -- v didn't satisfy any of the types in the union
        ENDLOOP;
    RETURN[ ApplyPred[arg: v, pred: Lookup[bt, MkName[$predicate]]] ]};
ApplyPred: PROC[arg: V, pred: V] RETURNS[t: BOOL] = {
    predEnv: Env = CONS[MkBinding[X, arg, MkName[$A], bind], X];   -- just the external environment plus A_arg
    r: V = Eval[
    RETURN[IF r.m#num THEN FALSE ELSE (r.cNum#0) ]};
Subset: PROC[sub, set: Vs--all atoms--] RETURNS[BOOL] = {
    One: PROC[a: V, set: Vs] RETURNS [BOOL] = { RETURN[
        (IF set=NIL THEN FALSE
        ELSE IF EqName[a.xName, set.first.xName] THEN TRUE ELSE One[a, set.rest])]}; 
    RETURN[IF sub=NIL THEN TRUE ELSE (One[sub.first, set] AND Subset[sub.rest, set])]};
3.2.  The invariant associated with a node
    A node in an internalized script is valid if it satisfies the complete invariant for the node.  A node's complete invariant depends 
    on what tags are on the node, the TAG definitions for those tags, and the actual contents of the node and bindings of the 
    relevant attributes of the node's tags.  The following function NodeInvariant evaluates as much of a node's complete invariant 
    as is possible without recourse to any external invariants associated with the node's tags.  NodeInvariant returns the value no 
    if a node is invalid, yes if it is valid, and checkExternalInvariant if it is valid modulo the external invariants of any of 
    its tags that have such. 
    NodeValidity: TYPE = {yes, no, checkExternalInvariant};
    NodeInvariant: PROC[n: --node--V] RETURNS[r: NodeValidity_yes] = {
        IF n.m#node THEN Error[WrongType];
        FOR t: Vs _ Tags[n.cVs], t.rest UNTIL t=NIL DO
            SELECT TagCorrect[n, t.first.cV] FROM
                no => RETURN[no];
                checkExternalInvariant => r _ checkExternalInvariant;
                ENDCASE;
            ENDLOOP;
        RETURN[r]};
    TagCorrect: PROC[n, tagType: --node--V] RETURNS[NodeValidity] = {
        bt: Env = Bindings[tagType];
        RETURN[ IF 
            CheckAttributes[bs: Bindings[n],
                types: Bindings[Lookup[bt, MkName[$attributes]]]] AND
            CheckContents[Contents[n.cVs], Lookup[bt, MkName[$contentType]].cV] AND
            Subset[Lookup[bt, MkName[$requiredTags]].cVs, Tags[n.cVs]]  AND
            ApplyPred[arg: Stripped[n], pred: Lookup[bt, MkName[$nodeInvariant]]]
                THEN (IF Lookup[bt, MkName[$hasMoreInv]].cNum=1 THEN yes
                ELSE checkExternalInvariant)
            ELSE no]};
    CheckAttributes: PROC[bs, types: Env] RETURNS[BOOL] = {
        One: PROC[b: --bind|bindStruc-- V] RETURNS[BOOL] = {
            t: --bind|bindStruc-- V = Lookup[types, MkName[b.xId]];
            RETURN[IF t=NIL THEN TRUE ELSE HasType[b.cV, t.cV]]};
        RETURN[IF bs=NIL THEN TRUE
        ELSE (One[bs.first] AND CheckAttributes[bs.rest, types])]};
    CheckContents: PROC[vs: Vs, type: V] RETURNS[BOOL] = {
        One: PROC[c: V] RETURNS[BOOL] = {
            RETURN[IF c.m=bindStruc THEN TRUE ELSE HasType[c, type]]};
        RETURN[IF vs=NIL THEN TRUE ELSE (One[vs.first] AND CheckContents[vs.rest, type])]};
    Stripped: PROC[n: --node--V] RETURNS[--node--V] = {
        RETURN[MkV[m: node, cVs: Strip[Items[n]]]]};
    Strip: PROC[vs: Vs] RETURNS[svs: Vs] = {
        One: PROC[v: V] RETURNS[Vs] = {
            SELECT v.m FROM
                bind =>  RETURN[MkVs[MkV[m: bind, cV: One[v.cV].first, xId: v.xId]]];
                node => RETURN[IF ShouldStrip[v] THEN MkVs[MkV[m: node, cVs: Tags[v.cVs]]]
                ELSE MkVs[Stripped[v]]];
                ENDCASE => RETURN[MkVs[v]]};
        RETURN[IF vs=NIL THEN NIL ELSE ConcatVs[One[vs.first], Strip[vs.rest]]]};
    ShouldStrip: PROC[v: --node--V] RETURNS[BOOL] = {
        <<all of v's tags must agree on tagOnly for it to be true>>
            tags: Vs = Tags[v.cVs];
            FOR tags: Vs _ Tags[v.cVs], tags.rest UNTIL tags=NIL DO
                tDef: V = tags.first.cV;
                btDef: Env = Bindings[tDef];
                IF Lookup[btDef, MkName[$tagOnly]].cNum#1 THEN RETURN[FALSE];
                ENDLOOP;
            RETURN[TRUE]};
3.3 Safety rules for editors
    Using TAG definitions and the concept of NodeInvariant, we can give some conservative rules for editors in treating parts of 
    documents that correspond to nodes in a script.  These are simply observations about how a node's complete invariant can be 
    affected by various editing operations.
Rendering a node
    An editor may display any node at all since this is not an editing operation. If it does not implement any of the tags on a given 
    node, the editor could still display the external form of the node in the Interscript publication format (since it has to be 
    able to externalize the document as a script anyway).  Of course, if an editor implements one or more of the tags on a node, it 
    can use those mechanisms to render the node and use the more basic, generic display mechanism of script externalization for the 
    attributes or content that it does not directly implement. 
Editing a node
All editing operations within a node can be composed from the following atomic operations:
(1)  replacing a value (whether of an attribute, an item of content, or the value of a structured binding) by another value of the 
correct type;
(2)  removing an item of content (includes structured bindings and subnodes)
(3)  adding an item of content (includes structured bindings and subnodes)
(4)  adding a tag to a node
(5) removing a tag from a node
In general, an atomic operation affects a subset of the tags on a node in the sense that it could potentially affect the invariants 
of those tags (e.g., if it changed the value of an attribute relevant to them).  A single-node sequence of operations is one that 
affects exactly one node.
    A sequence of atomic operations that maintains the invariants of all nodes affected by the edit is called valid.
    What valid single-node operations can an editor perform on nodes that it does not directly implement?  Here is a general, safe 
    mechanism:
    save a copy of the node somewhere;
    perform the operation sequence;
    if the node's invariant and its parent's invariant are true allow the edit, otherwise restore the original state.

6.  Other Constructs
6
                                                                                                            Other Constructs
                                                                                                                                    
     
6
6.1.  LABELs and references
    Nodes with tag LABEL$ provide "references" from one part of a script to another.  It has one relevant attribute, labels, which is a 
    list containing a mixed sequence of atoms (names) or tagged nodes.  The atoms are the actual labels on the LABEL node.  A 
    tagged node in the sequence provides both a label name (its tag name) and parameters (its relevant attributes)  A LABEL tag can 
    coexist with any other tag on a node.
    To denote a reference from one part of a script to another, you (the editing system that is externalizing some document) label the 
    node to be referred to with some atom.  Then you bind that same atom to a relevant attribute of the node that needs to "refer" 
    to the labeled node.
    When multiple nodes have the same label, it can be used to represent a relationship among those nodes (colors for matching the 
    logical and layout structures use labels this way).  It is assumed that an editor that implements labeled nodes will choose an 
    efficient implementation of them with some form of fast lookup to find a list of all the node representatives in the editable 
    form of the document.  The TAG definition for LABEL is  
    LABEL %_ {TAG$
        attributes _ {TYPE$
            labels %_ {TYPE$ code_node union_{Atom^ NodeList^} default _ {}}
            }
        }
          

Appendix A

GLOSSARY
A
                                                                                                                              
GLOSSARY
                                                                                                                                    
     
A

Italics indicate words defined in this glossary.
abbreviation    An invocation used to shorten a script, rather than to indicate structure
attribute    A (name, value) pair, identified by its name, which is bound to a value 
atom    A name denoting only itself.  Occurrences of the same atom in different parts of a script all denote the same primitive 
value.
base language    The part of the Interscript language that is independent of the semantics of particular tags
base semantics    The semantic rules that govern how scripts in the base language are elaborated to determine their contents and 
attributes 
binding        The operation of associating a value with a name; also the resulting association 
contents        The vector of values denoted by a node of a script 
document    The internalization of a script in a representation suitable for some editor
dominant structure    The tree structure of a document corresponding to the node structure of its script 
elaborate    (verb) To develop the semantics of a script or a node of a script according to the Interscript semantic rules. This is 
a left-to-right, depth-first processing of the script 
encoding        A particular representation of scripts.  The encoding presented in this standard is the publication encoding for 
writing about Interscript. 
environment    A value consisting of a sequence of attributes.  An environment may be either free-standing or nodal.  A 
free-standing environment is a structured value much like a record, with the components being the attributes of the environment.  A 
nodal environment is associated with a node of a script and represents the attributes bound in that node. 
expression    A syntactic form denoting a value 
external environment    A standard environment in whose scope an entire script is elaborated.  This environment is named X. 
externalization    The process of converting from a document to a script.
fidelity        The  extent  to  which  an externalization or internalization preserves contents, form, and structure
hierarchical name    A name containing at least one period, whose prefix unambiguously denotes the naming authority that assigned 
its meaning
identifier    A sequence of letters used to identify an attribute 
integer    A mathematical integer in a limited range 
indirection    The appearance of a name%, denoting the evaluation of name plus the necessity of remembering that the result came 
from evaluating name.
internalization    The process of converting from a script to a document; also the result of that process
Interscript    The name of this editable document standard 
invocation    The appearance of a name^ (see also indirection) 
literal        A representation of a value  of a primitive type in a script 
local binding    A binding of a value to a name, causing the current environment to be updated with the new attribute; any outer 
binding's scope will resume at the end of the innermost containing node  
name        A sequence of identifiers internally separated by periods; e.g., a.b.c 
nested environment    The initial environment of a node contained in another node 
node        Everything between a matched pair of {}s in a script; this may represent a branch point in a document's dominant 
structure, or it may simply be a structured value acting as a vector, a record-like value, or a mixture of the two.
number    The Interscript primitive type for representing numeric values.
primitive type    Boolean, number, string, or atom  
primitive value    A literal or a node, vector, or environment containing only primitive values  
property        Each tag on a node labels it with a property; the properties of a node determine how it may be viewed and edited  
publication encoding    The encoding for scripts used for examples to be read by people rather than editing systems. 
qualified name    see hierarchical name.
quoted expression    A value which is an expression bracketed by single quotes ("'"); the expression is evaluated by invoking the 
name (with or without indirection). 
real        A floating point number 
scope        The region of the script in which invocations of the attribute named in a binding yield its value; the scope starts 
textually at the end of the binding, and generally terminates at the end of the innermost containing node 
script        An Interscript program; the interchangeable result of externalizing a document 
string        A literal which is a sequence of characters bracketed by double quote marks, e.g., "This is a string!" 
style        A quoted expression to be invoked in a node to modify the node's environment, labels, or contents 
tag        An atom labelling a node using the syntax atom$; the properties of a node correspond to the set of tags labelling it 
value        A primitive value, node, or quoted expression 
X        The standard outer environment for an entire script
End of Glossary
Appendix B

Example of Script Evaluation
B
                                                                                    Example of Script Evaluation
                                                                                                                                    
     
B

This appendix contains a simple example of a script as it is 
(a) written in the publication encoding,
(b) represented in terms of the abstract grammar, and
(c) represented in terms of value items.
In the process of going from (a) to (b) the script is parsed; in going from (b) to (c) the parse tree is evaluated using the 
semantics defined in section 2.3.
The source script
{aTag$
    relV1 _ 0        -- relV1 is a relevant attribute of an X node
    v _ relV1^+5    -- v is not a relevant attribute of an X node
    q%_    -- bind a quoted expression to q, which is not a relevant attribute
        '{"FalseString" "TrueString"}!(relV1^ LT v^)'    -- conditional expression
    "content"        -- simple string as content
    q%        -- evaluate q (remembering that the result came from q)
    } 
For purposes of this example, assume that the tag X is defined as follows:
    aTag%_{TAG$ 
        attributes _ {
            relV1%_ Number^
            relV2%_ {String^| default_"relV2 default value"}}
        contentType _ String^
        reducesTo _ NIL}

Abstract version of the source script
script(
    node( items(items(items(items(items(items(
        item( tag(name(id(X))))            -- aTag$
        item( binding(name(id(relV1)) term(primary(number(0)))))    -- relV1 _ 0
        item( binding(            -- v _ relV1^+5
            name(id(v))
            term(
                term(primary(invocation(name(id(relV1)))))
                op(ADD)
                primary( number(1))
                )
            ))
        item( sBinding(        -- q%_
            name(id(q))
            sRhs(quoted(term(
                term(primary(
                    node(items(items(    -- '{"FalseString" "TrueString"}!(relV1^ LT v^)'
                        item(term(primary(literal(string("FalseString")))))
                        item(term(primary(literal(string("TrueString")))))
                        )))
                    ))
                op(SUBSCRIPT)
                primary(term(
                    term(primary(invocation(name(id(relV1)))))
                    op(LT)
                    primary(invocation(name(id(v))))
                    ))
                )))
            ))
        item( term(primary(literal(string("content")))))    -- "content"
        item( indirection(name(id(q))))        -- q%
        )))))))
    )

Elaboration of abstract script's semantics
[m: node,
    cVs: LIST[
        [m: tag, xId: aTag],
        [m: bindStruc,
            cV: [m: quotedTerm,
                term(
                    term(primary(
                        node(items(items(
                            item(term(primary(literal(string("FalseString")))))
                            item(term(primary(literal(string("TrueString")))))
                            )))
                        ))
                    op(SUBSCRIPT)
                    primary(term(
                        term(primary(invocation(name(id(relV1)))))
                        op(LT)
                        primary(invocation(name(id(v))))
                        ))
                    )],
            xId: q],
        [m: string, cStr: "content"],
        [m: evalStruc,
            [m: vOfQ,            -- note the carried-along environment
                [m: string, cStr: "TrueString"],
                xEnv: LIST[[m: bind, cV: [m: num,0], xId: relV1] [m: bind, cV: [m: num,5], xId: v]]
                ],
            xName: LIST[$q, NIL],
        [m: bind, cV: [m: num, 0], xId: relV1],        -- an explicit binding
        [m: bind, cV: [m: string, "relV2 default value"], xId: relV2]    -- from aTag's definition
        ]    -- end of cVs LIST
    ]       -- end of the node