2. The Language Basis: Syntax and Semantics
2.2. Discussion of Features
[Note that we have a formal semantic definition for this language that is every bit as precise as the grammar above. However, we have not yet figured out how to present it in a form that humans find equally palatable, so we have placed it in Appendix C.]
term ::= primary
primary ::= literal
literal ::= Boolean | integer | real | string | universal | NIL
Literals are the primitive elements by which the values in a document are represented. There are two constants of type Boolean, T (true) and F (false). Integer, real, and string literals are denoted as in InterPress <<REF>>. Universals are like Lisp atoms; a universal is denoted by an identifier formed from uppercase letters and digits only. A universal denotes only itself, nothing more. As well as its normal uses to denote null elements, NIL has the additional property that when used as a piece of content in a span, it disappears, as if nothing had been written there, i.e., {1 NIL 2} is the same as {1 2}, which is the same as {NIL 1 NIL 2 NIL NIL}, etc.
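The disappearing behavior of NIL can be illustrated with a minimal Python sketch. Here a span's contents are modeled as a Python list and NIL as None; both choices, and the helper name span_contents, are assumptions made for illustration and are not part of the language definition.

```python
NIL = None  # stand-in for the NIL literal (an assumption of this sketch)

def span_contents(items):
    """Return the contents of a span, dropping every NIL item."""
    return [item for item in items if item is not NIL]

# The three spans from the text above all have the same contents.
a = span_contents([1, NIL, 2])
b = span_contents([1, 2])
c = span_contents([NIL, 1, NIL, 2, NIL, NIL])
```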
term ::= primary op term
op ::= "+" | "-" | "*" | "/"
Both the primary and the term must reduce to numbers; the arithmetic operators are evaluated right-to-left (a la APL, without precedence) and bind less tightly than function application. The result is a real if either operand is real.
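To make the right-to-left rule concrete, the following Python sketch evaluates a flat list of numbers and operators without precedence. The function name and the list-of-tokens input are hypothetical. The text does not say what "/" yields for two integers, so this sketch simply uses Python's true division.

```python
def eval_right_to_left(tokens):
    """Evaluate [num, op, num, op, num, ...] right to left, with no precedence."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,  # integer/integer behavior is an assumption
    }
    # term ::= primary op term, so the rightmost operand is reduced first.
    result = tokens[-1]
    for i in range(len(tokens) - 2, 0, -2):
        operator = tokens[i]
        left = tokens[i - 1]
        result = ops[operator](left, result)
    return result

print(eval_right_to_left([2, "*", 3, "+", 4]))    # 2 * (3 + 4) = 14
print(eval_right_to_left([10, "-", 4, "-", 3]))   # 10 - (4 - 3) = 9
```

Under conventional precedence the first expression would be 10, and under left-to-right evaluation the second would be 3; the difference is what the rule above is meant to pin down.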
application ::= id
Id is looked up in the context of the current span (section <<LookupRule>> describes the lookup rule in detail, but it is basically dynamic scoping following the span nesting structure, with deeper and more-to-the-right bindings masking those higher and more-to-the-left in the tree corresponding to span nesting). Depending on its current binding, the value of id could be a piece of content, a binding, or a label; if the rhs bound to id was quoted, that expression is evaluated in the current context. In the outermost context X, every identifier is bound to the universal formed by replacing each letter of the identifier by its uppercase equivalent and each digit in the identifier by itself.
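The essentials of this lookup can be sketched in Python as a chain of nested contexts. This is a simplification under stated assumptions: the class Context and its methods are hypothetical, universals are modeled as uppercase strings, and the "more-to-the-right" masking within one span is modeled only by later bindings overwriting earlier ones.

```python
class Context:
    """One span's bindings, linked to the context of the enclosing span."""

    def __init__(self, outer=None):
        self.bindings = {}
        self.outer = outer  # None stands for "nothing beyond the outermost context"

    def bind(self, ident, value):
        self.bindings[ident] = value

    def lookup(self, ident):
        ctx = self
        while ctx is not None:
            if ident in ctx.bindings:
                return ctx.bindings[ident]  # deeper bindings mask outer ones
            ctx = ctx.outer
        # Unbound everywhere: the identifier denotes the universal formed by
        # uppercasing its letters (digits are left unchanged by str.upper).
        return ident.upper()

x = Context()                 # plays the role of the outermost context X
outer_span = Context(x)
inner_span = Context(outer_span)
outer_span.bind("font", "Times")
inner_span.bind("font", "Helvetica")
```

With these bindings, inner_span.lookup("font") yields the inner value, outer_span.lookup("font") still yields the outer one, and an unbound identifier such as margin2 yields the universal MARGIN2.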
application ::= name "." id
A qualified name represents lookup in the structured context of a span value; name must be bound to a span, in which context id is looked up.
application ::= universal
A value may be bound to a universal; this value is used only when the universal appears as a tag in a span. This mechanism can provide global default values, a sequence of bindings, or anything else for that universal; it provides nothing if the universal's global value is NIL.
application ::= ( name | universal ) "!" content
The dyadic operator "!" is used to pass arguments in an application. The value of the content is bound to the id value in a new, empty context nested within the current context, and then the name or universal is invoked. This nesting ensures that any bindings or side effects obtained while evaluating content (with the exception of global bindings) will not be visible in the context in which the application appears. If a quoted expression is bound to the name, its arguments can be accessed using the name value.
If the application involves a universal (either explicitly, or because the name is bound to a universal), the (builtin) function corresponding to that universal is applied to the argument value. Part of the definition of Layer 2 will involve the specification of a small set of standard functions, which may be expanded in various Layer 3 extensions.
In any application, the name may be followed by "%" to indicate that the indirection reflects structure that must be maintained in the document. Replacing the indirection by its value in the current context is a value-preserving loss of structural fidelity. (An invocation that is simply a name is an abbreviation that need not be preserved.)
conditional ::= "IF" term "THEN" item* "ELSE" item* ";"
This is a standard conditional item sequence. Writing item1* for the items after THEN and item2* for the items after ELSE, the value and effect are those of item1* if the term evaluates to T in the current context, and those of item2* if it evaluates to F. Only one of the two sequences is evaluated.
span ::= "{" item* "}"
Spans form nested contexts, and affect the containing context only through global (:=) bindings to ids.
item* ::= ""
The empty sequence of items has no value and no effect; this is the basis for the following recursive definition.
item* ::= item1 item*
In general, the value of a sequence of items is just the sequence of item values; bindings change the context of items to their right in the sequence.
subscription ::= primary "(" term ")"
Primary must evaluate to a span for subscription to be meaningful. The value of term must be a non-negative integer. Then the term'th piece of content in the span denoted by primary is extracted as the value of the subscription.
local ::= name "←" rhs
This adds a single binding to the current scope (i.e., to its associated context); bindings have no other "side effects" and no value (i.e., they do not change the length of a containing span value).
global ::= ( name | universal ) ":=" rhs
This adds a single binding to the outermost environment X. It makes sense to bind something to a universal only if the universal is a tag name (see tag below).
binding ::= name mode op term
"name mode op term" is just a convenient piece of syntactic shorthand for
"name mode name op term".
mode ::= "←" | ":="
A value can be bound to a name either locally ("←") in the environment of the span in which the binding appears, or globally (":=") in the environment of the root span of a script.
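The difference between the two modes can be sketched in Python. The class Env and its methods are hypothetical, and the sketch assumes that the top of the environment chain stands for the environment that receives global bindings.

```python
class Env:
    """Environment of one span, linked to the environment of its parent."""

    def __init__(self, outer=None):
        self.attrs = {}
        self.outer = outer

    def root(self):
        env = self
        while env.outer is not None:
            env = env.outer
        return env

    def bind(self, name, value, mode):
        if mode == "←":        # local: visible only in this span and below it
            self.attrs[name] = value
        elif mode == ":=":     # global: recorded at the top of the chain
            self.root().attrs[name] = value
        else:
            raise ValueError("unknown binding mode: " + mode)

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.attrs:
                return env.attrs[name]
            env = env.outer
        return None  # unbound; the uppercase-universal default is omitted here

root = Env()
first = Env(root)    # a span
second = Env(root)   # a sibling span to its right
first.bind("a", 1, "←")
first.bind("b", 2, ":=")
```

After these two bindings, the sibling span sees b but not a: second.lookup("b") is 2, while second.lookup("a") finds nothing.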
rhs ::= "'" item* "'"
A quoted rhs is evaluated in the environment of invocation, rather than the environment current at the point of binding.
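A small Python sketch may help contrast a quoted rhs with an ordinary one. Environments are modeled with collections.ChainMap and a quoted expression as a function of the invoking environment; the names Quoted and invoke are hypothetical.

```python
from collections import ChainMap

class Quoted:
    """A quoted expression: kept unevaluated until it is invoked."""

    def __init__(self, expr):
        self.expr = expr  # callable taking the invoking environment

def invoke(env, name):
    value = env[name]
    if isinstance(value, Quoted):
        return value.expr(env)  # evaluated in the environment of invocation
    return value                # already a value, fixed at binding time

outer = ChainMap({"size": 10})
outer["eager"] = outer["size"] * 2                   # evaluated now: 20
outer["lazy"] = Quoted(lambda env: env["size"] * 2)  # evaluated on invocation

inner = outer.new_child({"size": 12})                # nested span rebinding size

print(invoke(inner, "eager"))  # 20
print(invoke(inner, "lazy"))   # 24
```

The unquoted binding keeps the value computed where it was written, while the quoted one picks up the rebinding of size in the nested environment.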
rhs ::= "{" binding* "}"
This creates a new span value that may be used much like a record.
openSpan ::= "\" item
This opens the span value designated by item, essentially stripping away its bounding braces. For example,
os←{a←3 b←T} { \os c←B}
is equivalent to
{a←3 b←T c←B}.
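The splicing effect of opening a span can be sketched in Python for the record-like case, in which a span holds only bindings. The item encoding and the function build_span are assumptions made for illustration, as are the bound values.

```python
def build_span(items):
    """Build a record-like span from ('bind', name, value) and ('open', span) items."""
    env = {}
    for item in items:
        if item[0] == "bind":
            env[item[1]] = item[2]
        elif item[0] == "open":
            env.update(item[1])  # splice the opened span's bindings in place
        else:
            raise ValueError("unknown item kind: " + str(item[0]))
    return env

os_span = build_span([("bind", "a", 3), ("bind", "b", True)])
opened = build_span([("open", os_span), ("bind", "c", 5)])
written_out = build_span([("bind", "a", 3), ("bind", "b", True), ("bind", "c", 5)])
```

Here opened and written_out are equal, which is the equivalence the production above describes.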
tag ::= universal "$"
This gives the containing span the property denoted by the universal. It also looks for a binding to the universal in X, the outermost environment; if a binding exists, it is invoked in the current context. This gives an easy way to attach a tag to a span and provide a set of defaults associated with the tag.
link ::= "LINKS" id
This introduces the link set whose main name component is id, and defines its scope.
link ::= name "^"
This identifies the immediately containing span as a source of the link name (like a reference to the set of spans which are link targets).
link ::= name ":"
This identifies the immediately containing span as a target of each of the links that is a prefix of name. For example, the link target "id1.id2...idn:" would make the span containing it a target in the link sets for id1, id1.id2, ..., id1.id2...idn.
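The prefix rule for targets is easy to state in code. The following Python sketch lists the link sets that a given target label joins; the function name and the sample label are hypothetical.

```python
def target_link_sets(label):
    """Return every link name that is a prefix of the target label's name."""
    name = label.rstrip(":")          # drop the trailing target marker
    parts = name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]

print(target_link_sets("fig.chapter2.plot:"))
# ['fig', 'fig.chapter2', 'fig.chapter2.plot']
```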
APPENDIX A
GLOSSARY
Italics indicate words defined in this glossary.
abbreviation An invocation used to shorten a script, rather than to indicate structure
attribute A component of an environment, identified by its name, which is bound to a value
base language The part of the Interscript language that is independent of the semantics of particular properties and attributes
base semantics The semantic rules that govern how scripts in the base language are elaborated to determine their contents, environments, and labels
binding The operation of associating a value with a name to add an attribute to an environment; also the resulting association
binding mode A value may be bound to an identifier as local, const or global
Boolean An enumerated primitive type (F, T) used to control selection and as primitive values
const binding A binding of an attribute that prevents its being rebound in any contained scope
contents The vector of values denoted by a span of a script
definition Another name for a const binding
document The internalization of a script in a representation suitable for some editor
dominant structure The tree structure of a document corresponding to the span structure of its script
editor-specific name A non-standard name used by a specific editor in scripts it generates; an editor may use editor-specific terms without interfering with the interchangeability of a script if it provides definitions of the standard names in terms of its editor-specific names
elaborate (verb) To develop the semantics of a script or a span of a script according to the Interscript semantic rules. This is a left-to-right, depth-first processing of the script
encoding A particular representation of scripts
environment A value consisting of a set of attributes. An environment may be either free-standing or nodal. A free-standing environment is a structured value much like a record, with the components being the attributes of the environment. A nodal environment is associated with a span of a script and represents the attributes bound in that span.
expression A syntactic form denoting a value
external environment A standard environment relative to which an entire script is elaborated
externalization The process of converting from a document to a script; also the result of that process
fidelity The extent to which an externalization or internalization preserves contents, form, and structure
hexInt A component of a hexSequence formed from a pair of letters in the set {A,B,...,O,P}, and representing an integer in the range [0..256)
hexSequence A sequence of hexInt pairs enclosed between "#" pairs and used to encode characters in string literals, e.g., #ENCODE#
hierarchical name A name containing at least one period, whose prefix unambiguously denotes the naming authority that assigned its meaning
identifier A sequence of letters used to identify an attribute
integer A mathematical integer in a limited range; one of the primitive types
interchange encoding The standard encoding for scripts
internalization The process of converting from a script to a document; also the result of that process
Interscript The current name of this basis for an editable document standard
invocation The appearance of a name in an expression, except as the attribute of a binding
label A tag, or a source, a target, or a link introduction placed in a span
link The cross product of a source and a target; in general, a link is a set of (source, target) pairs; in the special case when there is exactly one source and one target, a link behaves like a directed arc between a pair of spans
link introduction The appearance of LINKS id in a span, where id is the main identifier of a link
literal A representation of a value of a primitive type in a script
local binding A binding of a value to a name, causing the current environment to be updated with the new attribute; any outer binding's scope will resume at the end of the innermost containing span
name A sequence of identifiers internally separated by periods; e.g., a.b.c
nested environment The initial environment of a span contained in another span
NIL A name for the empty value; it does not lengthen a vector or span in which it appears
span Everything between a matched pair of {}s in a script; this generally represents a branch point in a document's dominant structure
NULL Identifies the empty environment; the value it associates with any identifier is NIL
OUTER A standard attribute of every environment:
For a free-standing environment (i.e., a record-like, structured value), OUTER=NULL
For a nodal environment, OUTER's value is the environment of the current span's parent just prior to the start of the current span.
For the root span of a document, OUTER=X.
For X, OUTER=NULL
global binding A kind of binding (indicated by ":=") that modifies the environment of the root span of a document only, and hence may endure beyond the end of the current span and may be seen by spans to the right of the current span, even those not hierarchically descended from the current span.
primitive type Boolean, Integer, Real, String, or Universal
primitive value A literal or a span, vector, or environment containing only primitive values
private encoding One of a number of non-standard encodings of a script
property Each tag on a span labels it with a property; the properties of a span determine how it may be viewed and edited
quoted expression A value which is an expression bracketed by single quotes ("'"); the expression is evaluated in each environment in which the identifier to which it is bound is invoked
real A floating point number
scope The region of the script in which invocations of the attribute named in a binding yield its value; the scope starts textually at the end of the binding, and generally terminates at the end of the innermost containing span
script An Interscript program; the interchangeable result of externalizing a document
selection A conditional form in a script that denotes one of two expressions, depending on the value of a Boolean expression in the current environment
source The set of spans with REF link, which thereby refer to the set of target links.
string A literal which is a vector of characters bracketed by "<>", e.g., <This is a string!>
style A quoted expression to be invoked in a span to modify the span's environment, labels, or contents
Sub A standard component of each environment, which is implicitly invoked to initialize nested environments
SUBSCRIPT A function that can be used to extract a value from a vector,
e.g. SUBSCRIPT[(a b <str>), 3] is the value <str>
tag A universal name labelling a span using the syntax universal$; the properties of a span correspond to the set of tags labelling it
target The set of spans labelled with link:
transparency A characteristic of scripts that allows an editor to identify the spans of a script that it understands and thereby enables it to operate on those spans without disturbing the ones that it doesn't understand
Units A set of definitions relating various typographical and scientific units to the Interscript standard units, meters; e.g., inch=2.54E-2*meter, pt=.013836*inch
universal An identifier formed entirely of uppercase letters and digits
value A primitive value, span, vector, environment, universal, or quoted expression
vector An ordered sequence of values that may be subscripted
X The standard outer environment for an entire script; the value of an unbound identifier in X is the universal consisting of the same letters in upper case