File [_CD6_]<interscript>mail>interdoc-t-sep81.mail!1

*start*
22301 00024 USt
Date: 4 Sept. 1981 4:39 pm PDT (Friday)
From: Horning.pa
Subject: Current Level 0/1 Interdoc status/rev. 32
To: Mitchell, Horning, Lampson

[Candidate for circulation to Interdoc↑, with history log removed.]

Edited by Jim H. on 4 Sept. 1981 4:03 pm PDT (Friday).
	Compressed syntactic example.
	Re-order sections, in preparation for circulation to Interdoc↑.
	Label ::= Mark | Source | Target | Link
	R&T&P&M&U > R&B&M&S&T
	external names > universal names
	remove op as literal
	$ hidden #
	Properties(present/absent)/Attributes(have values and scopes)
	Format for description of standard properties
	Editor levels
		Internal definition/invocation
		Nesting of nodes
		Links
	Hints to implementors:
		environment restoration points
		coding tricks for common structures

-------------------
Open questions:
	We should rethink our character assignments.
		check our characterset for disjointness with
			Interpress.DoubtfulChars.
		enlarge op with a few more single-character operators?  %, &, \
	Better way to present semantics?

Not done:

-------------------

				STANDARD CARDS

1. WE ARE DESIGNING A STANDARD FOR INTERCHANGE, NOT EDITING.

2. EVERY SCRIPT IS PARSED BEFORE BEING INTERPRETED.

3. STANDARDIZE CONCEPTS, NOT NAMES.

4. COMBINING DOCUMENTS IS AN EDITOR, NOT AN INTERCHANGE, TASK.

5. GENSYM IS AN EDITOR FUNCTION.

6. CHARACTER-SET AND LEXICAL ISSUES CAN BE RESOLVED AT THE END.

7. GET THE TECHNICAL BASIS RIGHT BEFORE WORRYING ABOUT THE
POLITICAL ISSUES.


-------------------

We envision an Interdoc script being input and viewed in any manner
equivalent to the following:

Parse the entire script, repeatedly
- reducing each expression to its "dominant structure," containing only literals
and structural delimiters, by replacing identifiers by the values to which they
are bound in the current environment, by applying operators, and by removing
binding items,
- determining the properties of each node from its marks,
- recording the sources and targets for all links, and
- transforming the environment as indicated by the binding items (recording the
relevant attributes for each node in a form convenient to the particular editor).


				BASIC INTERDOC

SYNTACTIC EXAMPLE

{Book.example!		      -- Links to this from Book@ and Book.example@
ExampleParagraph				-- Invokes a definition
$UniqueMark12356#			-- Adds a mark
Font←[Font | size←10*pt face←bold]
factorial←'(LT[Value 2] | 1 | Value*factorial(Value-1))'
a:='NOT[EQ[margins.left factorial[5]]]' margins.right←100 r=12.5*pt
(a | margins.left←+5 margins.right←5 | margins.left+←10) -- conditional: Algol68
<text for this node>
}

-- Or, using short identifiers, and omitting comments and unnecessary spaces:

{b.e!ep um#f←[f|s←10*pt fc←bo]fa←'(l[Value 2]|1|Value*fa(Value-1))'
a:='n[e[m.l fa[5]]]'m.r←100 r=12.5*pt(a|m.l←+5 m.r←5|m.l+←10)<text for this node>}

GRAMMAR

item		::= value | binding | label
value		::= term | node | sequence
term		::= primary | primary op term
op		::= "+" | "" | "*" | "/"
primary	::= literal | invocation | application | selection
literal		::= Boolean | integer | hexint | real | string
invocation	::= name | universal
name		::= id ( "." id )*
id		::= letter ( letter | digit )*
universal	::= "$" name
application	::= invocation "[" item* "]"
selection	::= "(" term "|" item* "|" item* ")"
node		::= "{" item* "}"
sequence	::= "(" item* ")"
binding	::= name bindingMode rhs
bindingMode ::= "=" | ":" | ":=" | "←"
rhs		::= value | op term | "'" item* "'" | "[" [ invocation ] "|" binding* "]"
label		::= mark | link
mark		::= invocation "#"
link		::= name "@" | name "!" | id "@!"

SEMANTICS: REDUCTION AND BINDING

R: expression, environment > expression			-- Reduction
B: expression, environment > environment			-- Bindings
R&B<construct>(E) denotes the pair  R<construct>(E); B<construct>(E)

[Unless explicitly given below, B<construct>(E) = E.]

R<primary op term>(E) = R<primary>(E) op R<term>(E)

R<literal>(E) = literal

R&B<id>(E) = R&B<valOf(id, E)>(E)

R&B<name "." id>(E) = R&B<valOf(id, R<name>(E))>(E)

R<"$" name>(E) = "$" name

R<invocation "[" item* "]">(E) = apply(invocation, R<item*>(E), E)

R&B<"(" term "|" item1* "|" item2* ")">(E) =
			if R<term>(E) then R&B<item1*>(E) else R&B<item2*>(E)

R&B<"{" item* "}">(E) = "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}";
	      locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))

R&B<"(" item* ")">(E) = "(" R<item*>(E) ")" ; B<item*>(E)

R<>(E) = Nil

R&B<item1 item*>(E) = R<item1>(E) R<item*>(B<item1>(E));
					  B<item*>(B<item1>(E))

R&B<name mode rhs>(E) = Nil; bind(name, mode, R<rhs>(E), E)
	<name mode op term> = <name mode name op term>	-- Syntactic sugar

R<"'" item* "'">(E) = item*			--Usable only in rhs of binding

R<"[|" binding* "]">(E) = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]

R<"[" invocation "|" binding* "]">(E) =
	[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]

R<invocation "#">(E) = R<invocation>(E) "#"

R<link>(E) = link

	-- Subsidiary definitions for R&T

bindingOf(id, E) = locBinding(id, whereBound(id, E)) -- Gets innermost binding

valOf(id, E) = locVal(id, whereBound(id, E))		-- Gets innermost value

whereBound(id, E) =				-- Finds innermost binding
	locBinding(id, E) ~= None		=> E
	locBinding("Outer", E) ~= None	=> whereBound(id, locVal("Outer", E))
	True					=> Null

apply(invocation, value*, E) =
    CASE R<invocation>(E) OF
	"$equal"	=> value1 = value2
	"$greater"	=> value1 > value2
	. . .
	"$subscript"	=> value1[value2]	-- value1: sequence, value2: int
	"$marks"	=> "(" M<inner(value1)> ")"
	"$links"	=> "(" L<inner(value1)> ")"
	"$sources"	=> "(" S<inner(value1)> ")"
	"$targets"	=> "(" T<inner(value1)> ")"
	"$contents"	=> "(" C<inner(value1)> ")"
    ELSE	=> R<invocation>([[Null | "Outer" "=" E] | "Value" "=" value*])

inner("{" value* "}") = value*

bind(id, mode, value, E) =
	bindingOf(id, E) = "="	=> E			  -- Can't rebind constants
	mode = ":=" 			=> assign(id, value, E) -- Assign at right level
	True				=> [E | id mode value]

bind(id "." name, mode, value, E) =
		[E | id bindingOf(id, E) bind(name, mode, value, valOf(id, E))]

assign(id, value, E) =
	locBinding(id, E) = ":"	=> [E | id ":" value]
	bindingOf(id, E) = ":"	=> bind("Outer." id, ":=", value, E)
	True				=> E 			-- Can only assign to vars

NOTATION FOR ENVIRONMENTS

Environments bind identifiers to expressions, in various modes ("=", ":", ":=", "←"):
	Null denotes the "empty" environment
	[E | id m e] means "E with id mode m bound to e"
	locBinding(id, E) denotes the binding mode of id in E
		locBinding(id, Null) = None
		locBinding(id, [E | id' m e]) =
			if id=id' then m else locBinding(id, E)
	locVal(id, E) denotes the value locally bound to id in E
		locVal(id, Null) = Nil = ""
		locVal(id, [E | id' m e]) = if id=id' then e else locVal(id, E)

SEMANTICS OF MARKS, LINKS, CONTENTS

M: expression > expression			-- Sequence of marks
L: expression > expression			-- Sequence of links
S: expression > expression			-- Sequence of source links
T: expression > expression			-- Sequence of target links
C: expression > expression			-- Content of node

[These functions all return the non-value, Nil (which does not lengthen a
sequence), except as specified below.]

M<name "#"> = name
M<"(" item* ")"> = M<item*>
M<item1 item*> = M<item1> M<item*>

L<id "@!"> = id
L<"(" item* ")"> = L<item*>
L<item1 item*> = L<item1> L<item*>

S<name "!"> = prefixes(name)
S<"(" item* ")"> = S<item*>
S<item1 item*> = S<item1> S<item*>

T<name "@"> = name
T<"(" item* ")"> = T<item*>
T<item1 item*> = T<item1> T<item*>

C<value> = value
C<label> = Nil

prefixes(id) = id
prefixes(name "." id) = name "." id prefixes(name)



VALUE SPACE

Expressions in an Interdoc script may denote
	literal values:
		Booleans: (F, T)
		integers: ... -3, -2, -1, 0, 1, 2, 3, ...
		reals: 1.2E5, . . .
		strings: <this is a string>
		marks: Xerox.Laurel.Message#
		links: A123@!, Paragraph.Example!, anId@
		universal names: $Authority.Sub.id
	nodes
	sequences of values
	unevaluated expressions
	environments

DISCUSSION

Each environment, E, initially contains only its "inherited" environment (bound
to the id Outer).  Most bindings take place directly in E.  To allow for
"persistent" bindings, the value of a bind(id, ":=", val, E) will change E by
rebinding id in the "innermost" environment (following the chain of Outers) in
which it is bound, if that binding has the binding ":" (Var).  Identifiers bound
with binding "=" (Const) may not be rebound in inner environments.

If the rhs of a binding is surrounded by single quotes, it will be evaluated in
the environments where the name is invoked, rather than the environment in
which the binding is made.

When an id is referred to and locBinding(id, E)=None, then the value is sought
recursively in locVal("Outer").  The (implicit) "outermost" environment binds
each id to the "universal" name $id.

Nodes are delimited by brackets.  The contents of each node are implicitly
prefixed by Sub, which will generally be bound in the containing environment
to a quoted expression performing some bindings, and perhaps supplying some
labels (marks and links).

Parentheses are used to delimit sequence values.  Square brackets are used to
delimit the argument list of an operator application and to denote environment
constructors, which behave much like records.

Expressions involving the four infix ops (+, , *, /) are evaluated right-to-left
(a la APL); since we expect expressions to be short, we have not imposed
precedence rules.

The notation for selections (conditionals) is borrowed from Algol 68:
	( <test> | <true part> | <false part> )
This is consistent with our principles of using balanced brackets for compound
constructions and avoiding syntactically reserved words; the true part and false
part may each contain an arbitrary number of items (including none). 


A label N! on a node makes that node a "target" of the link N (and its prefixes);
a label N@ makes it a "source."  The "main" identifier of a link must be declared
(using id@!) at the root of a subtree containing all its sources and targets.  The
link represents a set of directed arcs, one from each of its sources to each of its
targets.  Multiple target labels make a node the target of multiple links.  A target
label that appears only on a single node places it in a singleton set, i.e.,
identifies it uniquely.


				OTHER NOTES

Conservative rules for editor treatment of Interdoc subtrees created by other
editors:
-It's OK to display a node if you understand at least one of its marks.
-It's OK to edit a node if you understand ALL of its (local) marks, and either
don't remove any of them or also understand ALL marks of its parent.
-It's OK to copy a node if that doesn't move any labels outside their scope, and
you understand ALL marks of its new parent.
-it's OK to delete a (subtree rooted at a) node if you understand ALL marks of its
parent.

The "view" of the dominant structure is ALWAYS controlled by the marks on its
nodes.  (E.g., text is not always there to be "shown.")  $hidden# means that a
node is intended not to be viewed at the point where it occurs in the dominant
structure; such nodes will generally be targets of links, placed remotely from
their sources (uses) to extract attributes from an environment (e.g., page or figure
numbers for references).

We will try to use the term "property" to refer to a characteristic associated with
a node by means of a label (mark or link) and "attribute" to refer to a
characteristic whose value is determined by the environment.  Properties are
either present (locally) or absent, whereas attributes commonly have values that
apply throughout a scope.

Level 2 of the standard will be primarily concerned with the definition of
standard properties that are expected to be shared among editors.  For each
standard property, it will describe
	- the associated universal mark that denotes it,
	- the assumptions that this mark implies about the contents of a node
		(values that must/may be present and their intended intepretation),
	- the assumptions that this mark implies about the environment of a node
		(attributes that must be present and their intended intepretation).

Attributes are "relevant" to a node if they are assumed by any of its marks.
In general, a node's environment will also contain bindings for many "latent"
attributes that are either relevant to its ancestors (and inherited by default) or
are potentially relevant to its descendants.

Attributes are inherited only via environments following the dominant
structure.  Thus the choice of a dominant structure to represent scripts from a
particular editor will be strongly influenced by expectations about inheritance. 

Put a value in contents if:			Put a value in environment if:
	effect is local to node			has scope
	is directly edited				is only indirectly edited
	is to be bound locally			needs delayed or global binding

The presentation of this material could be clarified by a table that relates
constructions in the notation to their intended uses and meanings.


				EDITOR LEVELS

Just as with Interpress, the Interdoc standard will apply to editors with varying
capabilities, and it will be important to define some structure to the space of
possibilities.  Dimensions in which we foresee reasonable variation are:

- Invocation/Application: Only built-in definitions - definitions in script.

- Dominant structure: Nodes cannot nest - nesting allowed.

- Other structure: No links - links allowed.


			HINTS TO IMPLEMENTORS

Environment restoration: Some constructs in the language are not allowed to have
a "net effect" on the environment, even though they may contain internal
bindings that will be in effect for some or all of their substructures.  Thus, a
program to process Interdoc scripts must make provision for restoration of saved
environments.  The amount of bookkeeping required to do this can be minimized
by the following observations:
- A rhs always produces a value (which may be an environment), which is
then bound to a name in the environment that obtained when the binding was
encountered.  Thus, occurrence of a bindingMode (=, :, :=, ←) signals a potential
restoration point.
- Any application (but not every invocation), any term containing infix ops,
and any mark will leave the environment unchanged.  Thus the appearance of
an operator, the bracket surrounding an argument list, or # signals a potential
restoration point; the environment to be saved is the one BEFORE the invocation
to its left.  (One-character lookahead before invocation is prudent.)
- Environments are NOT restored at the end of sequences.
- The environment at the end of a node is the final value of the Outer
component of the node's environment; unless the node contains persistent (:=)
bindings, this will be the same as the environment just prior to the node.

Coding tricks for common structures: 
- Attributes that tend to persist across boundaries in the dominant structure (e.g.,
Bravo character looks, which persist from paragraph to paragraph unless
explicitly changed by the user) should generally be represented by VAR (:)
bindings at a high-level node in the script; changes would then be represented
by persistent (:=) bindings at the points where they occur.
- Persistent bindings can also be used to record changes to "running" information
(e.g., section number, figure number).
- Information (e.g., marks) repeated in all subnodes of a node should be bound to
Sub, and hence associated with the subnodes implicitly.
- Very compact encodings can often be obtained by using single-character id's
for common sequences of items; each editor can have its own set, tuned to its
own usages.  When reading a script produced by a different editor, it will be
necessary to know the bindings for these abbreviations.


			CONSCIOUSLY POSTPONED

Lambda expressions.

Sets (Cf. Mitchell's Font example.)
	SET/LIST operators ($append $union ?)

Extend selection to CASE?


				HISTORY LOG

 Bring the syntax up front.
 Further develop parallelism between grammar and semantic equations.
 Write semantic equations in terms of concrete syntax.
 Quote general expressions.
 V, E, C > R, T, E .
 [...] > <...> for quotation of script expressions.
 (E | id←e, m) > [E | id←e, m] for local binding.
 Introduce primary to disambiguate expression* , factor lhs from binding.
 Introduce Sub component to initialize nodes.
 Debug semantics of braces and dot.
 Mode > binding.
 Debug semantics of <id> (fix up indirection).
 Add VAL. 

Edited by Mitchell, 30 July 1981 9:21 pm PDT (Thursday):  Changed
grammar to allow more complete expression syntax; couldn't use "<" or ">" as
operators because they delimit strings.  Moved history log to end of message.

Edited by Mitchell, 31 July 1981 12:20 pm PDT (Friday)
Simplified expression syntax.  Expressions with embedded binary operators are
simply interpreted in a right-to-left fashion; e.g., x←a*b+c means x←a*(b+c). 
Fixed up semantic equations to reflect this.  Exchanged the use of {}s and ()s.

Edited by Mitchell, 7 Aug. 1981 4:40 pm PDT (Friday)
Fixed error in semantics when exchanging the use of {}s and ()s.

Edited by Horning 13 Aug. 1981 4:47 pm PDT (Thursday).
	E(id) > locVal(id, E) 	--Remove conflict with f(E).
	Outer > "Outer"
	Const > "="
	id lookup rule modified (R & T<id>)
	[E | id←e, m] > [E | id m e]
	"." as infix op
	expressions are evaluated left-to-right (except for binding operator)
	Reverse VAL/ENV default for parens.
	bindq > bind
	binding > bindingMode
	expand definition of apply inline
	default T<construct>(E) = E
	add comments to semantic equations

-------------------
R<>(E) = Nothing						-- The empty expression

							-- Expression sequence
R<e1 e*>(E) = R<e1>(E) R<e*>(T<e1>(E))			-- List insert
T<e1 e*>(E) = T<e*>(T<e1>(E))				-- Composition

R<literal>(E) = literal

R<id>(E) = if bindingOf(id, E)=None then id else R<valOf(id, E)>(E)
T<id>(E) = if bindingOf(id, E)=None then E else T<valOf(id, E)>(E)

R<"IF(" e1 "," e2* "," e3* ")">(E) =
	if R<e1>(E) then R<e2*>(T<e1>(E)) else R<e3*>(T<e1>(E))
T<"IF(" e1 "," e2* "," e3* ")">(E) =
	if R<e1>(E) then T<e2*>(T<e1>(E)) else T<e3*>(T<e1>(E))

R<"NOT" p>(E) = if R<p>(E) then False else True

R<p1 op p2>(E) = 
	op = "." 	=> R<p2>([R<p1>(E) | "Outer" = E])
	op = "+"	=> R<p1>(E)+R<p2>(E)
	. . .

R<n m op e>(E) = Nothing						-- Empty list
T<n m e>(E) = bind(n, m, R<e>(E), E)
T<n m "'" e>(E) = bind(n, m, e, E)
T<n m op e>(E) = bind(n, m, R<n op e>(E), E)

R<"{" labels e* "}">(E) = "{" labels R<Sub e*>([Null | "Outer" = E]) "}"
T<"{" labels e* "}">(E) = locVal("Outer", (T<"ENV("Sub e*")">(E)))

R<"(" e* ")">(E) = R<e*>(E)

R<"ENV(" e* ")">(E) = [T<"ENV(" e* ")">(E) | "Outer" = Null]
T<"ENV(" e* ")">(E) = T<e*>([Null | "Outer" = E])

-------------------

Edited by Jim Horning 17 Aug. 1981 10:49 am PDT (Monday)
	R&T<>
	Nothing > ""

Edited by Jim H. on 17 Aug. 1981 4:58 pm PDT (Monday)
	Remove side-effects from all expressions.
	Parentheses purely for grouping (don't hide environment transformations).
	#label > label !
	labels within nodes

Edited by Jim H. on 19 Aug. 1981 9:52 am PDT (Wednesday).
	Rewrite <n m op e> as syntactic sugar.
	structured labels
	re-introduce apply function in R&T<p1 op p2>
	correct syntax for "."
	% for opening an environment (also replaces ENV?)

Edited by Jim H. on 19 Aug. 1981 6:55 pm PDT (Wednesday).
	Drop "%"; ENV() is now the only environment-constructing operator.
	Add SUB operator (first operand: sequence only, second: number only).
	Add atoms, as distinct from ids.
	Fix lhs op rhs syntax.

Edited by Jim H. on 20 Aug. 1981 5:29 pm PDT (Thursday).
	resolve pending questions as per message of 20 Aug. 1981 12:29 pm PDT.
	distinguish syntactically between properties (marks) and labels.
	only the "main" id of a label is declarable.
	eliminate  as an id character.
	eliminate op ids from grammar.
	restructure the grammar for "functional" notation for operators.
	update semantic equations for new grammar, etc.
	fix treatment of unbound qualified names (now produce Nil).

Edited by Jim H. on 21 Aug. 1981 6:58 pm PDT (Friday).
	restore $val.
	move quoting to rhs, allow quoted primaries without parentheses.
	allow an op to be the rhs of a definition.
	eliminate the functions operate, apply, eval by back substitution.
	change semantics of () to allow "record" construction without $env.

Edited by Jim H. on 24 Aug. 1981 6:08 pm PDT (Monday).
	"It's OK to edit a node if you understand ALL of its (local) properties, and
		either don't remove any of them or also understand ALL properties
		of its parent."
	"Put in contents if:				Put in environment if: ..."
	Add connection syntax to syntactically rule out a+←'b.

Edited by Jim H. on 25 Aug. 1981 11:33 am PDT (Tuesday).
	Syntactically separate label references and name invocation.
	Put in distinct syntax in rhs for environment construction.
	Informal semantics of labels.
	( ... ) > [ ... ] in applications; permitting ( ... ) as a primary.

Edited by Jim H. on 25 Aug. 1981 4:08 pm PDT (Tuesday).
	Add sequence as a nonterminal to the syntax.
	State the formal semantics of labels and properties.
	Reorder presentation (hopefully to improve readability).

Edited by Jim H. on 28 Aug. 1981 2:08 pm PDT (Friday).
	' ... ' in rhs
	Restore infix operators, right to left.
	Modify syntax to rule out more nonsense, add semantically meaningful
		nonterminals.
	Introduce special syntax for selections.
	Eliminate side-effects for $subscript (actually, all applications).
	Add application of defined functions.
	Note that Value[ ... ] allows use of temporary (hidden) local definitions,
		Nil[ ... ] allows placement of hidden nodes.
	( ... ) creates list/sequence values (without hiding bindings).
	Tidy up definition of assign, using bind("Outer." ...).
	Introduce value nonterminal into grammar (rule out more nonsense).
	rhs	::= ... | "[" [ lookup ] "|" binding* "]" .
	Change nonterminal lookup to invocation.
	Remove $ name from literal (to invocation).
	Add node operators:
		$properties						-- All #'s
		$marks						-- All !'s
		$references						-- All @'s
		$contents						-- The rest (fringe)
	Restrict $subscript just to sequences, not nodes.


*start*
21130 00024 USt
Date: 8 Sept. 1981 2:10 pm PDT (Tuesday)
From: Horning.pa
Subject: Interdoc Level 0/1 DRAFT and notes
To: Mitchell, Horning, Lampson, Guttag

[Sorry, meant to send this out on Friday afternoon.]
[Based on "Current Level 0/1 Interdoc status/rev. 32" of 4 Sept. 1981.]
[Circulation to Interdoc↑ planned for Wednesday morning, 9 Sept. 1981.]

				STANDARD CARDS

1. WE ARE DESIGNING A STANDARD FOR INTERCHANGE, NOT EDITING.

2. EVERY SCRIPT IS PARSED BEFORE BEING INTERPRETED.

3. STANDARDIZE CONCEPTS, NOT NAMES.

4. COMBINING DOCUMENTS IS AN EDITOR, NOT AN INTERCHANGE, TASK.

5. GENSYM IS AN EDITOR FUNCTION.

6. CHARACTER-SET AND LEXICAL ISSUES CAN BE RESOLVED AT THE END.

7. GET THE TECHNICAL BASIS RIGHT BEFORE WORRYING ABOUT THE
POLITICAL ISSUES.


-------------------

We envision an Interdoc script being input and viewed in any manner
equivalent to the following:

Parse the entire script, repeatedly
- reducing each expression to its "dominant structure," containing only literals
and structural delimiters, by replacing identifiers by the values to which they
are bound in the current environment, by applying operators, by evaluating
selections, and by removing binding items,
- determining the properties of each node from its marks,
- recording the sources and targets for all links, and
- transforming the environment as indicated by the binding items (recording the
relevant attributes for each node in a form convenient to the particular editor).


				BASIC INTERDOC

SYNTACTIC EXAMPLE

{Book.example!		      -- Links to this from Book@ and Book.example@
ExampleParagraph				-- Invokes a definition
$UniqueMark12356#			-- Adds a mark
Font←[Font | size←10*pt face←bold]
factorial←'(LT[Value 2] | 1 | Value*factorial(Value-1))'
a:='NOT[EQ[margins.left factorial[5]]]' margins.right←100 r=12.5*pt
(a | margins.left←+5 margins.right←5 | margins.left+←10) -- conditional: Algol68
<text for this node>
}

-- Or, using short identifiers, and omitting comments and unnecessary spaces:

{b.e!ep um#f←[f|s←10*pt fc←bo]fa←'(l[Value 2]|1|Value*fa(Value-1))'
a:='n[e[m.l fa[5]]]'m.r←100 r=12.5*pt(a|m.l←+5 m.r←5|m.l+←10)<text for this node>}

GRAMMAR

item		::= value | binding | label
value		::= term | node | sequence
term		::= primary | primary op term
op		::= "+" | "" | "*" | "/"
primary	::= literal | invocation | application | selection
literal		::= Boolean | integer | hexint | real | string
invocation	::= name | universal
name		::= id ( "." id )*
id		::= letter ( letter | digit )*
universal	::= "$" name
application	::= invocation "[" item* "]"
selection	::= "(" term "|" item* "|" item* ")"
node		::= "{" item* "}"
sequence	::= "(" item* ")"
binding	::= name bindingMode rhs
bindingMode ::= "=" | ":" | ":=" | "←"
rhs		::= value | op term | "'" item* "'" | "[" [ invocation ] "|" binding* "]"
label		::= mark | link
mark		::= invocation "#"
link		::= id "@!" | name "@" | name "!"

SEMANTICS: REDUCTION AND BINDING

R: expression, environment > expression			-- Reduction
B: expression, environment > environment			-- Bindings
R&B<construct>(E) denotes the pair  R<construct>(E); B<construct>(E)

[Unless explicitly given below, B<construct>(E) = E.]

R<primary op term>(E) = R<primary>(E) op R<term>(E)

R<literal>(E) = literal

R&B<id>(E) = R&B<valOf(id, E)>(E)

R&B<name "." id>(E) = R&B<valOf(id, R<name>(E))>(E)

R<"$" name>(E) = "$" name

R<invocation "[" item* "]">(E) = apply(invocation, R<item*>(E), E)

R&B<"(" term "|" item1* "|" item2* ")">(E) =
			if R<term>(E) then R&B<item1*>(E) else R&B<item2*>(E)

R&B<"{" item* "}">(E) = "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}";
	      locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))

R&B<"(" item* ")">(E) = "(" R<item*>(E) ")" ; B<item*>(E)

R<>(E) = Nil

R&B<item1 item*>(E) = R<item1>(E) R<item*>(B<item1>(E));
					  B<item*>(B<item1>(E))

R&B<name mode rhs>(E) = Nil; bind(name, mode, R<rhs>(E), E)
	<name mode op term> = <name mode name op term>	-- Syntactic sugar

R<"'" item* "'">(E) = item*			--Usable only in rhs of binding

R<"[|" binding* "]">(E) = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]

R<"[" invocation "|" binding* "]">(E) =
	[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]

R<invocation "#">(E) = R<invocation>(E) "#"

R<link>(E) = link

	-- Subsidiary definitions for R&T

bindingOf(id, E) = locBinding(id, whereBound(id, E)) -- Gets innermost binding

valOf(id, E) = locVal(id, whereBound(id, E))		-- Gets innermost value

whereBound(id, E) =				-- Finds innermost binding
	locBinding(id, E) ~= None		=> E
	locBinding("Outer", E) ~= None	=> whereBound(id, locVal("Outer", E))
	True					=> Null

apply(invocation, value*, E) =
    CASE R<invocation>(E) OF
	"$equal"	=> value1 = value2
	"$greater"	=> value1 > value2
	. . .
	"$subscript"	=> value1[value2]	-- value1: sequence, value2: int
	"$marks"	=> "(" M<inner(value1)> ")"
	"$links"	=> "(" L<inner(value1)> ")"
	"$sources"	=> "(" S<inner(value1)> ")"
	"$targets"	=> "(" T<inner(value1)> ")"
	"$contents"	=> "(" C<inner(value1)> ")"
    ELSE	=> R<invocation>([[Null | "Outer" "=" E] | "Value" "=" value*])

inner("{" value* "}") = value*

bind(id, mode, value, E) =
	bindingOf(id, E) = "="	=> E			  -- Can't rebind constants
	mode = ":=" 			=> assign(id, value, E) -- Assign at right level
	True				=> [E | id mode value]

bind(id "." name, mode, value, E) =
		[E | id bindingOf(id, E) bind(name, mode, value, valOf(id, E))]

assign(id, value, E) =
	locBinding(id, E) = ":"	=> [E | id ":" value]
	bindingOf(id, E) = ":"	=> bind("Outer." id, ":=", value, E)
	True				=> E 			-- Can only assign to vars

NOTATION FOR ENVIRONMENTS

Environments bind identifiers to expressions, in various modes ("=", ":", ":=", "←"):
	Null denotes the "empty" environment
	[E | id m e] means "E with id mode m bound to e"
	locBinding(id, E) denotes the binding mode of id in E
		locBinding(id, Null) = None
		locBinding(id, [E | id' m e]) =
			if id=id' then m else locBinding(id, E)
	locVal(id, E) denotes the value locally bound to id in E
		locVal(id, Null) = Nil = ""
		locVal(id, [E | id' m e]) = if id=id' then e else locVal(id, E)

SEMANTICS OF MARKS, LINKS, CONTENTS

M: expression > expression			-- Sequence of marks
L: expression > expression			-- Sequence of links
S: expression > expression			-- Sequence of source links
T: expression > expression			-- Sequence of target links
C: expression > expression			-- Content of node

[These functions all return the empty string, Nil (which does not lengthen a
sequence), except as specified below.]

M<name "#"> = name
M<"(" item* ")"> = M<item*>
M<item1 item*> = M<item1> M<item*>

L<id "@!"> = id
L<"(" item* ")"> = L<item*>
L<item1 item*> = L<item1> L<item*>

S<name "!"> = prefixes(name)
S<"(" item* ")"> = S<item*>
S<item1 item*> = S<item1> S<item*>

T<name "@"> = name
T<"(" item* ")"> = T<item*>
T<item1 item*> = T<item1> T<item*>

C<value> = value
C<label> = Nil

prefixes(id) = id
prefixes(name "." id) = name "." id prefixes(name)



VALUE SPACE

Expressions in an Interdoc script may denote
	literal values:
		Booleans: (F, T)
		integers: ... -3, -2, -1, 0, 1, 2, 3, ...
		reals: 1.2E5, . . .
		strings: <this is a string>
		marks: Xerox.Laurel.Message#
		links: A123@!, Paragraph.Example!, anId@
		universal names: $Authority.Sub.id
	nodes
	sequences of values
	unevaluated expressions
	environments

DISCUSSION

Each environment, E, initially contains only its "inherited" environment (bound
to the id Outer).  Most bindings take place directly in E.  To allow for
"persistent" bindings, the value of a bind(id, ":=", val, E) will change E by
rebinding id in the "innermost" environment (following the chain of Outers) in
which it is bound, if that binding has the binding ":" (Var).  Identifiers bound
with binding "=" (Const) may not be rebound in inner environments.

If the rhs of a binding is surrounded by single quotes, it will be evaluated in
the environments where the name is invoked, rather than the environment in
which the binding is made.

When an id is referred to and locBinding(id, E)=None, then the value is sought
recursively in locVal("Outer").  The (implicit) "outermost" environment binds
each id to the "universal" name $id.

Nodes are delimited by brackets.  The contents of each node are implicitly
prefixed by Sub, which will generally be bound in the containing environment
to a quoted expression performing some bindings, and perhaps supplying some
labels (marks and links).

Parentheses are used to delimit sequence values.  Square brackets are used to
delimit the argument list of an operator application and to denote environment
constructors, which behave much like records.

Expressions involving the four infix ops (+, , *, /) are evaluated right-to-left
(a la APL); since we expect expressions to be short, we have not imposed
precedence rules.

The notation for selections (conditionals) is borrowed from Algol 68:
	( <test> | <true part> | <false part> )
This is consistent with our principles of using balanced brackets for compound
constructions and avoiding syntactically reserved words; the true part and false
part may each contain an arbitrary number of items (including none). 


A label N! on a node makes that node a "target" of the link N (and its prefixes);
a label N@ makes it a "source."  The "main" identifier of a link must be declared
(using id@!) at the root of a subtree containing all its sources and targets.  The
link represents a set of directed arcs, one from each of its sources to each of its
targets.  Multiple target labels make a node the target of multiple links.  A target
label that appears only on a single node places it in a singleton set, i.e.,
identifies it uniquely.


				OTHER NOTES

Conservative rules for editor treatment of Interdoc subtrees created by other
editors:
-It's OK to display a node if you understand at least one of its marks.
-It's OK to edit a node if you understand ALL of its (local) marks, and either
don't remove any of them or also understand ALL marks of its parent.
-It's OK to copy a node if that doesn't move any labels outside their scope, and
you understand ALL marks of its new parent.
-it's OK to delete a (subtree rooted at a) node if you understand ALL marks of its
parent.

The "view" of the dominant structure is ALWAYS controlled by the marks on its
nodes.  (E.g., text is not always there to be "shown.")  $hidden# means that a
node is intended not to be viewed at the point where it occurs in the dominant
structure; such nodes will generally be targets of links, placed remotely from
their sources (uses) to extract attributes from an environment (e.g., page or figure
numbers for references).

We will try to use the term "property" to refer to a characteristic associated with
a node by means of a label (mark or link) and "attribute" to refer to a
characteristic whose value is determined by the environment.  Properties are
either present (locally) or absent, whereas attributes commonly have values that
apply throughout a scope.

Level 2 of the standard will be primarily concerned with the definition of
standard properties that are expected to be shared among editors.  For each
standard property, it will describe
	- the associated universal mark that denotes it,
	- the assumptions that this mark implies about the contents of a node
		(values that must/may be present and their intended intepretation),
	- the assumptions that this mark implies about the environment of a node
		(attributes that must be present and their intended intepretation).

Attributes are "relevant" to a node if they are assumed by any of its marks.
In general, a node's environment will also contain bindings for many "latent"
attributes that are either relevant to its ancestors (and inherited by default) or
are potentially relevant to its descendants.

Attributes are inherited only via environments following the dominant
structure.  Thus the choice of a dominant structure to represent scripts from a
particular editor will be strongly influenced by expectations about inheritance. 

Put a value in contents if:			Put a value in environment if:
	effect is local to node			has scope
	is directly edited				is only indirectly edited
	is to be bound locally			needs delayed or global binding

The presentation of this material could be clarified by a table that relates
constructions in the notation to their intended uses and meanings.


				EDITOR LEVELS

Just as with Interpress, the Interdoc standard will apply to editors with varying
capabilities, and it will be important to define some structure to the space of
possibilities.  Dimensions in which we foresee reasonable variation are:

- Invocation/Application: Only built-in definitions - definitions in script.

- Dominant structure: Nodes cannot nest - nesting allowed.

- Other structure: No links - links allowed.


			HINTS TO IMPLEMENTORS

Environment restoration: Some constructs in the language are not allowed to have
a "net effect" on the environment, even though they may contain internal
bindings that will be in effect for some or all of their substructures.  Thus, a
program to process Interdoc scripts must make provision for restoration of saved
environments.  The amount of bookkeeping required to do this can be minimized
by the following observations:
- A rhs always produces a value (which may be an environment), which is
then bound to a name in the environment that obtained when the binding was
encountered.  Thus, occurrence of a bindingMode (=, :, :=, ←) signals a potential
restoration point.
- Any application (but not every invocation), any term containing infix ops,
and any mark will leave the environment unchanged.  Thus the appearance of
an operator, the bracket surrounding an argument list, or # signals a potential
restoration point; the environment to be saved is the one BEFORE the invocation
to its left.  (One-character lookahead before invocation is prudent.)
- Environments are NOT restored at the end of sequences.
- The environment at the end of a node is the final value of the Outer
component of the node's environment; unless the node contains persistent (:=)
bindings, this will be the same as the environment just prior to the node.

Coding tricks for common structures: 
- Attributes that tend to persist across boundaries in the dominant structure (e.g.,
Bravo character looks, which persist from paragraph to paragraph unless
explicitly changed by the user) should generally be represented by VAR (:)
bindings at a high-level node in the script; changes would then be represented
by persistent (:=) bindings at the points where they occur.
- Persistent bindings can also be used to record changes to "running" information
(e.g., section number, figure number).
- Information (e.g., marks) repeated in all subnodes of a node should be bound to
Sub, and hence associated with the subnodes implicitly.
- Very compact encodings can often be obtained by using single-character id's
for common sequences of items; each editor can have its own set, tuned to its
own usages.  When reading a script produced by a different editor, it will be
necessary to know the bindings for these abbreviations.


			CONSCIOUSLY POSTPONED

Lambda expressions.

Sets (Cf. Mitchell's Font example.)
	SET/LIST operators ($append $union ?)

Extend selection to CASE?

-------------------
Open questions:
	We should rethink our character assignments.
		check our characterset for disjointness with
			Interpress.DoubtfulChars.
		enlarge op with a few more single-character operators?  %, &, \
	Which way to present semantics?
	Semantics of C<sequence>?

Not done:

-------------------

				ACKNOWLEDGEMENTS

Interpress, Interdoc working group, Lampson, Guttag, Donahue, others?


					APPENDICES
	ALTERNATE PRESENTATIONS OF THE SEMANTIC EQUATIONS

1. GRAMMATICAL FEATURE X SEMANTIC FUNCTION MATRIX

FEATURES:			       FUNCTIONS:	R C	B  M	L S	T
term ::= primary op term				+  =	  	  		
primary ::= literal					==	  	  	
invocation ::= id					+  	+  	  	
invocation ::= name "." id 				+  	+  	  	
universal ::= "$" name				==	  	  	
application ::= invocation "[" item* "]"		+  	  	  	
selection ::= "(" term "|" item1* "|" item2* ")"	+  	+  	  	
node ::= "{" item* "}"				+  =	+  	  	
sequence ::= "(" item* ")"				+  =	+  +	+  +	+
item* ::= item1 item*				+  +	+  +	+  +	+
binding ::= name mode rhs				  	+  	  	
rhs ::= "'" item* "'"					+  	  	  	
rhs ::= "[|" binding* "]"				+  	  	  	
rhs ::= "[" invocation "|" binding* "]"		+  	  	  	
mark ::= invocation "#"				+  	  +	  	
link ::= id "@!"					=	  	+  	
link ::= name "@"					=	  	  +	
link ::= name "!"					=	  	  	+

 Semantic function produces Nil or E or does not apply.
+ Non-trivial semantic equation.
=For R: passes value unchanged; for C: value same as R.


2. PRESENTATION BY FEATURE

term ::= primary op term
	R = C = R<primary>(E) op R<term>(E)
	B = E
	M = L = S = T = Nil

primary ::= literal
	R = C = literal
	B = E
	M = L = S = T = Nil

invocation ::= id
	R = R<valOf(id, E)>(E)
	B = B<valOf(id, E)>(E)

invocation ::= name "." id 
	R = R<valOf(id, R<name>(E))>(E)
	B = B<valOf(id, R<name>(E))>(E)

universal ::= "$" name
	R = C = "$" name
	B = E
	M = L = S = T = Nil

application ::= invocation "[" item* "]"
	R = apply(invocation, R<item*>(E), E)
	B = E

selection ::= "(" term "|" item1* "|" item2* ")"
	R = if R<term>(E) then R<item1*>(E) else R<item2*>(E)
	B = if R<term>(E) then B<item1*>(E) else B<item2*>(E)

node ::= "{" item* "}"
	R = C = "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}"
	B = locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))
	M = L = S = T = Nil

sequence ::= "(" item* ")"
	R = C = "(" R<item*>(E) ")"	?? C = "(" C<item*> ")"
	B = B<item*>(E)
	M = M<item*>
	L = L<item*>
	S = S<item*>
	T = T<item*>

item* ::= ""
	R = C = M = L = S = T = Nil
	B = E

item* ::= item1 item*
	R = R<item1>(E) R<item*>(B<item1>(E))
	B = B<item*>(B<item1>(E))
	C = C<item1> C<item*>
	M = M<item1> M<item*>
	L = L<item1> L<item*>
	S = S<item1> S<item*>
	T = T<item1> T<item*>

binding ::= name mode rhs
	R = Nil
	B = bind(name, mode, R<rhs>(E), E)

binding ::= name mode op term = <name mode name op term>	-- Syntactic sugar

rhs ::= "'" item* "'"
	R = item*

rhs ::= "[|" binding* "]"
	R = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]

rhs ::= "[" invocation "|" binding* "]"
	R =[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]

mark ::= invocation "#"
	R = R<invocation>(E) "#"
	B = E
	C = L = S = T = Nil
	M = invocation

link ::= id "@!"
	R = id "@!"
	B = E
	C = M = S = T = Nil
	L = id

link ::= name "@"
	R = name "@"
	B = E
	C = M = L = S = Nil
	T = name

link ::= name "!"
	R = name "!"
	B = E
	C = M = L = T = Nil
	S = prefixes(name)


3. PRESENTATION BY FUNCTION

R: expression, environment > expression			-- Reduction

R<construct>(E) = CASE construct OF
	<primary op term>	=> R<primary>(E) op R<term>(E)
	<literal>		=> literal
	<id>			=> R<valOf(id, E)>(E)
	<name "." id>	=> R<valOf(id, R<name>(E))>(E)
	<"$" name>		=> "$" name
	<invocation "[" item* "]"> => apply(invocation, R<item*>(E), E)
	<"(" term "|" item1* "|" item2* ")"> =>
			if R<term>(E) then R<item1*>(E) else R<item2*>(E)
	<"{" item* "}">	=> "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}"
	<"(" item* ")">	=> "(" R<item*>(E) ")"
	<>			=> Nil
	<item1 item*>	=> R<item1>(E) R<item*>(B<item1>(E))
	<name mode rhs>	=> Nil
	<"'" item* "'">	=> item*
	<"[|" binding* "]">	=>
			[B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]
	<"[" invocation "|" binding* "]"> =>
		[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]
	<invocation "#">	=> R<invocation>(E) "#"
	<link>		=> link

B: expression, environment > environment			-- Bindings

B<construct>(E) = CASE construct OF
	<id>			=> B<valOf(id, E)>(E)
	<name "." id>	=> B<valOf(id, R<name>(E))>(E)
	<"(" term "|" item1* "|" item2* ")">(E) =>
			if R<term>(E) then B<item1*>(E) else B<item2*>(E)
	<"{" item* "}">	=>
			locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))
	<"(" item* ")">	=> B<item*>(E)
	<item1 item*>	=> B<item*>(B<item1>(E))
	<name mode rhs>	=> bind(name, mode, R<rhs>(E), E)
	ELSE			=> E

M: expression > expression			-- Sequence of marks

M<construct> = CASE construct OF
	<name "#">		=> name
	<"(" item* ")">	=> M<item*>
	<item1 item*>	=> M<item1> M<item*>
	ELSE			=> Nil

L: expression > expression			-- Sequence of links

L<construct> = CASE construct OF
	<id "@!">		=> id
	<"(" item* ")">	=> L<item*>
	<item1 item*>	=> L<item1> L<item*>
	ELSE			=> Nil

S: expression > expression			-- Sequence of source links

S<construct> = CASE construct OF
	<name "!">		=> prefixes(name)
	<"(" item* ")">	=> S<item*>
	<item1 item*>	=> S<item1> S<item*>
	ELSE			=> Nil

T: expression > expression			-- Sequence of target links

T<construct> = CASE construct OF
	<name "@">		=> name
	<"(" item* ")">	=> T<item*>
	<item1 item*>	=> T<item1> T<item*>
	ELSE			=> Nil

C: expression > expression			-- Content of node

C<construct> = CASE construct OF
	<value>	=> value
	<label>	=> Nil

*start*
24976 00024 USt
Date: 8 Sept. 1981 4:17 pm PDT (Tuesday)
From: Horning.pa
Subject: Interdoc Level 0/1 DRAFT/1 and notes
To: Mitchell, Horning

[Based on "Interdoc Level 0/1 DRAFT and notes" of 8 Sept. 1981 2:10 pm PDT
(Tuesday).]
[Circulation to Interdoc↑ planned for Wednesday morning, 9 Sept. 1981.]

				STANDARD CARDS

1. WE ARE DESIGNING A STANDARD FOR INTERCHANGE, NOT EDITING.

2. EVERY SCRIPT IS PARSED BEFORE BEING INTERPRETED.

3. STANDARDIZE CONCEPTS, NOT NAMES.

4. COMBINING DOCUMENTS IS AN EDITOR, NOT AN INTERCHANGE, TASK.

5. GENSYM IS AN EDITOR FUNCTION.

6. CHARACTER-SET AND LEXICAL ISSUES CAN BE RESOLVED AT THE END.

7. GET THE TECHNICAL BASIS RIGHT BEFORE WORRYING ABOUT THE
POLITICAL ISSUES.


-------------------

We envision an Interdoc script being input and viewed in any manner
equivalent to the following:

Parse the entire script, repeatedly
- reducing each expression to its "dominant structure," containing only literals
and structural delimiters, by replacing identifiers by the values to which they
are bound in the current environment, by applying operators, by evaluating
selections, and by removing binding items,
- determining the properties of each node from its marks,
- recording the sources and targets for all links, and
- transforming the environment as indicated by the binding items (recording the
relevant attributes for each node in a form convenient to the particular editor).


				BASIC INTERDOC

LEXICAL STRUCTURE AND EXAMPLES
-- allow blank, CR, comma, semicolon as equivalent separators???
-- largely pirate grammar of literals from Interpress

id		::= letter ( letter | digit )*

Expressions in an Interdoc script may denote
	literal values:
		Booleans: (F, T)
		integers: ... -3, -2, -1, 0, 1, 2, 3, ...
		reals: 1.2E5, . . .
		strings: <this is a string>
		marks: Xerox.Laurel.Message#
		links: A123@!, Paragraph.Example!, anId@
		universal names: $Authority.Sub.id
	nodes
	sequences of values
	unevaluated expressions
	environments

SYNTACTIC EXAMPLE

{Book.example!		      -- Links to this from Book@ and Book.example@
ExampleParagraph				-- Invokes a definition
$UniqueMark12356#			-- Adds a mark
Font←[Font | size←10*pt; face←bold]
factorial←'(LT[Value, 2] | 1 | Value*factorial(Value-1))'
a:='NOT[EQ[margins.left, factorial[5]]]' margins.right←100; r=12.5*pt
(a | margins.left←+5; margins.right←5 | margins.left+←10) -- conditional: Algol68
<text for this node>
}

-- Or, using short identifiers, and omitting comments and unnecessary spaces:

{b.e!ep;um#f←[f|s←10*pt;fc←bo]fa←'(l[Value,2]|1|Value*fa(Value-1))'
a:='n[e[m.l,fa[5]]]'m.r←100;r=12.5*pt(a|m.l←+5;m.r←5|m.l+←10)<text for this node>}

GRAMMAR

script		::= versionId item
item		::= value | binding | label
value		::= term | node | sequence
term		::= primary | primary op term
op		::= "+" | "" | "*" | "/"
primary	::= literal | invocation | application | selection
literal		::= Boolean | integer | hexint | real | string
invocation	::= name | universal
name		::= id ( "." id )*
universal	::= "$" name
application	::= invocation "[" item* "]"
selection	::= "(" term "|" item* "|" item* ")"
node		::= "{" item* "}"
sequence	::= "(" ( value | binding )* ")"
binding	::= name mode rhs
mode		::= "=" | ":" | ":=" | "←"
rhs		::= value | op term | "'" item* "'" | "[" [ invocation ] "|" binding* "]"
label		::= mark | link
mark		::= invocation "#"
link		::= id "@!" | name "@" | name "!"

SEMANTIC FUNCTIONS

R: expression, environment > expression			-- Reduction
C: expression > expression					-- Contents
B: expression, environment > environment			-- Bindings
M: expression > expression					-- Marks
L: expression > expression					-- Links
S: expression > expression					-- Link sources
T: expression > expression					-- Link targets

GRAMMATICAL FEATURE X SEMANTIC FUNCTION MATRIX

FEATURES:			       FUNCTIONS:	R C	B  M	L S	T
term ::= primary op term				+  =	  	  		
primary ::= literal					==	  	  	
invocation ::= id					+  	+  	  	
invocation ::= name "." id 				+  	+  	  	
universal ::= "$" name				==	  	  	
application ::= invocation "[" item* "]"		+  	  	  	
selection ::= "(" term "|" item1* "|" item2* ")"	+  	+  	  	
node ::= "{" item* "}"				+  =	+  	  	
sequence ::= "(" ( value | binding )* ")"		+  =	+  	  	
item* ::= item1 item*				+  +	+  +	+  +	+
binding ::= name mode rhs				  	+  	  	
rhs ::= "'" item* "'"					+  	  	  	
rhs ::= "[|" binding* "]"				+  	  	  	
rhs ::= "[" invocation "|" binding* "]"		+  	  	  	
mark ::= invocation "#"				+  	  +	  	
link ::= id "@!"					=	  	+  	
link ::= name "@"					=	  	  +	
link ::= name "!"					=	  	  	+

 Semantic function produces Nil or E or does not apply.
+ Non-trivial semantic equation.
=For R: passes value unchanged; for C: value same as R.

NOTATION FOR ENVIRONMENTS

Environments bind identifiers to expressions, in various modes ("=", ":", ":=", "←"):
	Null denotes the "empty" environment
	[E | id m e] means "E with id mode m bound to e"
	locBinding(id, E) denotes the binding mode of id in E
		locBinding(id, Null) = None
		locBinding(id, [E | id' m e]) =
			if id=id' then m else locBinding(id, E)
	locVal(id, E) denotes the value locally bound to id in E
		locVal(id, Null) = Nil = ""
		locVal(id, [E | id' m e]) = if id=id' then e else locVal(id, E)

PRESENTATION BY FEATURE

[E is used to represent the value of the environment in which the feature
occurs.]

term ::= primary op term
op ::= "+" | "" | "*" | "/"
	R = C = R<primary>(E) op R<term>(E)
	B = E
	M = L = S = T = Nil
-- Both the primary and the term must reduce to numbers; the arithmetic
operators are evaluated right-to-left (a la APL, without precedence) and bind less
tightly than application.

primary ::= literal
literal ::= Boolean | integer | hexint | real | string
	R = C = literal
	B = E
	M = L = S = T = Nil
-- The basic contents of a document.

invocation ::= id
	R = R<valOf(id, E)>(E)
	B = B<valOf(id, E)>(E)
    where
	valOf(id, E) = locVal(id, whereBound(id, E))	-- Gets innermost value
	whereBound(id, E) = CASE			-- Gets innermost binding
		locBinding(id, E) ~= None	=> E
		locBinding("Outer", E) ~= None	=>
						whereBound(id, locVal("Outer", E))
		True				=> Null
-- Both attributes and definitions are looked up in the current environment;
depending on the current binding of id, this may produce values and/or
bindings; if the binding's rhs was quoted, the expression is evaluated at the
point of invocation.

invocation ::= name "." id 
	R = R<valOf(id, R<name>(E))>(E)
	B = B<valOf(id, R<name>(E))>(E)
-- Qualified names are treated as "nested" environments.

universal ::= "$" name
	R = C = "$" name
	B = E
	M = L = S = T = Nil
-- Names prefixed with a $ are presumed to be directly meaningful, and are not
looked up in the environment.

application ::= invocation "[" item* "]"
	R = apply(invocation, R<item*>(E), E)
	B = E
    where
	apply(invocation, value*, E) =
	    CASE R<invocation>(E) OF
		"$equal"	=> value1 = value2
		"$greater"	=> value1 > value2
		. . .
		"$subscript"	=> value1[value2]	-- value1: sequence, value2: int
		"$contents"	=> "(" C<inner(value1)> ")"
		"$marks"	=> "(" M<inner(value1)> ")"
		"$links"	=> "(" L<inner(value1)> ")"
		"$sources"	=> "(" S<inner(value1)> ")"
		"$targets"	=> "(" T<inner(value1)> ")"
	    ELSE   => R<invocation>([[Null | "Outer" "=" E] | "Value" "=" value*])
	inner("{" value* "}") = value*
-- If the invocation does not evaluate to one of the standard external function
names, the current environment is augmented with a binding of the value of the
argument list to the identifier Value, and the value is the result of the
invocation in that environment; this allows function definition within the
language.

selection ::= "(" term "|" item1* "|" item2* ")"
	R = if R<term>(E) then R<item1*>(E) else R<item2*>(E)
	B = if R<term>(E) then B<item1*>(E) else B<item2*>(E)
-- This is a standard conditional expression/binding, using syntax borrowed from
Algol 68.

node ::= "{" item* "}"
	R = C = "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}"
	B = locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))
	M = L = S = T = Nil
-- Nodes have nested environments, and affect the containing environment only
through persistent (:=) bindings to ids with outer VAR (:) bindings.  The items
of a node are implicitly prefixed with the id Sub, which may be bound to any
information intended to be common to all subnodes in a scope.

sequence ::= "(" item* ")"
	R = C = "(" R<item*>(E) ")"
	B = B<item*>(E)
	M = L = S = T = Nil
-- Parentheses group a sequence of items as a single value; bindings in the
sequence affect the environment of items to the right in the containing node,
but labels are disallowed.

item* ::= ""
	R = C = M = L = S = T = Nil
	B = E
-- The empty sequence of items has no value and no effect; this is the basis for
the following recursive definition.

item* ::= item1 item*
	R = R<item1>(E) R<item*>(B<item1>(E))
	B = B<item*>(B<item1>(E))
    For F in {C, M, L, S, T}:
	F = F<item1> F<item*>
-- In general, the value of a sequence of items is just the sequence of item
values; binding items affect the environment of items to their right; Nil does not
change the length of a result sequence.

binding ::= name mode rhs
	R = Nil
	B = bind(name, mode, R<rhs>(E), E)
    where
	bind(id, mode, value, E) = CASE
		bindingOf(id, E) = "="	=> E		-- Can't rebind constants
		mode = ":=" 			=> assign(id, value, E)
		True				=> [E | id mode value]
	bind(id "." name, mode, value, E) =
		[E | id bindingOf(id, E) bind(name, mode, value, valOf(id, E))]
	bindingOf(id, E) = locBinding(id, whereBound(id, E))
	assign(id, value, E) = CASE
		locBinding(id, E) = ":"	=> [E | id ":" value]
		bindingOf(id, E) = ":"	=> bind("Outer." id, ":=", value, E)
		True				=> E 		-- Can only assign to vars
-- This adds a single binding to E; bindings have no other "side effects" and no
value.

binding ::= name mode op term
	= <name mode name op term>
-- This is just a convenient piece of syntactic sugar for the common case of
updating a binding.

rhs ::= "'" item* "'"
	R = item*
-- A quoted rhs is evaluated at the point of invocation, rather than the point of
binding.

rhs ::= "[|" binding* "]"
	R = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]
-- This creates a new environment value that may be used much like a record.

rhs ::= "[" invocation "|" binding* "]"
	R =[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]
-- This creates a new environment value that is an extension of an existing one.

mark ::= invocation "#"
	R = R<invocation>(E) "#"
	M = invocation
	B = E
	C = L = S = T = Nil
-- This gives the containing node the property denoted by the mark to which
the invocation reduces.

link ::= id "@!"
	R = id "@!"
	L = id
	B = E
	C = M = S = T = Nil
-- This defines the scope of the set of links whose "main" component is id.

link ::= name "@"
	R = name "@"
	S = name
	B = E
	C = M = L = T = Nil
-- This identifies the containing node as a "source" of the link name.

link ::= name "!"
	R = name "!"
	T = prefixes(name)
	B = E
	C = M = L = S = Nil
    where
	prefixes(id) = id
	prefixes(name "." id) = name "." id prefixes(name)
-- This identifies the containing node as a "target" of each of the links that is a
prefix of name.



DISCUSSION

Each environment, E, initially contains only its "inherited" environment (bound
to the id Outer).  Most bindings take place directly in E.  To allow for
"persistent" bindings, the value of a bind(id, ":=", val, E) will change E by
rebinding id in the "innermost" environment (following the chain of Outers) in
which it is bound, if that binding has the binding ":" (Var).  Identifiers bound
with binding "=" (Const) may not be rebound in inner environments.

If the rhs of a binding is surrounded by single quotes, it will be evaluated in
the environments where the name is invoked, rather than the environment in
which the binding is made.

When an id is referred to and locBinding(id, E)=None, then the value is sought
recursively in locVal("Outer").  The (implicit) "outermost" environment binds
each id to the "universal" name $id.

Nodes are delimited by brackets.  The contents of each node are implicitly
prefixed by Sub, which will generally be bound in the containing environment
to a quoted expression performing some bindings, and perhaps supplying some
labels (marks and links).

Parentheses are used to delimit sequence values.  Square brackets are used to
delimit the argument list of an operator application and to denote environment
constructors, which behave much like records.

Expressions involving the four infix ops (+, , *, /) are evaluated right-to-left
(a la APL); since we expect expressions to be short, we have not imposed
precedence rules.

The notation for selections (conditionals) is borrowed from Algol 68:
	( <test> | <true part> | <false part> )
This is consistent with our principles of using balanced brackets for compound
constructions and avoiding syntactically reserved words; the true part and false
part may each contain an arbitrary number of items (including none). 


A label N! on a node makes that node a "target" of the link N (and its prefixes);
a label N@ makes it a "source."  The "main" identifier of a link must be declared
(using id@!) at the root of a subtree containing all its sources and targets.  The
link represents a set of directed arcs, one from each of its sources to each of its
targets.  Multiple target labels make a node the target of multiple links.  A target
label that appears only on a single node places it in a singleton set, i.e.,
identifies it uniquely.


				OTHER NOTES

Conservative rules for editor treatment of Interdoc subtrees created by other
editors:
-It's OK to display a node if you understand at least one of its marks.
-It's OK to edit a node if you understand ALL of its (local) marks, and either
don't remove any of them or also understand ALL marks of its parent.
-It's OK to copy a node if that doesn't move any labels outside their scope, and
you understand ALL marks of its new parent.
-it's OK to delete a (subtree rooted at a) node if you understand ALL marks of its
parent.
 More work needed on definition vs. attribute, direct vs. indirect contents. 

The "view" of the dominant structure is ALWAYS controlled by the marks on its
nodes.  (E.g., numbers are often part of the content for purposes other than to be
"shown.")  $hidden# means that a node is intended not to be viewed at the point
where it occurs in the dominant structure; such nodes will generally be targets
of links, placed remotely from their sources (uses) to extract attributes from an
environment (e.g., page or figure numbers for references).

We will use the term "property" to refer to a characteristic associated with
a node by means of a label (mark or link) and "attribute" to refer to a
characteristic whose value is determined by the environment.  Properties are
either present (locally) or absent, whereas attributes commonly have values that
apply throughout a scope.

Level 2 of the standard will be primarily concerned with the definition of
standard properties that are expected to be shared among editors.  For each
standard property, it will describe
	- the associated universal mark that denotes it,
	- the assumptions that this mark implies about the contents of a node
		(values that must/may be present and their intended intepretation),
	- the assumptions that this mark implies about the environment of a node
		(attributes that must be present and their intended intepretation).

Attributes are "relevant" to a node if they are assumed by any of its marks.
In general, a node's environment will also contain bindings for many "latent"
attributes that are either relevant to its ancestors (and inherited by default) or
are potentially relevant to its descendants.  A "definition" is an attribute that is
relevant precisely to those nodes that invoke it.

 We need some further thought (and syntax and semantics) for how to associate
assumptions with marks. 

Attributes are inherited only via environments following the dominant
structure.  Thus the choice of a dominant structure to represent scripts from a
particular editor will be strongly influenced by expectations about inheritance. 

Put a value in contents if:			Put a value in environment if:
	effect is local to node			has scope
	is directly edited				is only indirectly edited
	is to be bound locally			needs delayed or global binding

The presentation of this material could be clarified by a table that relates
constructions in the notation to their intended uses and meanings.


				EDITOR LEVELS

Just as with Interpress, the Interdoc standard will apply to editors with varying
capabilities, and it will be important to define some structure to the space of
possibilities.  Dimensions in which we foresee reasonable variation are:

- Invocation/Application: Only built-in definitions - definitions in script.

- Dominant structure: Nodes cannot nest - nesting allowed.

- Other structure: No links - links allowed.

- Bindings: Local only - CONST(=), VAR(:), and persistent (:=).

- Selection: No conditionals - conditionals.

- Numbers: Integers only - floating point.


			HINTS TO IMPLEMENTORS

Scopes: The scope of a binding always begins (textually) at the point where the
binding appears, and extends strictly to its right.  This should simplify the
processing of environments in a single left-to-right scan.  Constant bindings are
unchangeable within a scope.

Environment restoration: Some constructs in the language are not allowed to have
a "net effect" on the environment, even though they may contain internal
bindings that will be in effect for some or all of their substructures.  Thus, a
program to process Interdoc scripts must make provision for restoration of saved
environments.  The amount of bookkeeping required to do this can be minimized
by the following observations:
- A rhs always produces a value (which may be an environment), which is
then bound to a name in the environment that obtained when the binding was
encountered.  Thus, occurrence of a bindingMode (=, :, :=, ←) signals a potential
restoration point.
- Any application (but not every invocation), any term containing infix ops,
and any mark will leave the environment unchanged.  Thus the appearance of
an operator, the bracket surrounding an argument list, or # signals a potential
restoration point; the environment to be saved is the one BEFORE the invocation
to its left.  (One-character lookahead before invocation is prudent.)
- Environments are NOT restored at the end of sequences.
- The environment at the end of a node is the final value of the Outer
component of the node's environment; unless the node contains persistent (:=)
bindings, this will be the same as the environment just prior to the node.

Coding tricks for common structures: 
- Attributes that tend to persist across boundaries in the dominant structure (e.g.,
Bravo character looks, which persist from paragraph to paragraph unless
explicitly changed by the user) should generally be represented by VAR (:)
bindings at a high-level node in the script; changes would then be represented
by persistent (:=) bindings at the points where they occur.
- Persistent bindings can also be used to record changes to "running" information
(e.g., section number, figure number).
- Information (e.g., marks) repeated in all subnodes of a node should be bound to
Sub, and hence associated with the subnodes implicitly.
- Very compact encodings can often be obtained by using single-character id's
for common sequences of items; each editor can have its own set, tuned to its
own usages.  When reading a script produced by a different editor, it will be
necessary to know the bindings for these abbreviations.
- Common editing operations may be represented by definitions with short names
(e.g., Look Keep 40 in Bravo might become "L.K[40]").


			CONSCIOUSLY POSTPONED

Lambda expressions.

Sets (Cf. Mitchell's Font example.)
	SET/LIST operators ($append $union ?)

Extend selection to CASE?

-------------------
Open questions:
	We should rethink our character assignments.
		check our characterset for disjointness with
			Interpress.DoubtfulChars.
		enlarge op with a few more single-character operators?  %, &, \

Not done:

-------------------

				ACKNOWLEDGEMENTS

Interpress, Interdoc working group, Lampson, Guttag, Donahue, others?


					APPENDICES
	ALTERNATE PRESENTATIONS OF THE SEMANTIC EQUATIONS

PRESENTATION BY FUNCTION

R: expression, environment > expression			-- Reduction

R<construct>(E) = CASE <construct> OF
	<primary op term>	=> R<primary>(E) op R<term>(E)
	<literal>		=> literal
	<id>			=> R<valOf(id, E)>(E)
	<name "." id>	=> R<valOf(id, R<name>(E))>(E)
	<"$" name>		=> "$" name
	<invocation "[" item* "]"> => apply(invocation, R<item*>(E), E)
	<"(" term "|" item1* "|" item2* ")"> =>
			if R<term>(E) then R<item1*>(E) else R<item2*>(E)
	<"{" item* "}">	=> "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}"
	<"(" item* ")">	=> "(" R<item*>(E) ")"
	<>			=> Nil
	<item1 item*>	=> R<item1>(E) R<item*>(B<item1>(E))
	<name mode rhs>	=> Nil
	<"'" item* "'">	=> item*
	<"[|" binding* "]">	=>
			[B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]
	<"[" invocation "|" binding* "]"> =>
		[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]
	<invocation "#">	=> R<invocation>(E) "#"
	<link>		=> link

C: expression > expression			-- Content of node

C<construct> = CASE <construct> OF
	<value>		=> value
	<label>		=> Nil
	<item1 item*>	=> C<item1> C<item*>

B: expression, environment > environment			-- Bindings

B<construct>(E) = CASE <construct> OF
	<id>			=> B<valOf(id, E)>(E)
	<name "." id>	=> B<valOf(id, R<name>(E))>(E)
	<"(" term "|" item1* "|" item2* ")">(E) =>
			if R<term>(E) then B<item1*>(E) else B<item2*>(E)
	<"{" item* "}">	=>
			locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))
	<"(" item* ")">	=> B<item*>(E)
	<item1 item*>	=> B<item*>(B<item1>(E))
	<name mode rhs>	=> bind(name, mode, R<rhs>(E), E)
	ELSE			=> E

M: expression > expression			-- Sequence of marks

M<construct> = CASE <construct> OF
	<name "#">		=> name
	<item1 item*>	=> M<item1> M<item*>
	ELSE			=> Nil

L: expression > expression			-- Sequence of links

L<construct> = CASE <construct> OF
	<id "@!">		=> id
	<item1 item*>	=> L<item1> L<item*>
	ELSE			=> Nil

S: expression > expression			-- Sequence of source links

S<construct> = CASE <construct> OF
	<name "!">		=> prefixes(name)
	<item1 item*>	=> S<item1> S<item*>
	ELSE			=> Nil

T: expression > expression			-- Sequence of target links

T<construct> = CASE <construct> OF
	<name "@">		=> name
	<item1 item*>	=> T<item1> T<item*>
	ELSE			=> Nil


SEMANTICS: REDUCTION AND BINDING

R: expression, environment > expression			-- Reduction
B: expression, environment > environment			-- Bindings
R&B<construct>(E) denotes the pair  R<construct>(E); B<construct>(E)

[Unless explicitly given below, B<construct>(E) = E.]

R<primary op term>(E) = R<primary>(E) op R<term>(E)

R<literal>(E) = literal

R&B<id>(E) = R&B<valOf(id, E)>(E)

R&B<name "." id>(E) = R&B<valOf(id, R<name>(E))>(E)

R<"$" name>(E) = "$" name

R<invocation "[" item* "]">(E) = apply(invocation, R<item*>(E), E)

R&B<"(" term "|" item1* "|" item2* ")">(E) =
			if R<term>(E) then R&B<item1*>(E) else R&B<item2*>(E)

R&B<"{" item* "}">(E) = "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}";
	      locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))

R&B<"(" item* ")">(E) = "(" R<item*>(E) ")" ; B<item*>(E)

R<>(E) = Nil

R&B<item1 item*>(E) = R<item1>(E) R<item*>(B<item1>(E));
					  B<item*>(B<item1>(E))

R&B<name mode rhs>(E) = Nil; bind(name, mode, R<rhs>(E), E)
	<name mode op term> = <name mode name op term>	-- Syntactic sugar

R<"'" item* "'">(E) = item*			--Usable only in rhs of binding

R<"[|" binding* "]">(E) = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]

R<"[" invocation "|" binding* "]">(E) =
	[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]

R<invocation "#">(E) = R<invocation>(E) "#"

R<link>(E) = link


SEMANTICS OF CONTENTS AND LABELS

C: expression > expression			-- Contents of node
M: expression > expression			-- Sequence of marks
L: expression > expression			-- Sequence of links
S: expression > expression			-- Sequence of source links
T: expression > expression			-- Sequence of target links

[These functions all return the empty string, Nil (which does not lengthen a
sequence), except as specified below.]

C<value> = value
C<label> = Nil
C<item1 item*> = C<item1> C<item*>

M<name "#"> = name
M<item1 item*> = M<item1> M<item*>

L<id "@!"> = id
L<item1 item*> = L<item1> L<item*>

S<name "!"> = prefixes(name)
S<item1 item*> = S<item1> S<item*>

T<name "@"> = name
T<item1 item*> = T<item1> T<item*>

prefixes(id) = id
prefixes(name "." id) = name "." id prefixes(name)


*start*
25613 00024 USt
Date: 9 Sept. 1981 3:04 pm PDT (Wednesday)
From: Horning.pa
Subject: Interdoc Level 0/1 DRAFT/2 and notes
To: Interdoc

[Sorry for the long delay between updates.  Jim M. and I have been receiving all
the feedback we could handle from Butler and John Guttag.  However, we think
things have now stabilized quite a bit.  This message focusses on things that
have changed since "last time."  Jim M. is working on expanding an outline for a
more complete manual along the lines of the Interpress report; most of these
pieces will fit in there somewhere.--Jim H.]

				STANDARD CARDS

1. WE ARE DESIGNING A STANDARD FOR INTERCHANGE, NOT EDITING.

2. EVERY SCRIPT IS PARSED BEFORE BEING INTERPRETED.

3. STANDARDIZE CONCEPTS, NOT NAMES.

4. COMBINING DOCUMENTS IS AN EDITOR, NOT AN INTERCHANGE, TASK.

5. GENSYM IS AN EDITOR FUNCTION.

6. CHARACTER-SET AND LEXICAL ISSUES CAN BE RESOLVED AT THE END.

7. GET THE TECHNICAL BASIS RIGHT BEFORE WORRYING ABOUT THE
POLITICAL ISSUES.


-------------------

We envision an Interdoc script being input and viewed in any manner
equivalent to the following:

Parse the entire script, repeatedly
- reducing each expression to its "dominant structure," containing only literals
and structural delimiters, by replacing identifiers by the values to which they
are bound in the current environment, by applying operators, by evaluating
selections, and by removing binding items,
- determining the properties of each node from its marks,
- recording the sources and targets for all links, and
- transforming the environment as indicated by the binding items (recording the
relevant attributes for each node in a form convenient to the particular editor).


				BASIC INTERDOC

LEXICAL STRUCTURE AND EXAMPLES
 Allow blank, CR, comma, semicolon as equivalent separators???
 Largely pirate grammar of literals from Interpress

id		::= letter ( letter | digit )*

Expressions in an Interdoc script may denote
	literal values:
		Booleans: (F, T)
		integers: ... 3, 2, 1, 0, 1, 2, 3, ...
		reals: 1.2E5, . . .
		strings: <this is a string>
		marks: Xerox.Laurel.Message#
		links: A123@!, Paragraph.Example!, anId@
		universal names: $Authority.Sub.id
	nodes
	sequences of values
	unevaluated expressions
	environments

SYNTACTIC EXAMPLE

{Book.example!		      -- Links to this from Book@ and Book.example@
ExampleParagraph				-- Invokes a definition
$UniqueMark12356#			-- Adds a mark
Font←[Font | size←10*pt; face←bold]
factorial←'(LT[Value, 2] | 1 | Value*factorial(Value-1))'
a:='NOT[EQ[margins.left, factorial[5]]]' margins.right←100; r=12.5*pt
(a | margins.left←+5; margins.right←5 | margins.left+←10) -- conditional: Algol68
<text for this node>
}

-- Or, using short identifiers, and omitting comments and unnecessary spaces:

{b.e!ep;um#f←[f|s←10*pt;fc←bo]fa←'(l[Value,2]|1|Value*fa(Value-1))'
a:='n[e[m.l,fa[5]]]'m.r←100;r=12.5*pt(a|m.l←+5;m.r←5|m.l+←10)<text for this node>}

GRAMMAR

script		::= versionId item
item		::= value | binding | label
value		::= term | node
term		::= primary | primary op term
op		::= "+" | "" | "*" | "/"
primary	::= literal | invocation | application | selection | sequence
literal		::= Boolean | integer | hexint | real | string
invocation	::= name | universal
name		::= id ( "." id )*
universal	::= "$" name
application	::= invocation "[" item* "]"
selection	::= "(" term "|" item* "|" item* ")"
sequence	::= "(" ( value | binding )* ")"
node		::= "{" item* "}"
binding	::= name mode rhs
mode		::= "=" | ":" | ":=" | "←"
rhs		::= value | op term | "'" item* "'" | "[" [ invocation ] "|" binding* "]"
label		::= mark | link
mark		::= invocation "#"
link		::= id "@!" | name "@" | name "!"

SEMANTIC FUNCTIONS

R: expression, environment > expression			-- Reduction
C: expression > expression					-- Contents
B: expression, environment > environment			-- Bindings
M: expression > expression					-- Marks
L: expression > expression					-- Links
S: expression > expression					-- Link sources
T: expression > expression					-- Link targets

GRAMMATICAL FEATURE X SEMANTIC FUNCTION MATRIX

FEATURES:			       FUNCTIONS:	R C	B  M	L S	T
term ::= primary op term				+  =	  	  		
primary ::= literal					==	  	  	
invocation ::= id					+  	+  	  	
invocation ::= name "." id 				+  	+  	  	
universal ::= "$" name				==	  	  	
application ::= invocation "[" item* "]"		+  	  	  	
selection ::= "(" term "|" item1* "|" item2* ")"	+  	+  	  	
node ::= "{" item* "}"				+  =	+  	  	
sequence ::= "(" ( value | binding )* ")"		+  =	+  	  	
item* ::= item1 item*				+  +	+  +	+  +	+
binding ::= name mode rhs				  	+  	  	
rhs ::= "'" item* "'"					+  	  	  	
rhs ::= "[|" binding* "]"				+  	  	  	
rhs ::= "[" invocation "|" binding* "]"		+  	  	  	
mark ::= invocation "#"				+  	  +	  	
link ::= id "@!"					=	  	+  	
link ::= name "@"					=	  	  +	
link ::= name "!"					=	  	  	+

 Semantic function produces Nil or E or does not apply.
+ Non-trivial semantic equation.
=For R: passes value unchanged; for C: value same as R.

NOTATION FOR ENVIRONMENTS

Environments bind identifiers to expressions, in various modes ("=", ":", ":=", "←"):
	Null denotes the "empty" environment
	[E | id m e] means "E with id mode m bound to e"
	locBinding(id, E) denotes the binding mode of id in E
		locBinding(id, Null) = None
		locBinding(id, [E | id' m e]) =
			if id=id' then m else locBinding(id, E)
	locVal(id, E) denotes the value locally bound to id in E
		locVal(id, Null) = Nil = ""
		locVal(id, [E | id' m e]) = if id=id' then e else locVal(id, E)

PRESENTATION BY FEATURE

[E is used to represent the value of the environment in which the feature
occurs.]

term ::= primary op term
op ::= "+" | "" | "*" | "/"
	R = C = R<primary>(E) op R<term>(E)
	B = E
	M = L = S = T = Nil
-- Both the primary and the term must reduce to numbers; the arithmetic
operators are evaluated right-to-left (a la APL, without precedence) and bind less
tightly than application.

primary ::= literal
literal ::= Boolean | integer | hexint | real | string
	R = C = literal
	B = E
	M = L = S = T = Nil
-- The basic contents of a document.

invocation ::= id
	R = R<valOf(id, E)>(E)
	B = B<valOf(id, E)>(E)
    where
	valOf(id, E) = locVal(id, whereBound(id, E))	-- Gets innermost value
	whereBound(id, E) = CASE			-- Gets innermost binding
		locBinding(id, E) ~= None	=> E
		locBinding("Outer", E) ~= None	=>
						whereBound(id, locVal("Outer", E))
		True				=> Null
-- Both attributes and definitions are looked up in the current environment;
depending on the current binding of id, this may produce values and/or
bindings; if the binding's rhs was quoted, the expression is evaluated at the
point of invocation.

invocation ::= name "." id 
	R = R<valOf(id, R<name>(E))>(E)
	B = B<valOf(id, R<name>(E))>(E)
-- Qualified names are treated as "nested" environments.

universal ::= "$" name
	R = C = "$" name
	B = E
	M = L = S = T = Nil
-- Names prefixed with a $ are presumed to be directly meaningful, and are not
looked up in the environment.

application ::= invocation "[" item* "]"
	R = apply(invocation, R<item*>(E), E)
	B = E
    where
	apply(invocation, value*, E) =
	    CASE R<invocation>(E) OF
		"$equal"	=> value1 = value2
		"$greater"	=> value1 > value2
		. . .
		"$subscript"	=> value1[value2]	-- value1: sequence, value2: int
		"$contents"	=> "(" C<inner(value1)> ")"
		"$marks"	=> "(" M<inner(value1)> ")"
		"$links"	=> "(" L<inner(value1)> ")"
		"$sources"	=> "(" S<inner(value1)> ")"
		"$targets"	=> "(" T<inner(value1)> ")"
	    ELSE   => R<invocation>([[Null | "Outer" "=" E] | "Value" "=" value*])
	inner("{" value* "}") = value*
-- If the invocation does not evaluate to one of the standard external function
names, the current environment is augmented with a binding of the value of the
argument list to the identifier Value, and the value is the result of the
invocation in that environment; this allows function definition within the
language.

selection ::= "(" term "|" item1* "|" item2* ")"
	R = if R<term>(E) then R<item1*>(E) else R<item2*>(E)
	B = if R<term>(E) then B<item1*>(E) else B<item2*>(E)
-- This is a standard conditional expression/binding, using syntax borrowed from
Algol 68.

sequence ::= "(" item* ")"
	R = C = "(" R<item*>(E) ")"
	B = B<item*>(E)
	M = L = S = T = Nil
-- Parentheses group a sequence of items as a single value; bindings in the
sequence affect the environment of items to the right in the containing node,
but labels are disallowed.  Parentheses may also be used to override the
right-to-left evaluation of arithmetic operators; an operand sequence must reduce
to a single numeric value. 

node ::= "{" item* "}"
	R = C = "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}"
	B = locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))
	M = L = S = T = Nil
-- Nodes have nested environments, and affect the containing environment only
through persistent (:=) bindings to ids with outer VAR (:) bindings.  The items
of a node are implicitly prefixed with the id Sub, which may be bound to any
information intended to be common to all subnodes in a scope.

item* ::= ""
	R = C = M = L = S = T = Nil
	B = E
-- The empty sequence of items has no value and no effect; this is the basis for
the following recursive definition.

item* ::= item1 item*
	R = R<item1>(E) R<item*>(B<item1>(E))
	B = B<item*>(B<item1>(E))
    For F in {C, M, L, S, T}:
	F = F<item1> F<item*>
-- In general, the value of a sequence of items is just the sequence of item
values; binding items affect the environment of items to their right; Nil does not
change the length of a result sequence.

binding ::= name mode rhs
	R = Nil
	B = bind(name, mode, R<rhs>(E), E)
    where
	bind(id, mode, value, E) = CASE
		bindingOf(id, E) = "="	=> E		-- Can't rebind constants
		mode = ":=" 			=> assign(id, value, E)
		True				=> [E | id mode value]
	bind(id "." name, mode, value, E) =
		[E | id bindingOf(id, E) bind(name, mode, value, valOf(id, E))]
	bindingOf(id, E) = locBinding(id, whereBound(id, E))
	assign(id, value, E) = CASE
		locBinding(id, E) = ":"	=> [E | id ":" value]
		bindingOf(id, E) = ":"	=> bind("Outer." id, ":=", value, E)
		True				=> E 		-- Can only assign to vars
-- This adds a single binding to E; bindings have no other "side effects" and no
value.

binding ::= name mode op term
	= <name mode name op term>
-- This is just a convenient piece of syntactic sugar for the common case of
updating a binding.

rhs ::= "'" item* "'"
	R = item*
-- A quoted rhs is evaluated at the point of invocation, rather than the point of
binding.

rhs ::= "[|" binding* "]"
	R = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]
-- This creates a new environment value that may be used much like a record.

rhs ::= "[" invocation "|" binding* "]"
	R =[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]
-- This creates a new environment value that is an extension of an existing one.

mark ::= invocation "#"
	R = R<invocation>(E) "#"
	M = invocation
	B = E
	C = L = S = T = Nil
-- This gives the containing node the property denoted by the mark to which
the invocation reduces.

link ::= id "@!"
	R = id "@!"
	L = id
	B = E
	C = M = S = T = Nil
-- This defines the scope of the set of links whose "main" component is id.

link ::= name "@"
	R = name "@"
	S = name
	B = E
	C = M = L = T = Nil
-- This identifies the containing node as a "source" of the link name.

link ::= name "!"
	R = name "!"
	T = prefixes(name)
	B = E
	C = M = L = S = Nil
    where
	prefixes(id) = id
	prefixes(name "." id) = name "." id prefixes(name)
-- This identifies the containing node as a "target" of each of the links that is a
prefix of name.



DISCUSSION

Each environment, E, initially contains only its "inherited" environment (bound
to the id Outer).  Most bindings take place directly in E.  To allow for
"persistent" bindings, the value of a bind(id, ":=", val, E) will change E by
rebinding id in the "innermost" environment (following the chain of Outers) in
which it is bound, if that binding has the binding ":" (Var).  Identifiers bound
with binding "=" (Const) may not be rebound in inner environments.

If the rhs of a binding is surrounded by single quotes, it will be evaluated in
the environments where the name is invoked, rather than the environment in
which the binding is made.

When an id is referred to and locBinding(id, E)=None, then the value is sought
recursively in locVal("Outer").  The (implicit) "outermost" environment binds
each id to the "universal" name $id.

Nodes are delimited by brackets.  The contents of each node are implicitly
prefixed by Sub, which will generally be bound in the containing environment
to a quoted expression performing some bindings, and perhaps supplying some
labels (marks and links).

Parentheses are used to delimit sequence values.  Square brackets are used to
delimit the argument list of an operator application and to denote environment
constructors, which behave much like records.

Expressions involving the four infix ops (+, , *, /) are evaluated right-to-left
(a la APL); since we expect expressions to be short, we have not imposed
precedence rules.

The notation for selections (conditionals) is borrowed from Algol 68:
	( <test> | <true part> | <false part> )
This is consistent with our principles of using balanced brackets for compound
constructions and avoiding syntactically reserved words; the true part and false
part may each contain an arbitrary number of items (including none). 


A label N! on a node makes that node a "target" of the link N (and its prefixes);
a label N@ makes it a "source."  The "main" identifier of a link must be declared
(using id@!) at the root of a subtree containing all its sources and targets.  The
link represents a set of directed arcs, one from each of its sources to each of its
targets.  Multiple target labels make a node the target of multiple links.  A target
label that appears only on a single node places it in a singleton set, i.e.,
identifies it uniquely.


				OTHER NOTES

Conservative rules for editor treatment of Interdoc subtrees created by other
editors:
-It's OK to display a node if you understand at least one of its marks.
-It's OK to edit a node if you understand ALL of its (local) marks, and either
don't remove any of them or also understand ALL marks of its parent.
-It's OK to copy a node if that doesn't move any labels outside their scope,
you understand ALL marks of its new parent, and all relevant attributes have
the same binding in the new environment.
-it's OK to delete a (subtree rooted at a) node if you understand ALL marks of its
parent.
-It's OK to change a binding if you understand ALL marks of each node in its
scope, or if it is for a standard attribute that you understand.
 More work needed on definition vs. attribute, direct vs. indirect contents. 

The "view" of the dominant structure is ALWAYS controlled by the marks on its
nodes.  (E.g., numbers are often part of the content for purposes other than to be
"shown.")  $hidden# means that a node is intended not to be viewed at the point
where it occurs in the dominant structure; such nodes will generally be targets
of links, placed remotely from their sources (uses) to extract attributes from an
environment (e.g., page or figure numbers for references).

We will use the term "property" to refer to a characteristic associated with
a node by means of a label (mark or link) and "attribute" to refer to a
characteristic whose value is determined by the environment.  Properties are
either present (locally) or absent, whereas attributes commonly have values that
apply throughout a scope.

Level 2 of the standard will be primarily concerned with the definition of
standard properties that are expected to be shared among editors.  For each
standard property, it will describe
	- the associated universal mark that denotes it,
	- the assumptions that this mark implies about the contents of a node
		(values that must/may be present and their intended intepretation),
	- the assumptions that this mark implies about the environment of a node
		(attributes that must be present and their intended intepretation).

Attributes are "relevant" to a node if they are assumed by any of its marks.
In general, a node's environment will also contain bindings for many "latent"
attributes that are either relevant to its ancestors (and inherited by default) or
are potentially relevant to its descendants.  A "definition" is an attribute that is
relevant precisely to those nodes that invoke it.

 We need some further thought (and syntax and semantics) for how to associate
assumptions with marks. 

Attributes are inherited only via environments following the dominant
structure.  Thus the choice of a dominant structure to represent scripts from a
particular editor will be strongly influenced by expectations about inheritance. 

Put a value in contents if:			Put a value in environment if:
	effect is local to node			has scope
	is directly edited				is only indirectly edited
	is to be bound locally			needs delayed or global binding

The presentation of this material could be clarified by a table that relates
constructions in the notation to their intended uses and meanings.


				EDITOR LEVELS

Just as with Interpress, the Interdoc standard will apply to editors with varying
capabilities, and it will be important to define some structure to the space of
possibilities.  Dimensions in which we foresee reasonable variation are:

- Invocation/Application: Only built-in definitions - definitions in script.

- Dominant structure: Nodes cannot nest - nesting allowed.

- Other structure: No links - links allowed.

- Bindings: Local only - CONST(=), VAR(:), and persistent (:=).

- Selection: No conditionals - conditionals.

- Numbers: Integers only - floating point.


			HINTS TO IMPLEMENTORS

Scopes: The scope of a binding always begins (textually) at the point where the
binding appears, and extends strictly to its right.  This should simplify the
processing of environments in a single left-to-right scan.  Constant bindings are
unchangeable within a scope.

Environment restoration: Some constructs in the language are not allowed to have
a "net effect" on the environment, even though they may contain internal
bindings that will be in effect for some or all of their substructures.  Thus, a
program to process Interdoc scripts must make provision for restoration of saved
environments.  The amount of bookkeeping required to do this can be minimized
by the following observations:
- A rhs always produces a value (which may be an environment), which is
then bound to a name in the environment that obtained when the binding was
encountered.  Thus, occurrence of a bindingMode (=, :, :=, ←) signals a potential
restoration point.
- Any application (but not every invocation), any term containing infix ops,
and any mark will leave the environment unchanged.  Thus the appearance of
an operator, the bracket surrounding an argument list, or # signals a potential
restoration point; the environment to be saved is the one BEFORE the invocation
to its left.  (One-character lookahead before invocation is prudent.)
- Environments are NOT restored at the end of sequences.
- The environment at the end of a node is the final value of the Outer
component of the node's environment; unless the node contains persistent (:=)
bindings, this will be the same as the environment just prior to the node.

Coding tricks for common structures: 
- Attributes that tend to persist across boundaries in the dominant structure (e.g.,
Bravo character looks, which persist from paragraph to paragraph unless
explicitly changed by the user) should generally be represented by VAR (:)
bindings at a high-level node in the script; changes would then be represented
by persistent (:=) bindings at the points where they occur.
- Persistent bindings can also be used to record changes to "running" information
(e.g., section number, figure number).
- Information (e.g., marks) repeated in all subnodes of a node should be bound to
Sub, and hence associated with the subnodes implicitly.
- Very compact encodings can often be obtained by using single-character id's
for common sequences of items; each editor can have its own set, tuned to its
own usages.  When reading a script produced by a different editor, it will be
necessary to know the bindings for these abbreviations.
- Common editing operations may be represented by definitions with short names
(e.g., Look Keep 40 in Bravo might become "L.K[40]").


			CONSCIOUSLY POSTPONED

Lambda expressions.

Sets (Cf. Mitchell's Font example.)
	SET/LIST operators ($append $union ?)

Extend selection to CASE?

-------------------
Open questions:
	We should rethink our character assignments.
		check our characterset for disjointness with
			Interpress.DoubtfulChars.
		enlarge op with a few more single-character operators?  %, &, \

Not done:

-------------------

				ACKNOWLEDGEMENTS

Interpress, Interdoc working group, Lampson, Guttag, Donahue, others?


					APPENDICES
	ALTERNATE PRESENTATIONS OF THE SEMANTIC EQUATIONS

PRESENTATION BY FUNCTION

R: expression, environment > expression			-- Reduction

R<construct>(E) = CASE <construct> OF
	<primary op term>	=> R<primary>(E) op R<term>(E)
	<literal>		=> literal
	<id>			=> R<valOf(id, E)>(E)
	<name "." id>	=> R<valOf(id, R<name>(E))>(E)
	<"$" name>		=> "$" name
	<invocation "[" item* "]"> => apply(invocation, R<item*>(E), E)
	<"(" term "|" item1* "|" item2* ")"> =>
			if R<term>(E) then R<item1*>(E) else R<item2*>(E)
	<"(" item* ")">	=> "(" R<item*>(E) ")"
	<"{" item* "}">	=> "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}"
	<>			=> Nil
	<item1 item*>	=> R<item1>(E) R<item*>(B<item1>(E))
	<name mode rhs>	=> Nil
	<"'" item* "'">	=> item*
	<"[|" binding* "]">	=>
			[B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]
	<"[" invocation "|" binding* "]"> =>
		[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]
	<invocation "#">	=> R<invocation>(E) "#"
	<link>		=> link

C: expression > expression			-- Content of node

C<construct> = CASE <construct> OF
	<value>		=> value
	<label>		=> Nil
	<item1 item*>	=> C<item1> C<item*>

B: expression, environment > environment			-- Bindings

B<construct>(E) = CASE <construct> OF
	<id>			=> B<valOf(id, E)>(E)
	<name "." id>	=> B<valOf(id, R<name>(E))>(E)
	<"(" term "|" item1* "|" item2* ")">(E) =>
			if R<term>(E) then B<item1*>(E) else B<item2*>(E)
	<"(" item* ")">	=> B<item*>(E)
	<"{" item* "}">	=>
			locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))
	<item1 item*>	=> B<item*>(B<item1>(E))
	<name mode rhs>	=> bind(name, mode, R<rhs>(E), E)
	ELSE			=> E

M: expression > expression			-- Sequence of marks

M<construct> = CASE <construct> OF
	<name "#">		=> name
	<item1 item*>	=> M<item1> M<item*>
	ELSE			=> Nil

L: expression > expression			-- Sequence of links

L<construct> = CASE <construct> OF
	<id "@!">		=> id
	<item1 item*>	=> L<item1> L<item*>
	ELSE			=> Nil

S: expression > expression			-- Sequence of source links

S<construct> = CASE <construct> OF
	<name "!">		=> prefixes(name)
	<item1 item*>	=> S<item1> S<item*>
	ELSE			=> Nil

T: expression > expression			-- Sequence of target links

T<construct> = CASE <construct> OF
	<name "@">		=> name
	<item1 item*>	=> T<item1> T<item*>
	ELSE			=> Nil


SEMANTICS: REDUCTION AND BINDING

R: expression, environment > expression			-- Reduction
B: expression, environment > environment			-- Bindings
R&B<construct>(E) denotes the pair  R<construct>(E); B<construct>(E)

[Unless explicitly given below, B<construct>(E) = E.]

R<primary op term>(E) = R<primary>(E) op R<term>(E)

R<literal>(E) = literal

R&B<id>(E) = R&B<valOf(id, E)>(E)

R&B<name "." id>(E) = R&B<valOf(id, R<name>(E))>(E)

R<"$" name>(E) = "$" name

R<invocation "[" item* "]">(E) = apply(invocation, R<item*>(E), E)

R&B<"(" term "|" item1* "|" item2* ")">(E) =
			if R<term>(E) then R&B<item1*>(E) else R&B<item2*>(E)

R&B<"(" item* ")">(E) = "(" R<item*>(E) ")" ; B<item*>(E)

R&B<"{" item* "}">(E) = "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}";
	      locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))

R<>(E) = Nil

R&B<item1 item*>(E) = R<item1>(E) R<item*>(B<item1>(E));
					  B<item*>(B<item1>(E))

R&B<name mode rhs>(E) = Nil; bind(name, mode, R<rhs>(E), E)
	<name mode op term> = <name mode name op term>	-- Syntactic sugar

R<"'" item* "'">(E) = item*			--Usable only in rhs of binding

R<"[|" binding* "]">(E) = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]

R<"[" invocation "|" binding* "]">(E) =
	[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]

R<invocation "#">(E) = R<invocation>(E) "#"

R<link>(E) = link


SEMANTICS OF CONTENTS AND LABELS

C: expression > expression			-- Contents of node
M: expression > expression			-- Sequence of marks
L: expression > expression			-- Sequence of links
S: expression > expression			-- Sequence of source links
T: expression > expression			-- Sequence of target links

[These functions all return the empty string, Nil (which does not lengthen a
sequence), except as specified below.]

C<value> = value
C<label> = Nil
C<item1 item*> = C<item1> C<item*>

M<name "#"> = name
M<item1 item*> = M<item1> M<item*>

L<id "@!"> = id
L<item1 item*> = L<item1> L<item*>

S<name "!"> = prefixes(name)
S<item1 item*> = S<item1> S<item*>

T<name "@"> = name
T<item1 item*> = T<item1> T<item*>

prefixes(id) = id
prefixes(name "." id) = name "." id prefixes(name)


*start*
01902 00024 USt
Date: 15 Sept. 1981 10:28 am PDT (Tuesday)
From: Horning.pa
Subject: Notes from our Inter-X discussion yesterday
To: Lampson, Mitchell
cc: Horning

Butler's global message was that the current presentation of the formal semantics
was hard to grasp and confusing.
	It would be a mistake to circulate it at this point and frighten people off.
	We might be able to improve the situation by attempting to isolate a
	smaller semantic base, and explain more of the language as sugarings
		Literal values
		Structured value constructors
			Node
			Environment
			Vector ("Do we really need this?", he asks.)
		Bindings
		Generic operations
			Applications and Invocations
			Selections
		Specific functions
			Arithmetic
			Comparison
			Logical
			Subscript
			...

Butler questioned the desirability of a semantic class rhs, with values that could
only be used in bindings--specifically, quotations and environments.
	A "free-standing" environment as a transformation would give us a way
		to bind a set of rhs values, but delay putting the bindings into the
		environment.
	Rather than quoting some rhs values, we should simply not evaluate an
		rhs unless it is explicitly requested.

The ops should be presented as sugar for applications.

The nonterminal "sequence" should be "vector" (or some such).

The nonterminal "value" should be "content" (or some such).

A mark should be <universal> "#", rather than <invocation> "#".

Parentheses should NOT be used both to form vectors and to override precedence
(use functional notation for the latter).

VAR attributes and persistent bindings should NOT be used simply to record
attributes that tend to persist across boundaries in the dominant structure, but
ONLY to record the intent that something change "running" information.

Butler prefers "Interscript" to either "Interdoc" or "Interedit", although he is
uncertain why.