*start*
14263 00024 USt
Date: 5 March 1982 2:25 pm PST (Friday)
From: Mitchell.PA
Subject: Interdoc syntax and semantics
To: Interdoc.pa
Reply-To: Mitchell.pa

Here are the current versions of the Interdoc syntax and semantics that I
promised to distribute to you all at this morning's meeting.  I have laced the
semantic equations with some explanatory text to help you wade through them,
but I am under no illusions that this has made them easy to read.  Please feel
free to send me questions about them.  Such questions will undoubtedly help us
to clarify the presentation.

Jim M.

-------------------
GRAMMAR

script		::= versionId item
item		::= value | binding | label
value		::= term | node
term		::= primary | primary op term
op		::= "+" | "-" | "*" | "/"
primary	::= literal | invocation | application | selection | sequence
literal		::= Boolean | integer | hexint | real | string
invocation	::= name | universal
name		::= id ( "." id )*
universal	::= "$" name
application	::= invocation "[" item* "]"
selection	::= "(" term "|" item* "|" item* ")"
sequence	::= "(" ( value | binding )* ")"
node		::= "{" item* "}"
binding	::= name mode rhs
mode		::= "=" | ":" | ":=" | "←"
rhs		::= value | op term | "'" item* "'" | "[" [ invocation ] "|" binding* "]"
label		::= mark | link
mark		::= invocation "#"
link		::= id "@!" | name "@" | name "!"

NOTATION FOR ENVIRONMENTS

Environments bind identifiers to expressions, in various modes ("=", ":", ":=", "←"):
	Null denotes the "empty" environment
	[E | id m e] means "E with id mode m bound to e"
	locBinding(id, E) denotes the binding mode of id in E
		locBinding(id, Null) = None
		locBinding(id, [E | id' m e]) =
			if id=id' then m else locBinding(id, E)
	locVal(id, E) denotes the value locally bound to id in E
		locVal(id, Null) = Nil = ""
		locVal(id, [E | id' m e]) = if id=id' then e else locVal(id, E)

SEMANTIC FUNCTIONS

R: expression, environment --> expression			-- Reduction
   R is used for evaluating right-hand sides: identifiers, expressions, etc.

C: expression --> expression					-- Contents
   C is basically used to indicate which evaluated expressions become part of the
   content of a node 

B: expression, environment --> environment			-- Bindings
   B indicates the effect a binding has on an environment.  B and R are
   mutually recursive functions (e.g.,  the evaluation of an expression may
   cause some bindings to occur as well)

The following four semantic functions occur less frequently in any substantive
way in the semantics below.  You might wish to skip them until they occur in a
nontrivial manner in the semantics.

M: expression --> expression					-- Marks
   M indicates when an identifier is to be included in the mark set for a node

L: expression --> expression					-- Links
   L indicates link declarations

S: expression --> expression					-- Link sources
   S indicates a link to the set of nodes having associated target links

T: expression --> expression					-- Link targets
   T indicates that the node is to be included in the target set of all the names
   which are prefixes of the name to which the expression should evaluate

PRESENTATION BY FEATURE

[E is used to represent the value of the environment in which the feature
occurs.]

term ::= primary op term
op ::= "+" | "-" | "*" | "/"
	R = C = R<primary>(E) op R<term>(E)
	B = E
	M = L = S = T = Nil
-- Both the primary and the term must reduce to numbers; the arithmetic
operators are evaluated right-to-left (a la APL, without precedence) and bind less
tightly than application.

primary ::= literal
literal ::= Boolean | integer | hexint | real | string
	R = C = literal
	B = E
	M = L = S = T = Nil
-- The basic contents of a document.

invocation ::= id
	R = R<valOf(id, E)>(E)
	B = B<valOf(id, E)>(E)
    where
	valOf(id, E) = locVal(id, whereBound(id, E))	-- Gets innermost value
	whereBound(id, E) = CASE			-- Gets innermost binding
		locBinding(id, E) ~= None	=> E
		locBinding("Outer", E) ~= None	=>
						whereBound(id, locVal("Outer", E))
		True				=> Null
-- Both attributes and definitions are looked up in the current environment;
depending on the current binding of id, this may produce values and/or
bindings; if the binding's rhs was quoted, the expression is evaluated at the
point of invocation.

-- When an id is referred to and locBinding(id, E)=None, then the value is
sought recursively in locVal("Outer").  The (implicit) "outermost" environment
binds each id to the "universal" name $id.

invocation ::= name "." id 
	R = R<valOf(id, R<name>(E))>(E)
	B = B<valOf(id, R<name>(E))>(E)
-- Qualified names are treated as "nested" environments.

universal ::= "$" name
	R = C = "$" name
	B = E
	M = L = S = T = Nil
-- Names prefixed with a $ are presumed to be directly meaningful, and are not
looked up in the environment.

application ::= invocation "[" item* "]"
	R = apply(invocation, R<item*>(E), E)
	B = E
    where
	apply(invocation, value*, E) =
	    CASE R<invocation>(E) OF
		"$equal"	=> value1 = value2
		"$greater"	=> value1 > value2
		. . .
		"$subscript"	=> value1[value2]	-- value1: sequence, value2: int
		"$contents"	=> "(" C<inner(value1)> ")"
		"$marks"	=> "(" M<inner(value1)> ")"
		"$links"	=> "(" L<inner(value1)> ")"
		"$sources"	=> "(" S<inner(value1)> ")"
		"$targets"	=> "(" T<inner(value1)> ")"
	    ELSE   => R<invocation>([[Null | "Outer" "=" E] | "Value" "=" value*])
	inner("{" value* "}") = value*
-- If the invocation does not evaluate to one of the standard external function
names, the current environment is augmented with a binding of the value of the
argument list to the identifier Value, and the value is the result of the
invocation in that environment; this allows function definition within the
language.

selection ::= "(" term "|" item1* "|" item2* ")"
	R = if R<term>(E) then R<item1*>(E) else R<item2*>(E)
	B = if R<term>(E) then B<item1*>(E) else B<item2*>(E)
-- The notation for selections (conditionals) is borrowed from Algol 68:
	( <test> | <true part> | <false part> )
This is consistent with our principles of using balanced brackets for compound
constructions and avoiding syntactically reserved words; the true part and false
part may each contain an arbitrary number of items (including none). 

sequence ::= "(" item* ")"
	R = C = "(" R<item*>(E) ")"
	B = B<item*>(E)
	M = L = S = T = Nil
-- Parentheses group a sequence of items as a single value; bindings in the
sequence affect the environment of items to the right in the containing node,
but labels are disallowed.  Parentheses may also be used to override the
right-to-left evaluation of arithmetic operators; an operand sequence must reduce
to a single numeric value. 

node ::= "{" item* "}"
	R = C = "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}"
	B = locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E])))
	M = L = S = T = Nil
-- Nodes have nested environments, and affect the containing environment only
through persistent (:=) bindings to ids with outer VAR (:) bindings.  The items
of a node are implicitly prefixed with the id Sub, which may be bound to any
information intended to be common to all subnodes in a scope.

item* ::= ""
	R = C = M = L = S = T = Nil
	B = E
-- The empty sequence of items has no value and no effect; this is the basis for
the following recursive definition.

item* ::= item1 item*
	R = R<item1>(E) R<item*>(B<item1>(E))
	B = B<item*>(B<item1>(E))
    For F in {C, M, L, S, T}:
	F = F<item1> F<item*>
-- In general, the value of a sequence of items is just the sequence of item
values; binding items affect the environment of items to their right; Nil does not
change the length of a result sequence.

binding ::= name mode rhs
	R = Nil
	B = bind(name, mode, R<rhs>(E), E)
    where
	bind(id, mode, value, E) = CASE
		bindingOf(id, E) = "="	=> E		-- Can't rebind constants
		mode = ":=" 			=> assign(id, value, E)
		True				=> [E | id mode value]
	bind(id "." name, mode, value, E) =
		[E | id bindingOf(id, E) bind(name, mode, value, valOf(id, E))]
	bindingOf(id, E) = locBinding(id, whereBound(id, E))
	assign(id, value, E) = CASE
		locBinding(id, E) = ":"	=> [E | id ":" value]
		bindingOf(id, E) = ":"	=> bind("Outer." id, ":=", value, E)
		True				=> E 		-- Can only assign to vars
-- This adds a single binding to E; bindings have no other "side effects" and no
value.

-- Each environment, E, initially contains only its "inherited" environment
(bound to the id Outer).  Most bindings take place directly in E.  To allow for
"persistent" bindings, the value of a bind(id, ":=", val, E) will change E by
rebinding id in the "innermost" environment (following the chain of Outers) in
which it is bound, if that binding has the binding ":" (Var).  Identifiers bound
with binding "=" (Const) may not be rebound in inner environments.

binding ::= name mode op term
	= <name mode name op term>
-- This is just a convenient piece of syntactic sugar for the common case of
updating a binding.

rhs ::= "'" item* "'"
	R = item*
-- If the rhs of a binding is surrounded by single quotes, it will be evaluated in
the environments where the name is invoked, rather than the environment in
which the binding is made.


rhs ::= "[|" binding* "]"
	R = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]
-- This creates a new environment value that may be used much like a record.

rhs ::= "[" invocation "|" binding* "]"
	R =[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null]
-- This creates a new environment value that is an extension of an existing one.

mark ::= invocation "#"
	R = R<invocation>(E) "#"
	M = invocation
	B = E
	C = L = S = T = Nil
-- This gives the containing node the property denoted by the mark to which
the invocation reduces.

link ::= id "@!"
	R = id "@!"
	L = id
	B = E
	C = M = S = T = Nil
-- This defines the scope of the set of links whose "main" component is id.

-- A label N! on a node makes that node a "target" of the link N (and its
prefixes); a label N@ makes it a "source."  The "main" identifier of a link must
be declared (using id@!) at the root of a subtree containing all its sources and
targets.  The link represents a set of directed arcs, one from each of its sources to
each of its targets.  Multiple target labels make a node the target of multiple
links.  A target label that appears only on a single node places it in a singleton
set, i.e., identifies it uniquely.

link ::= name "@"
	R = name "@"
	S = name
	B = E
	C = M = L = T = Nil
-- This identifies the containing node as a "source" of the link name.

link ::= name "!"
	R = name "!"
	T = prefixes(name)
	B = E
	C = M = L = S = Nil
    where
	prefixes(id) = id
	prefixes(name "." id) = name "." id prefixes(name)
-- This identifies the containing node as a "target" of each of the links that is a
prefix of name.



DISCUSSION

Each environment, E, initially contains only its "inherited" environment (bound
to the id Outer).  Most bindings take place directly in E.  To allow for
"persistent" bindings, the value of a bind(id, ":=", val, E) will change E by
rebinding id in the "innermost" environment (following the chain of Outers) in
which it is bound, if that binding has the binding ":" (Var).  Identifiers bound
with binding "=" (Const) may not be rebound in inner environments.

If the rhs of a binding is surrounded by single quotes, it will be evaluated in
the environments where the name is invoked, rather than the environment in
which the binding is made.

When an id is referred to and locBinding(id, E)=None, then the value is sought
recursively in locVal("Outer").  The (implicit) "outermost" environment binds
each id to the "universal" name $id.

Nodes are delimited by brackets.  The contents of each node are implicitly
prefixed by Sub, which will generally be bound in the containing environment
to a quoted expression performing some bindings, and perhaps supplying some
labels (marks and links).

Parentheses are used to delimit sequence values.  Square brackets are used to
delimit the argument list of an operator application and to denote environment
constructors, which behave much like records.

Expressions involving the four infix ops (+, -, *, /) are evaluated right-to-left
(a la APL); since we expect expressions to be short, we have not imposed
precedence rules.

The notation for selections (conditionals) is borrowed from Algol 68:
	( <test> | <true part> | <false part> )
This is consistent with our principles of using balanced brackets for compound
constructions and avoiding syntactically reserved words; the true part and false
part may each contain an arbitrary number of items (including none). 


A label N! on a node makes that node a "target" of the link N (and its prefixes);
a label N@ makes it a "source."  The "main" identifier of a link must be declared
(using id@!) at the root of a subtree containing all its sources and targets.  The
link represents a set of directed arcs, one from each of its sources to each of its
targets.  Multiple target labels make a node the target of multiple links.  A target
label that appears only on a single node places it in a singleton set, i.e.,
identifies it uniquely.

GRAMMATICAL FEATURE X SEMANTIC FUNCTION MATRIX

FEATURES:			       FUNCTIONS:	R C	B  M	L S	T
term ::= primary op term				+  =	-   -	-   -	-	
primary ::= literal					==	-   -	-   -	-
invocation ::= id					+  -	+  -	-   -	-
invocation ::= name "." id 				+  -	+  -	-   -	-
universal ::= "$" name				==	-   -	-   -	-
application ::= invocation "[" item* "]"		+  -	-   -	-   -	-
selection ::= "(" term "|" item1* "|" item2* ")"	+  -	+  -	-   -	-
node ::= "{" item* "}"				+  =	+  -	-   -	-
sequence ::= "(" ( value | binding )* ")"		+  =	+  -	-   -	-
item* ::= item1 item*				+  +	+  +	+  +	+
binding ::= name mode rhs				-   -	+  -	-   -	-
rhs ::= "'" item* "'"					+  -	-   -	-   -	-
rhs ::= "[|" binding* "]"				+  -	-   -	-   -	-
rhs ::= "[" invocation "|" binding* "]"		+  -	-   -	-   -	-
mark ::= invocation "#"				+  -	-   +	-   -	-
link ::= id "@!"					=-	-   -	+   -	-
link ::= name "@"					=-	-   -	-   +	-
link ::= name "!"					=-	-   -	-   -	+

- Semantic function produces Nil or E or does not apply.
+ Non-trivial semantic equation.
=For R: passes value unchanged; for C: value same as R.


-------------------

*start*
16832 00024 USt
Date: 12 May 1982 4:55 pm PDT (Wednesday)
From: Mitchell.PA
To: Interdoc.PA
Subject: Interdoc syntax and semantics
Categories: Save
Reply-To: Mitchell.PA


-------------------

The syntax has been brought up to date and the semantics have been updated to add some new features

(1) the meaning of a tag (formerly mark) includes evaluating the tag name as well so that its default bindings can be obtained simply by writing the tag (e.g., PARA$ places the tag PARA on the node and evaluates PARA%). 

(2) the notion of a scope (the "unit" that owns  an environment) has been added. 

-------------------


GRAMMAR

script    	::= versionId node
versionID	::= "Interscript/Interchange/1.0 "
content	::= term | node
term    	::= primary | primary op term
op		::= "+" | "-" | "*" | "/"
primary	::= literal | invocation | indirection | application | selection | vector
literal    	::= Boolean | integer | intSequence | real | string | universal
universal	::= ucID ( "." ucID )*
name    	::= id ( "." id )*
invocation	::= name
indirection	::= name "%"
application	::= ( name | universal ) "[" scope* "]"
selection	::= "(" term "|" item* "|" item* ")"
vector	::= "(" scope* ")"
node    	::= "{" nodeItem* "}"
nodeItem	::= label | scope
scope		::= binding* content content*
binding	::= name mode rhs
mode    	::= "←" | "=" | ":="
rhs    	::= content | op term | "'" item* "'" | "[" [ primary ] "|" binding* "]"
item    	::= label | binding | content
label    	::= tag | link
tag    	::= universal "$"
link    	::= id "@!" | name "@" | name "!"

NOTATION FOR ENVIRONMENTS

Environments bind identifiers to expressions, in various modes ("=", ":", ":=", "←"):
     Null denotes the "empty" environment
     [E | id m e] means "E with id mode m bound to e"
     locBinding(id, E) denotes the binding mode of id in E
         locBinding(id, Null) = None
         locBinding(id, [E | id' m e]) =
                 if id=id' then m else locBinding(id, E)
     locVal(id, E) denotes the value locally bound to id in E
         locVal(id, Null) = Nil = ""
         locVal(id, [E | id' m e]) = if id=id' then e else locVal(id, E)

SEMANTIC FUNCTIONS

R and B are intended to propagate the effects of the environment into an expression.

R: expression, environment --> expression                    -- Reduction
   R is used for evaluating right-hand sides: identifiers, expressions, etc.

B: expression, environment --> environment              -- Bindings
   B indicates the effect a binding has on an environment.  B and R are
   mutually recursive functions (e.g.,  the evaluation of an expression may
   cause some bindings to occur as well)

The following five functions all apply to expressions independent of environment and are intended to be used on the result of reducing an expression in an environment.

C: expression --> expression                                -- Contents
   C is basically used to indicate which evaluated expressions become part of the
   content of a node 

The following four semantic functions occur less frequently in any substantive
way in the semantics below.  You might wish to skip them until they occur in a
nontrivial manner in the semantics.

T: expression --> expression                                -- Tags
   T indicates when an identifier is to be included in the tag set for a node

L: expression --> expression                                -- Links
   L indicates link declarations

LF: expression --> expression                                -- Links From
   LF indicates a link to the set of nodes having associated target links

LT: expression --> expression                               -- Links To
   LT indicates that the node is to be included in the target set of all the names
   which are prefixes of the name to which the expression should evaluate

PRESENTATION BY FEATURE

[E is used to represent the value of the environment in which the feature
occurs.]

script ::= versionId node
     R = C = R<node>(EXTERNAL)
     B=EXTERNAL
     T = L = LF = LT = Nil
-- a script is evaluated in the pre-existing EXTERNAL environment common to all Interscript/Interchange/1.0 scripts

term ::= primary op term
op ::= "+" | "-" | "*" | "/"
     R = C = R<primary>(E) op R<term>(E)
     B = E
     T = L = LF = LT = Nil
-- Both the primary and the term must reduce to numbers; the arithmetic
operators are evaluated right-to-left (a la APL, without precedence) and bind less
tightly than application.

primary ::= literal
literal ::= Boolean | integer | intSequence | real | string | universal
     R = C = literal
     B = E
     T = L = LF = LT = Nil
-- The basic contents of a document.

universal ::= ucID
     R = C = ucID
     B = E
     T = L = LF = LT = Nil
-- universals (all upper case) are presumed to be directly meaningful, and are not looked up in the environment.

universal ::= universal "." ucID
     R = C = universal "." ucID
     B = E
     T = L = LF = LT = Nil
-- a qualified universal also just stands for itself

invocation ::= id
     R = R<valOf(id, E)>(E)
     B = B<valOf(id, E)>(E)
    where
     valOf(id, E) = CASE
         whereBound(id, E) = Null	=> MakeUniversal(id)
         whereBound(id, E) = Nil		=> Nil
         True					=> locVal(id, whereBound(id, E))
    and
     whereBound(id, E) = CASE                    -- Gets innermost binding
         locBinding(id, E) ~= None	=> E
         locBinding("Outer", E) ~= None	=> whereBound(id, locVal("Outer", E))
         E=EXTERNAL			=> Null
         True					=> Nil
-- Makeuniversal(id) produces the universal corresponding to id (in the current version its uppercase equivalent)
-- Both attributes and definitions are looked up in the current environment;
depending on the current binding of id, this may produce values and/or
bindings; if the binding's rhs was quoted, the expression is evaluated at the
point of invocation.
-- When id is referred to and locBinding(id, E)=None, then the value is sought recursively in locVal("Outer", E).  The outermost environment, EXTERNAL, binds each id to an universal which is the uppercase version of the id.  Otherwise, the value of the id is assumed to be Nil

invocation ::= name "." id 
     R = R<valOf(id, R<name>(E))>(E)
     B = B<valOf(id, R<name>(E))>(E)
-- Qualified names are treated as "nested" environments.

indirection ::= name "%" 
     R = R<valOf(id, R<name>(E))>(E)
     B = B<valOf(id, R<name>(E))>(E)
-- Indirection combines the facility for invocation plus recording the fact that the expansion resulted from evaluating a particular name (recording the indirection is not yet included in these semantics).

application ::= name "[" scope* "]"
     R = apply(name, R<scope*>(E), E)
     B = E
where
   apply(name, value*, E) =
       CASE R<name>(E) OF
          "EQUAL"	=> value1 = value2
          "GREATER"=> value1 > value2
          . . .
          "SUBSCRIPT"=> value1[value2]	-- value1: sequence, value2: int
          "CONTENTS"=> "(" C<inner(value1)> ")"
          "TAGS"	=> "(" T<inner(value1)> ")"    -- ?? this doesn't seem right
          "LINKS"	=> "(" L<inner(value1)> ")"
          "SOURCES"	=> "(" LF<inner(value1)> ")"
          "TARGETS"=> "(" LT<inner(value1)> ")"
          ELSE	=> R<name>([[Null | "Outer" "=" E] | "Value" "=" value*])
and where
   inner("{" value* "}") = value*
-- If the name does not evaluate to one of the standard external function
names, the current environment is augmented with a binding of the value of the
argument list to the identifier Value, and the value is the result of the
invocation in that environment; this allows function definition within the
language.

selection ::= "(" term "|" nodeItem1* "|" nodeItem2* ")"
     R = if R<term>(E) then R<nodeItem1*>(E) else R<nodeItem2*>(E)
     B = if R<term>(E) then B<nodeItem1*>(E) else B<nodeItem2*>(E)
-- The notation for selections (conditionals) is borrowed from Algol 68:
     ( <test> | <true part> | <false part> )
This is consistent with our principles of using balanced brackets for compound
constructions and avoiding syntactically reserved words; the true part and false
part may each contain an arbitrary number of nodeItems (including none). 

vector ::= "(" scope* ")"
     R = C = "(" R<scope*>(E) ")"
     B = B<scope*>(E)
     T = L = LF = LT = Nil
-- Parentheses group a sequence of values as a single, vector value; bindings in the sequence of scopes affect the environment of scopes to the right in the containing node, but labels are disallowed.  Parentheses may also be used to override the right-to-left evaluation of arithmetic operators; an operand sequence must reduce to a single numeric value. 

node ::= "{" nodeItem* "}"
     R = C = "{" R<"Sub$" nodeItem*>([Null | "Outer" "=" E]) "}"
     B = locVal("Outer", (B<"Sub" nodeItem*>([Null | "Outer" "=" E])))
     T = L = LF = LT = Nil
-- Nodes have nested environments and affect the containing environment only
through global (:=) bindings.  The nodeItems of a node are implicitly prefixed with the id Sub, which may be bound to any information intended to be common to all subnodes in a scope.

nodeItem* ::= ""
     R = C = T = L = LF = LT = Nil
     B = E
-- The empty sequence of items has no value and no effect; this is the basis for
the following recursive definition.


nodeItem* ::= binding* content1 content* 
     R = R<content*>(R<content1>(B<binding*>(E))
     B = B<content*>(B<content1>(B<binding*>(E))
     C = C<content*>(C<content1>(B<binding*>(E))
   For F in {T, L, LF, LT}:
     F = F<label> F<nodeItem*>

nodeItem* ::= label nodeItem*
     R = R<label>(E) R<nodeItem*>(B<label>(E))
     B = B<nodeItem*>(B<label>(E))
     C = Nil
   For F in {T, L, LF, LT}:
     F = F<label> F<nodeItem*>
-- In general, the value of a sequence of nodeItems is just the sequence of nodeItem values; binding items affect the environment of items to their right; Nil does not change the length of a result sequence.

nodeItem* ::= scope nodeItem*
   For F in {R, B, C, T, L, LF, LT}
     F = F<scope>(E) F<nodeItem*>(B<scope>(E))
   For F in {C, T, L, LF, LT}:
     F = F<scope> F<nodeItem*>
-- In general, the value of a sequence of nodeItems is just the sequence of nodeItem values; binding items affect the environment of items to their right; Nil does not change the length of a result sequence.

item* ::= ""
     R = C = T = L = LF = LT = Nil
     B = E
-- The empty sequence of items has no value and no effect; this is the basis for
the following recursive definition.

item* ::= item1 item*
     R = R<item1>(E) R<item*>(B<item1>(E))
     B = B<item*>(B<item1>(E))
    For F in {C, T, L, LF, LT}:
     F = F<item1> F<item*>
-- In general, the value of a sequence of items is just the sequence of item
values; binding items affect the environment of items to their right; Nil does not
change the length of a result sequence.

binding ::= name mode rhs   -- how can we change this to create micro-scopes??
     R = Nil
     B = bind(name, mode, R<rhs>(E), E)
    where
     bind(id, mode, value, E) = CASE
         bindingOf(id, E) = "="	=> E            -- Can't rebind constants
         mode = ":="			=> assign(id, value, E)
         True				=> [E | id mode value]
     bind(id "." name, mode, value, E) =
         [E | id bindingOf(id, E) bind(name, mode, value, valOf(id, E))]
     bindingOf(id, E) = locBinding(id, whereBound(id, E))
     assign(id, value, E) = CASE
         locBinding(id, E) = ":"	=> [E | id ":" value]
         bindingOf(id, E) = ":"	=> bind("Outer." id, ":=", value, E)
         True				=> E             -- Can only assign to vars
-- This adds a single binding to E; bindings have no other "side effects" and no
value.

-- Each environment, E, initially contains only its "inherited" environment
(bound to the id Outer).  Most bindings take place directly in E.  To allow for
"persistent" bindings, the value of a bind(id, ":=", val, E) will change E by
rebinding id in the "innermost" environment (following the chain of Outers) in
which it is bound, if that binding has the binding ":" (Var).  Identifiers bound
with binding "=" (Const) may not be rebound in inner environments.

binding ::= name mode op term
     R = Nil
     B = bind(name, mode, R<name op term>(E), E)
-- This is just a convenient piece of syntactic sugar for the common case of
updating a binding.

rhs ::= "'" item* "'"
     R = item*
-- If the rhs of a binding is surrounded by single quotes, it will be evaluated in
the environments where the name is invoked, rather than the environment in
which the binding is made.


rhs ::= "[|" binding* "]"
     R = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]
-- This creates a new environment value that may be used much like a record.

rhs ::= "[" [ item* ] "|" binding* "]"
     R =[B<binding*>([R<item*>(E) | "Outer" "=" E]) | "Outer" "=" Null]
-- This creates a new environment value that is an extension of an existing one.


tag ::= universal "$"
     R = R<"default" "." universal "%">(E)
     B = B<"default" "." universal "%">(E)
     C = C<"default" "." universal "%">(E)
     T = universal
     L = LF = LT = Nil
-- This gives the containing node the property denoted by the tag named by the universal and also evaluates the indirection "default.universal%".

link ::= id "@!"
     R = id "@!"
     B = E
     L = id
     C = T = LF = LT = Nil
-- This defines the scope of the set of links whose "main" component is id.

-- A label N! on a node makes that node a "target" of the link N (and its
prefixes); a label N@ makes it a "source."  The "main" identifier of a link must
be declared (using id@!) at the root of a subtree containing all its sources and
targets.  The link represents a set of directed arcs, one from each of its sources to
each of its targets.  Multiple target labels make a node the target of multiple
links.  A target label that appears only on a single node places it in a singleton
set, i.e., identifies it uniquely.

link ::= name "@"
     R = name "@"    -- ?? why isn't R=Nil?
     B = E
     LF = name
     C = T = L = LT = Nil
-- This identifies the containing node as a "source" of the link name.

link ::= name "!"
     R = name "!"    -- ?? why isn't R=Nil?
     B = E
     LT = prefixes(name)
     C = T = L = LF = Nil
    where
     prefixes(id) = id
     prefixes(name "." id) = name "." id prefixes(name)
-- This identifies the containing node as a "target" of each of the links that is a
prefix of name.



NOTES

Each environment, E, initially contains only its "inherited" environment (bound
to the id Outer).  Most bindings take place directly in E.  To allow for
"persistent" bindings, the value of a bind(id, ":=", val, E) will change E by
rebinding id in the "innermost" environment (following the chain of Outers) in
which it is bound, if that binding has the binding ":" (Var).  Identifiers bound
with binding "=" (Const) may not be rebound in inner environments.

If the rhs of a binding is surrounded by single quotes, it will be evaluated in
the environments where the name is invoked, rather than the environment in
which the binding is made.

When an id is referred to and locBinding(id, E)=None, then the value is sought
recursively in locVal("Outer").  The (implicit) "outermost" environment binds
each id to the "universal" name formed by using the uppercase version of each character of id.

Nodes are delimited by brackets.  The contents of each node are implicitly
prefixed by Sub, which will generally be bound in the containing environment
to a quoted expression performing some bindings, and perhaps supplying some
labels (tags and links).

Parentheses are used to delimit sequence values.  Square brackets are used to
delimit the argument list of an operator application and to denote environment
constructors, which behave much like records.

Expressions involving the four infix ops (+, -, *, /) are evaluated right-to-left
(a la APL); since we expect expressions to be short, we have not imposed
precedence rules.

The notation for selections (conditionals) is borrowed from Algol 68:
     ( <test> | <true part> | <false part> )
This is consistent with our principles of using balanced brackets for compound
constructions and avoiding syntactically reserved words; the true part and false
part may each contain an arbitrary number of items (including none). 


A label N! on a node makes that node a "target" of the link N (and its prefixes);
a label N@ makes it a "source."  The "main" identifier of a link must be declared
(using id@!) at the root of a subtree containing all its sources and targets.  The
link represents a set of directed arcs, one from each of its sources to each of its
targets.  Multiple target labels make a node the target of multiple links.  A target
label that appears only on a single node places it in a singleton set, i.e.,
identifies it uniquely.

-------------------

*start*
00361 00024 USt
Date: 20-Jul-82 12:03:16 PDT (Tuesday)
From: Karlton.PA
Subject: Re: The Interscript grammar
In-reply-to: Mitchell's message of 18 July 1982 7:42 pm PDT (Sunday)
To: Mitchell
cc: Karlton, Stepak

I took a quick look at the grammar and I noticed

rhs	::=	... |  "[" [ item* ] "|" binding* "]"

Why are there brackets around 'item*'?

PK

*start*
00722 00024 USt
Date: 20 July 1982 5:58 pm PDT (Tuesday)
From: Mitchell.PA
Subject: Re: The Interscript grammar
In-reply-to: Karlton's message of 20-Jul-82 12:03:16 PDT (Tuesday)
To: Karlton
cc: Mitchell, Stepak

rhs	::=	... |  "[" [ item* ] "|" binding* "]"

The brackets indicate that the item* sequence is optional.  One can write
something like

Times ← [font | face.style ← ROMAN size←10*pt]

or, something like

complexNum ← [| real←1.0 imag←2.0]

In the first case, Times is initialized to the value of font (which better be a
record/environment) and then the two following bindings are added.  In the
second, complexNum is bound to a Null environment augmented with the
bindings to real and imag.

Jim M.

*start*
00278 00024 USt
Date: 20-Jul-82 18:25:26 PDT (Tuesday)
From: Karlton.PA
Subject: Re: The Interscript grammar
In-reply-to: Mitchell's message of 20 July 1982 5:58 pm PDT (Tuesday)
To: Mitchell
cc: Karlton, Stepak

Yes, but how is "item*" different from "[ item* ]"?

PK

*start*
00510 00024 USt
Date: 21 July 1982 9:51 am PDT (Wednesday)
From: Mitchell.PA
Subject: Re: The Interscript grammar
In-reply-to: Karlton's message of 20-Jul-82 18:25:26 PDT (Tuesday)
To: Karlton
cc: Mitchell, Stepak

The brackets around item* are part of the BNF (note that they are not in quotes)
and only indicate that the item* is optional:

rhs	::=	... |  "[" [ item* ] "|" binding* "]"

An alternate way of specifying this is

rhs ::= "[" "|" binding* "]"
rhs ::= "[" item* "|" binding* "]"

Jim M

*start*
00649 00024 USt
Date: 22-Jul-82 11:50:27 PDT (Thursday)
From: Karlton.PA
Subject: Interscript is LALR
To: Mitchell
cc: Stepak, Karlton

I went to Satterthwaite and he fixed up my grammar for me. The problem I was having was in my definition
	itemlist ::=  <null> | item | <itemlist> item
It turns out that this allows an infinte number of <null>s at the front of the list and is ambiguous. The trick (which I don't think I would have guessed in quite some time) is to redefine it to be
	itemlist ::=  <null> | itemlist'
	itemlist' ::= item | itemlist' item


Now my problem is to convert this LALR grammar into an obvious LL[1] grammar.

PK
*start*
01341 00024 US 
Date: 27-Jul-82 15:29:28 PDT (Tuesday)
From: Karlton.PA
Subject: Interscript grammar
To: Mitchell, Horning
cc: Karlton

I wish to unilaterally change the grammar in minor way (just to make my life a little easier), and I wanted to run the change through you. Currently the following definitions exist

	string		::=  "<" stringElem* ">"
	stringElem	::=  stringChar | intSequence
	stringChar	::=  <<any character but ">" or "#">>
	intSequence	::=  "#" intOrHex* "#"
	intOrHex	::=  integer | hexChar hexChar

as well as

	literal		::= ... | intSequence | ...

What I wish to do is to define a different category for a sequence of values that may exist in a string. As long as programs are creating these strings, I want to restrict the escape sequences in strings to be a sequence of hex characters. This would entail making the following changes

	stringElem	::=  stringChar | hexSequence
	hexSequence	::=  "#" hex* "#"
	hex		::=  hexChar hexChar

What is the value of having intSequence as a possible literal? (Of course a vector of integers has lots of uses, but that is not what I am wondering about.) It might just be the right thing to eliminate it entirely from the grammar. It just allows a redundant way to say something and adds no function. I agree that an escape sequence is important, but there is one.

PK
*start*
00577 00024 US 
Date: 27-Jul-82 16:03:23 PDT
From: Horning.pa
Subject: Re: Interscript grammar
To: Karlton
cc: Mitchell, Horning.pa

Phil,

I believe that section of the grammar was copied pretty slavishly from Interpress, with not too much thought about whether we needed the flexibility. Of course, Interpress uses vectors of integers for lots of things that have nothing to do with strings, and takes the position that a string is just a special case.

I certainly don't have strong feelings on this point.

Have you asked Bob Ayers his feelings about it?

Jim H.
 
*start*
01478 00024 US 
Date: 23-Jul-82 11:33:13 PDT (Friday)
From: Karlton.PA
Subject: Interscript TWG meeting of Friday, 16 July
To: Interdoc

We had a short(?) meeting at which the subcommittees reported their progress.

The implementation group reported no progress on the parser, but Jim Mitchell will send a copy of the grammar to Karlton over the weekend and he will start doing working on the parser.

The 820 group now has a PASCAL compiler. They have contracted with Bill Duvall to build a "crummy, dirty" editor in C that they will interface to. It was decided that interacting with Word-Star was too messy. A point was raised questioning whether SDD should continue working on an 820 based editor considering the reorganization into OSD and OPD that has recently taken place.

The document modelling group reported that they decided to start freezing things so that progress can begin on pinning down some details on the semantics of those items about which there was a consensus. A request was made that someone show how bold, italic and paragraph margins would end up looking. It was explained that boldness would probably be related to some weight associated with a font and that italic would be indicated by the value assigned to the slant. Paragraph margins were a little less clear. [At this point there was a slight bit of contention about what was agreed to so far.] 

There was a call for example scripts, and Jean-Marie has volunteered to gather some.

PK