*start* 14263 00024 USt Date: 5 March 1982 2:25 pm PST (Friday) From: Mitchell.PA Subject: Interdoc syntax and semantics To: Interdoc.pa Reply-To: Mitchell.pa Here are the current versions of the Interdoc syntax and semantics that I promised to distribute to you all at this morning's meeting. I have laced the semantic equations with some explanatory text to help you wade through them, but I am under no illusions that this has made them easy to read. Please feel free to send me questions about them. Such questions will undoubtedly help us to clarify the presentation. Jim M. ------------------- GRAMMAR script ::= versionId item item ::= value | binding | label value ::= term | node term ::= primary | primary op term op ::= "+" | "-" | "*" | "/" primary ::= literal | invocation | application | selection | sequence literal ::= Boolean | integer | hexint | real | string invocation ::= name | universal name ::= id ( "." id )* universal ::= "$" name application ::= invocation "[" item* "]" selection ::= "(" term "|" item* "|" item* ")" sequence ::= "(" ( value | binding )* ")" node ::= "{" item* "}" binding ::= name mode rhs mode ::= "=" | ":" | ":=" | "←" rhs ::= value | op term | "'" item* "'" | "[" [ invocation ] "|" binding* "]" label ::= mark | link mark ::= invocation "#" link ::= id "@!" | name "@" | name "!" NOTATION FOR ENVIRONMENTS Environments bind identifiers to expressions, in various modes ("=", ":", ":=", "←"): Null denotes the "empty" environment [E | id m e] means "E with id mode m bound to e" locBinding(id, E) denotes the binding mode of id in E locBinding(id, Null) = None locBinding(id, [E | id' m e]) = if id=id' then m else locBinding(id, E) locVal(id, E) denotes the value locally bound to id in E locVal(id, Null) = Nil = "" locVal(id, [E | id' m e]) = if id=id' then e else locVal(id, E) SEMANTIC FUNCTIONS R: expression, environment --> expression -- Reduction R is used for evaluating right-hand sides: identifiers, expressions, etc. C: expression --> expression -- Contents C is basically used to indicate which evaluated expressions become part of the content of a node B: expression, environment --> environment -- Bindings B indicates the effect a binding has on an environment. B and R are mutually recursive functions (e.g., the evaluation of an expression may cause some bindings to occur as well) The following four semantic functions occur less frequently in any substantive way in the semantics below. You might wish to skip them until they occur in a nontrivial manner in the semantics. M: expression --> expression -- Marks M indicates when an identifier is to be included in the mark set for a node L: expression --> expression -- Links L indicates link declarations S: expression --> expression -- Link sources S indicates a link to the set of nodes having associated target links T: expression --> expression -- Link targets T indicates that the node is to be included in the target set of all the names which are prefixes of the name to which the expression should evaluate PRESENTATION BY FEATURE [E is used to represent the value of the environment in which the feature occurs.] term ::= primary op term op ::= "+" | "-" | "*" | "/" R = C = R<primary>(E) op R<term>(E) B = E M = L = S = T = Nil -- Both the primary and the term must reduce to numbers; the arithmetic operators are evaluated right-to-left (a la APL, without precedence) and bind less tightly than application. primary ::= literal literal ::= Boolean | integer | hexint | real | string R = C = literal B = E M = L = S = T = Nil -- The basic contents of a document. invocation ::= id R = R<valOf(id, E)>(E) B = B<valOf(id, E)>(E) where valOf(id, E) = locVal(id, whereBound(id, E)) -- Gets innermost value whereBound(id, E) = CASE -- Gets innermost binding locBinding(id, E) ~= None => E locBinding("Outer", E) ~= None => whereBound(id, locVal("Outer", E)) True => Null -- Both attributes and definitions are looked up in the current environment; depending on the current binding of id, this may produce values and/or bindings; if the binding's rhs was quoted, the expression is evaluated at the point of invocation. -- When an id is referred to and locBinding(id, E)=None, then the value is sought recursively in locVal("Outer"). The (implicit) "outermost" environment binds each id to the "universal" name $id. invocation ::= name "." id R = R<valOf(id, R<name>(E))>(E) B = B<valOf(id, R<name>(E))>(E) -- Qualified names are treated as "nested" environments. universal ::= "$" name R = C = "$" name B = E M = L = S = T = Nil -- Names prefixed with a $ are presumed to be directly meaningful, and are not looked up in the environment. application ::= invocation "[" item* "]" R = apply(invocation, R<item*>(E), E) B = E where apply(invocation, value*, E) = CASE R<invocation>(E) OF "$equal" => value1 = value2 "$greater" => value1 > value2 . . . "$subscript" => value1[value2] -- value1: sequence, value2: int "$contents" => "(" C<inner(value1)> ")" "$marks" => "(" M<inner(value1)> ")" "$links" => "(" L<inner(value1)> ")" "$sources" => "(" S<inner(value1)> ")" "$targets" => "(" T<inner(value1)> ")" ELSE => R<invocation>([[Null | "Outer" "=" E] | "Value" "=" value*]) inner("{" value* "}") = value* -- If the invocation does not evaluate to one of the standard external function names, the current environment is augmented with a binding of the value of the argument list to the identifier Value, and the value is the result of the invocation in that environment; this allows function definition within the language. selection ::= "(" term "|" item1* "|" item2* ")" R = if R<term>(E) then R<item1*>(E) else R<item2*>(E) B = if R<term>(E) then B<item1*>(E) else B<item2*>(E) -- The notation for selections (conditionals) is borrowed from Algol 68: ( <test> | <true part> | <false part> ) This is consistent with our principles of using balanced brackets for compound constructions and avoiding syntactically reserved words; the true part and false part may each contain an arbitrary number of items (including none). sequence ::= "(" item* ")" R = C = "(" R<item*>(E) ")" B = B<item*>(E) M = L = S = T = Nil -- Parentheses group a sequence of items as a single value; bindings in the sequence affect the environment of items to the right in the containing node, but labels are disallowed. Parentheses may also be used to override the right-to-left evaluation of arithmetic operators; an operand sequence must reduce to a single numeric value. node ::= "{" item* "}" R = C = "{" R<"Sub" item*>([Null | "Outer" "=" E]) "}" B = locVal("Outer", (B<"Sub" item*>([Null | "Outer" "=" E]))) M = L = S = T = Nil -- Nodes have nested environments, and affect the containing environment only through persistent (:=) bindings to ids with outer VAR (:) bindings. The items of a node are implicitly prefixed with the id Sub, which may be bound to any information intended to be common to all subnodes in a scope. item* ::= "" R = C = M = L = S = T = Nil B = E -- The empty sequence of items has no value and no effect; this is the basis for the following recursive definition. item* ::= item1 item* R = R<item1>(E) R<item*>(B<item1>(E)) B = B<item*>(B<item1>(E)) For F in {C, M, L, S, T}: F = F<item1> F<item*> -- In general, the value of a sequence of items is just the sequence of item values; binding items affect the environment of items to their right; Nil does not change the length of a result sequence. binding ::= name mode rhs R = Nil B = bind(name, mode, R<rhs>(E), E) where bind(id, mode, value, E) = CASE bindingOf(id, E) = "=" => E -- Can't rebind constants mode = ":=" => assign(id, value, E) True => [E | id mode value] bind(id "." name, mode, value, E) = [E | id bindingOf(id, E) bind(name, mode, value, valOf(id, E))] bindingOf(id, E) = locBinding(id, whereBound(id, E)) assign(id, value, E) = CASE locBinding(id, E) = ":" => [E | id ":" value] bindingOf(id, E) = ":" => bind("Outer." id, ":=", value, E) True => E -- Can only assign to vars -- This adds a single binding to E; bindings have no other "side effects" and no value. -- Each environment, E, initially contains only its "inherited" environment (bound to the id Outer). Most bindings take place directly in E. To allow for "persistent" bindings, the value of a bind(id, ":=", val, E) will change E by rebinding id in the "innermost" environment (following the chain of Outers) in which it is bound, if that binding has the binding ":" (Var). Identifiers bound with binding "=" (Const) may not be rebound in inner environments. binding ::= name mode op term = <name mode name op term> -- This is just a convenient piece of syntactic sugar for the common case of updating a binding. rhs ::= "'" item* "'" R = item* -- If the rhs of a binding is surrounded by single quotes, it will be evaluated in the environments where the name is invoked, rather than the environment in which the binding is made. rhs ::= "[|" binding* "]" R = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null] -- This creates a new environment value that may be used much like a record. rhs ::= "[" invocation "|" binding* "]" R =[B<binding*>([R<invocation>(E) | "Outer" "=" E]) | "Outer" "=" Null] -- This creates a new environment value that is an extension of an existing one. mark ::= invocation "#" R = R<invocation>(E) "#" M = invocation B = E C = L = S = T = Nil -- This gives the containing node the property denoted by the mark to which the invocation reduces. link ::= id "@!" R = id "@!" L = id B = E C = M = S = T = Nil -- This defines the scope of the set of links whose "main" component is id. -- A label N! on a node makes that node a "target" of the link N (and its prefixes); a label N@ makes it a "source." The "main" identifier of a link must be declared (using id@!) at the root of a subtree containing all its sources and targets. The link represents a set of directed arcs, one from each of its sources to each of its targets. Multiple target labels make a node the target of multiple links. A target label that appears only on a single node places it in a singleton set, i.e., identifies it uniquely. link ::= name "@" R = name "@" S = name B = E C = M = L = T = Nil -- This identifies the containing node as a "source" of the link name. link ::= name "!" R = name "!" T = prefixes(name) B = E C = M = L = S = Nil where prefixes(id) = id prefixes(name "." id) = name "." id prefixes(name) -- This identifies the containing node as a "target" of each of the links that is a prefix of name. DISCUSSION Each environment, E, initially contains only its "inherited" environment (bound to the id Outer). Most bindings take place directly in E. To allow for "persistent" bindings, the value of a bind(id, ":=", val, E) will change E by rebinding id in the "innermost" environment (following the chain of Outers) in which it is bound, if that binding has the binding ":" (Var). Identifiers bound with binding "=" (Const) may not be rebound in inner environments. If the rhs of a binding is surrounded by single quotes, it will be evaluated in the environments where the name is invoked, rather than the environment in which the binding is made. When an id is referred to and locBinding(id, E)=None, then the value is sought recursively in locVal("Outer"). The (implicit) "outermost" environment binds each id to the "universal" name $id. Nodes are delimited by brackets. The contents of each node are implicitly prefixed by Sub, which will generally be bound in the containing environment to a quoted expression performing some bindings, and perhaps supplying some labels (marks and links). Parentheses are used to delimit sequence values. Square brackets are used to delimit the argument list of an operator application and to denote environment constructors, which behave much like records. Expressions involving the four infix ops (+, -, *, /) are evaluated right-to-left (a la APL); since we expect expressions to be short, we have not imposed precedence rules. The notation for selections (conditionals) is borrowed from Algol 68: ( <test> | <true part> | <false part> ) This is consistent with our principles of using balanced brackets for compound constructions and avoiding syntactically reserved words; the true part and false part may each contain an arbitrary number of items (including none). A label N! on a node makes that node a "target" of the link N (and its prefixes); a label N@ makes it a "source." The "main" identifier of a link must be declared (using id@!) at the root of a subtree containing all its sources and targets. The link represents a set of directed arcs, one from each of its sources to each of its targets. Multiple target labels make a node the target of multiple links. A target label that appears only on a single node places it in a singleton set, i.e., identifies it uniquely. GRAMMATICAL FEATURE X SEMANTIC FUNCTION MATRIX FEATURES: FUNCTIONS: R C B M L S T term ::= primary op term + = - - - - - primary ::= literal == - - - - - invocation ::= id + - + - - - - invocation ::= name "." id + - + - - - - universal ::= "$" name == - - - - - application ::= invocation "[" item* "]" + - - - - - - selection ::= "(" term "|" item1* "|" item2* ")" + - + - - - - node ::= "{" item* "}" + = + - - - - sequence ::= "(" ( value | binding )* ")" + = + - - - - item* ::= item1 item* + + + + + + + binding ::= name mode rhs - - + - - - - rhs ::= "'" item* "'" + - - - - - - rhs ::= "[|" binding* "]" + - - - - - - rhs ::= "[" invocation "|" binding* "]" + - - - - - - mark ::= invocation "#" + - - + - - - link ::= id "@!" =- - - + - - link ::= name "@" =- - - - + - link ::= name "!" =- - - - - + - Semantic function produces Nil or E or does not apply. + Non-trivial semantic equation. =For R: passes value unchanged; for C: value same as R. ------------------- *start* 16832 00024 USt Date: 12 May 1982 4:55 pm PDT (Wednesday) From: Mitchell.PA To: Interdoc.PA Subject: Interdoc syntax and semantics Categories: Save Reply-To: Mitchell.PA ------------------- The syntax has been brought up to date and the semantics have been updated to add some new features (1) the meaning of a tag (formerly mark) includes evaluating the tag name as well so that its default bindings can be obtained simply by writing the tag (e.g., PARA$ places the tag PARA on the node and evaluates PARA%). (2) the notion of a scope (the "unit" that owns an environment) has been added. ------------------- GRAMMAR script ::= versionId node versionID ::= "Interscript/Interchange/1.0 " content ::= term | node term ::= primary | primary op term op ::= "+" | "-" | "*" | "/" primary ::= literal | invocation | indirection | application | selection | vector literal ::= Boolean | integer | intSequence | real | string | universal universal ::= ucID ( "." ucID )* name ::= id ( "." id )* invocation ::= name indirection ::= name "%" application ::= ( name | universal ) "[" scope* "]" selection ::= "(" term "|" item* "|" item* ")" vector ::= "(" scope* ")" node ::= "{" nodeItem* "}" nodeItem ::= label | scope scope ::= binding* content content* binding ::= name mode rhs mode ::= "←" | "=" | ":=" rhs ::= content | op term | "'" item* "'" | "[" [ primary ] "|" binding* "]" item ::= label | binding | content label ::= tag | link tag ::= universal "$" link ::= id "@!" | name "@" | name "!" NOTATION FOR ENVIRONMENTS Environments bind identifiers to expressions, in various modes ("=", ":", ":=", "←"): Null denotes the "empty" environment [E | id m e] means "E with id mode m bound to e" locBinding(id, E) denotes the binding mode of id in E locBinding(id, Null) = None locBinding(id, [E | id' m e]) = if id=id' then m else locBinding(id, E) locVal(id, E) denotes the value locally bound to id in E locVal(id, Null) = Nil = "" locVal(id, [E | id' m e]) = if id=id' then e else locVal(id, E) SEMANTIC FUNCTIONS R and B are intended to propagate the effects of the environment into an expression. R: expression, environment --> expression -- Reduction R is used for evaluating right-hand sides: identifiers, expressions, etc. B: expression, environment --> environment -- Bindings B indicates the effect a binding has on an environment. B and R are mutually recursive functions (e.g., the evaluation of an expression may cause some bindings to occur as well) The following five functions all apply to expressions independent of environment and are intended to be used on the result of reducing an expression in an environment. C: expression --> expression -- Contents C is basically used to indicate which evaluated expressions become part of the content of a node The following four semantic functions occur less frequently in any substantive way in the semantics below. You might wish to skip them until they occur in a nontrivial manner in the semantics. T: expression --> expression -- Tags T indicates when an identifier is to be included in the tag set for a node L: expression --> expression -- Links L indicates link declarations LF: expression --> expression -- Links From LF indicates a link to the set of nodes having associated target links LT: expression --> expression -- Links To LT indicates that the node is to be included in the target set of all the names which are prefixes of the name to which the expression should evaluate PRESENTATION BY FEATURE [E is used to represent the value of the environment in which the feature occurs.] script ::= versionId node R = C = R<node>(EXTERNAL) B=EXTERNAL T = L = LF = LT = Nil -- a script is evaluated in the pre-existing EXTERNAL environment common to all Interscript/Interchange/1.0 scripts term ::= primary op term op ::= "+" | "-" | "*" | "/" R = C = R<primary>(E) op R<term>(E) B = E T = L = LF = LT = Nil -- Both the primary and the term must reduce to numbers; the arithmetic operators are evaluated right-to-left (a la APL, without precedence) and bind less tightly than application. primary ::= literal literal ::= Boolean | integer | intSequence | real | string | universal R = C = literal B = E T = L = LF = LT = Nil -- The basic contents of a document. universal ::= ucID R = C = ucID B = E T = L = LF = LT = Nil -- universals (all upper case) are presumed to be directly meaningful, and are not looked up in the environment. universal ::= universal "." ucID R = C = universal "." ucID B = E T = L = LF = LT = Nil -- a qualified universal also just stands for itself invocation ::= id R = R<valOf(id, E)>(E) B = B<valOf(id, E)>(E) where valOf(id, E) = CASE whereBound(id, E) = Null => MakeUniversal(id) whereBound(id, E) = Nil => Nil True => locVal(id, whereBound(id, E)) and whereBound(id, E) = CASE -- Gets innermost binding locBinding(id, E) ~= None => E locBinding("Outer", E) ~= None => whereBound(id, locVal("Outer", E)) E=EXTERNAL => Null True => Nil -- Makeuniversal(id) produces the universal corresponding to id (in the current version its uppercase equivalent) -- Both attributes and definitions are looked up in the current environment; depending on the current binding of id, this may produce values and/or bindings; if the binding's rhs was quoted, the expression is evaluated at the point of invocation. -- When id is referred to and locBinding(id, E)=None, then the value is sought recursively in locVal("Outer", E). The outermost environment, EXTERNAL, binds each id to an universal which is the uppercase version of the id. Otherwise, the value of the id is assumed to be Nil invocation ::= name "." id R = R<valOf(id, R<name>(E))>(E) B = B<valOf(id, R<name>(E))>(E) -- Qualified names are treated as "nested" environments. indirection ::= name "%" R = R<valOf(id, R<name>(E))>(E) B = B<valOf(id, R<name>(E))>(E) -- Indirection combines the facility for invocation plus recording the fact that the expansion resulted from evaluating a particular name (recording the indirection is not yet included in these semantics). application ::= name "[" scope* "]" R = apply(name, R<scope*>(E), E) B = E where apply(name, value*, E) = CASE R<name>(E) OF "EQUAL" => value1 = value2 "GREATER"=> value1 > value2 . . . "SUBSCRIPT"=> value1[value2] -- value1: sequence, value2: int "CONTENTS"=> "(" C<inner(value1)> ")" "TAGS" => "(" T<inner(value1)> ")" -- ?? this doesn't seem right "LINKS" => "(" L<inner(value1)> ")" "SOURCES" => "(" LF<inner(value1)> ")" "TARGETS"=> "(" LT<inner(value1)> ")" ELSE => R<name>([[Null | "Outer" "=" E] | "Value" "=" value*]) and where inner("{" value* "}") = value* -- If the name does not evaluate to one of the standard external function names, the current environment is augmented with a binding of the value of the argument list to the identifier Value, and the value is the result of the invocation in that environment; this allows function definition within the language. selection ::= "(" term "|" nodeItem1* "|" nodeItem2* ")" R = if R<term>(E) then R<nodeItem1*>(E) else R<nodeItem2*>(E) B = if R<term>(E) then B<nodeItem1*>(E) else B<nodeItem2*>(E) -- The notation for selections (conditionals) is borrowed from Algol 68: ( <test> | <true part> | <false part> ) This is consistent with our principles of using balanced brackets for compound constructions and avoiding syntactically reserved words; the true part and false part may each contain an arbitrary number of nodeItems (including none). vector ::= "(" scope* ")" R = C = "(" R<scope*>(E) ")" B = B<scope*>(E) T = L = LF = LT = Nil -- Parentheses group a sequence of values as a single, vector value; bindings in the sequence of scopes affect the environment of scopes to the right in the containing node, but labels are disallowed. Parentheses may also be used to override the right-to-left evaluation of arithmetic operators; an operand sequence must reduce to a single numeric value. node ::= "{" nodeItem* "}" R = C = "{" R<"Sub$" nodeItem*>([Null | "Outer" "=" E]) "}" B = locVal("Outer", (B<"Sub" nodeItem*>([Null | "Outer" "=" E]))) T = L = LF = LT = Nil -- Nodes have nested environments and affect the containing environment only through global (:=) bindings. The nodeItems of a node are implicitly prefixed with the id Sub, which may be bound to any information intended to be common to all subnodes in a scope. nodeItem* ::= "" R = C = T = L = LF = LT = Nil B = E -- The empty sequence of items has no value and no effect; this is the basis for the following recursive definition. nodeItem* ::= binding* content1 content* R = R<content*>(R<content1>(B<binding*>(E)) B = B<content*>(B<content1>(B<binding*>(E)) C = C<content*>(C<content1>(B<binding*>(E)) For F in {T, L, LF, LT}: F = F<label> F<nodeItem*> nodeItem* ::= label nodeItem* R = R<label>(E) R<nodeItem*>(B<label>(E)) B = B<nodeItem*>(B<label>(E)) C = Nil For F in {T, L, LF, LT}: F = F<label> F<nodeItem*> -- In general, the value of a sequence of nodeItems is just the sequence of nodeItem values; binding items affect the environment of items to their right; Nil does not change the length of a result sequence. nodeItem* ::= scope nodeItem* For F in {R, B, C, T, L, LF, LT} F = F<scope>(E) F<nodeItem*>(B<scope>(E)) For F in {C, T, L, LF, LT}: F = F<scope> F<nodeItem*> -- In general, the value of a sequence of nodeItems is just the sequence of nodeItem values; binding items affect the environment of items to their right; Nil does not change the length of a result sequence. item* ::= "" R = C = T = L = LF = LT = Nil B = E -- The empty sequence of items has no value and no effect; this is the basis for the following recursive definition. item* ::= item1 item* R = R<item1>(E) R<item*>(B<item1>(E)) B = B<item*>(B<item1>(E)) For F in {C, T, L, LF, LT}: F = F<item1> F<item*> -- In general, the value of a sequence of items is just the sequence of item values; binding items affect the environment of items to their right; Nil does not change the length of a result sequence. binding ::= name mode rhs -- how can we change this to create micro-scopes?? R = Nil B = bind(name, mode, R<rhs>(E), E) where bind(id, mode, value, E) = CASE bindingOf(id, E) = "=" => E -- Can't rebind constants mode = ":=" => assign(id, value, E) True => [E | id mode value] bind(id "." name, mode, value, E) = [E | id bindingOf(id, E) bind(name, mode, value, valOf(id, E))] bindingOf(id, E) = locBinding(id, whereBound(id, E)) assign(id, value, E) = CASE locBinding(id, E) = ":" => [E | id ":" value] bindingOf(id, E) = ":" => bind("Outer." id, ":=", value, E) True => E -- Can only assign to vars -- This adds a single binding to E; bindings have no other "side effects" and no value. -- Each environment, E, initially contains only its "inherited" environment (bound to the id Outer). Most bindings take place directly in E. To allow for "persistent" bindings, the value of a bind(id, ":=", val, E) will change E by rebinding id in the "innermost" environment (following the chain of Outers) in which it is bound, if that binding has the binding ":" (Var). Identifiers bound with binding "=" (Const) may not be rebound in inner environments. binding ::= name mode op term R = Nil B = bind(name, mode, R<name op term>(E), E) -- This is just a convenient piece of syntactic sugar for the common case of updating a binding. rhs ::= "'" item* "'" R = item* -- If the rhs of a binding is surrounded by single quotes, it will be evaluated in the environments where the name is invoked, rather than the environment in which the binding is made. rhs ::= "[|" binding* "]" R = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null] -- This creates a new environment value that may be used much like a record. rhs ::= "[" [ item* ] "|" binding* "]" R =[B<binding*>([R<item*>(E) | "Outer" "=" E]) | "Outer" "=" Null] -- This creates a new environment value that is an extension of an existing one. tag ::= universal "$" R = R<"default" "." universal "%">(E) B = B<"default" "." universal "%">(E) C = C<"default" "." universal "%">(E) T = universal L = LF = LT = Nil -- This gives the containing node the property denoted by the tag named by the universal and also evaluates the indirection "default.universal%". link ::= id "@!" R = id "@!" B = E L = id C = T = LF = LT = Nil -- This defines the scope of the set of links whose "main" component is id. -- A label N! on a node makes that node a "target" of the link N (and its prefixes); a label N@ makes it a "source." The "main" identifier of a link must be declared (using id@!) at the root of a subtree containing all its sources and targets. The link represents a set of directed arcs, one from each of its sources to each of its targets. Multiple target labels make a node the target of multiple links. A target label that appears only on a single node places it in a singleton set, i.e., identifies it uniquely. link ::= name "@" R = name "@" -- ?? why isn't R=Nil? B = E LF = name C = T = L = LT = Nil -- This identifies the containing node as a "source" of the link name. link ::= name "!" R = name "!" -- ?? why isn't R=Nil? B = E LT = prefixes(name) C = T = L = LF = Nil where prefixes(id) = id prefixes(name "." id) = name "." id prefixes(name) -- This identifies the containing node as a "target" of each of the links that is a prefix of name. NOTES Each environment, E, initially contains only its "inherited" environment (bound to the id Outer). Most bindings take place directly in E. To allow for "persistent" bindings, the value of a bind(id, ":=", val, E) will change E by rebinding id in the "innermost" environment (following the chain of Outers) in which it is bound, if that binding has the binding ":" (Var). Identifiers bound with binding "=" (Const) may not be rebound in inner environments. If the rhs of a binding is surrounded by single quotes, it will be evaluated in the environments where the name is invoked, rather than the environment in which the binding is made. When an id is referred to and locBinding(id, E)=None, then the value is sought recursively in locVal("Outer"). The (implicit) "outermost" environment binds each id to the "universal" name formed by using the uppercase version of each character of id. Nodes are delimited by brackets. The contents of each node are implicitly prefixed by Sub, which will generally be bound in the containing environment to a quoted expression performing some bindings, and perhaps supplying some labels (tags and links). Parentheses are used to delimit sequence values. Square brackets are used to delimit the argument list of an operator application and to denote environment constructors, which behave much like records. Expressions involving the four infix ops (+, -, *, /) are evaluated right-to-left (a la APL); since we expect expressions to be short, we have not imposed precedence rules. The notation for selections (conditionals) is borrowed from Algol 68: ( <test> | <true part> | <false part> ) This is consistent with our principles of using balanced brackets for compound constructions and avoiding syntactically reserved words; the true part and false part may each contain an arbitrary number of items (including none). A label N! on a node makes that node a "target" of the link N (and its prefixes); a label N@ makes it a "source." The "main" identifier of a link must be declared (using id@!) at the root of a subtree containing all its sources and targets. The link represents a set of directed arcs, one from each of its sources to each of its targets. Multiple target labels make a node the target of multiple links. A target label that appears only on a single node places it in a singleton set, i.e., identifies it uniquely. ------------------- *start* 00361 00024 USt Date: 20-Jul-82 12:03:16 PDT (Tuesday) From: Karlton.PA Subject: Re: The Interscript grammar In-reply-to: Mitchell's message of 18 July 1982 7:42 pm PDT (Sunday) To: Mitchell cc: Karlton, Stepak I took a quick look at the grammar and I noticed rhs ::= ... | "[" [ item* ] "|" binding* "]" Why are there brackets around 'item*'? PK *start* 00722 00024 USt Date: 20 July 1982 5:58 pm PDT (Tuesday) From: Mitchell.PA Subject: Re: The Interscript grammar In-reply-to: Karlton's message of 20-Jul-82 12:03:16 PDT (Tuesday) To: Karlton cc: Mitchell, Stepak rhs ::= ... | "[" [ item* ] "|" binding* "]" The brackets indicate that the item* sequence is optional. One can write something like Times ← [font | face.style ← ROMAN size←10*pt] or, something like complexNum ← [| real←1.0 imag←2.0] In the first case, Times is initialized to the value of font (which better be a record/environment) and then the two following bindings are added. In the second, complexNum is bound to a Null environment augmented with the bindings to real and imag. Jim M. *start* 00278 00024 USt Date: 20-Jul-82 18:25:26 PDT (Tuesday) From: Karlton.PA Subject: Re: The Interscript grammar In-reply-to: Mitchell's message of 20 July 1982 5:58 pm PDT (Tuesday) To: Mitchell cc: Karlton, Stepak Yes, but how is "item*" different from "[ item* ]"? PK *start* 00510 00024 USt Date: 21 July 1982 9:51 am PDT (Wednesday) From: Mitchell.PA Subject: Re: The Interscript grammar In-reply-to: Karlton's message of 20-Jul-82 18:25:26 PDT (Tuesday) To: Karlton cc: Mitchell, Stepak The brackets around item* are part of the BNF (note that they are not in quotes) and only indicate that the item* is optional: rhs ::= ... | "[" [ item* ] "|" binding* "]" An alternate way of specifying this is rhs ::= "[" "|" binding* "]" rhs ::= "[" item* "|" binding* "]" Jim M *start* 00649 00024 USt Date: 22-Jul-82 11:50:27 PDT (Thursday) From: Karlton.PA Subject: Interscript is LALR To: Mitchell cc: Stepak, Karlton I went to Satterthwaite and he fixed up my grammar for me. The problem I was having was in my definition itemlist ::= <null> | item | <itemlist> item It turns out that this allows an infinte number of <null>s at the front of the list and is ambiguous. The trick (which I don't think I would have guessed in quite some time) is to redefine it to be itemlist ::= <null> | itemlist' itemlist' ::= item | itemlist' item Now my problem is to convert this LALR grammar into an obvious LL[1] grammar. PK *start* 01341 00024 US Date: 27-Jul-82 15:29:28 PDT (Tuesday) From: Karlton.PA Subject: Interscript grammar To: Mitchell, Horning cc: Karlton I wish to unilaterally change the grammar in minor way (just to make my life a little easier), and I wanted to run the change through you. Currently the following definitions exist string ::= "<" stringElem* ">" stringElem ::= stringChar | intSequence stringChar ::= <<any character but ">" or "#">> intSequence ::= "#" intOrHex* "#" intOrHex ::= integer | hexChar hexChar as well as literal ::= ... | intSequence | ... What I wish to do is to define a different category for a sequence of values that may exist in a string. As long as programs are creating these strings, I want to restrict the escape sequences in strings to be a sequence of hex characters. This would entail making the following changes stringElem ::= stringChar | hexSequence hexSequence ::= "#" hex* "#" hex ::= hexChar hexChar What is the value of having intSequence as a possible literal? (Of course a vector of integers has lots of uses, but that is not what I am wondering about.) It might just be the right thing to eliminate it entirely from the grammar. It just allows a redundant way to say something and adds no function. I agree that an escape sequence is important, but there is one. PK *start* 00577 00024 US Date: 27-Jul-82 16:03:23 PDT From: Horning.pa Subject: Re: Interscript grammar To: Karlton cc: Mitchell, Horning.pa Phil, I believe that section of the grammar was copied pretty slavishly from Interpress, with not too much thought about whether we needed the flexibility. Of course, Interpress uses vectors of integers for lots of things that have nothing to do with strings, and takes the position that a string is just a special case. I certainly don't have strong feelings on this point. Have you asked Bob Ayers his feelings about it? Jim H. *start* 01478 00024 US Date: 23-Jul-82 11:33:13 PDT (Friday) From: Karlton.PA Subject: Interscript TWG meeting of Friday, 16 July To: Interdoc We had a short(?) meeting at which the subcommittees reported their progress. The implementation group reported no progress on the parser, but Jim Mitchell will send a copy of the grammar to Karlton over the weekend and he will start doing working on the parser. The 820 group now has a PASCAL compiler. They have contracted with Bill Duvall to build a "crummy, dirty" editor in C that they will interface to. It was decided that interacting with Word-Star was too messy. A point was raised questioning whether SDD should continue working on an 820 based editor considering the reorganization into OSD and OPD that has recently taken place. The document modelling group reported that they decided to start freezing things so that progress can begin on pinning down some details on the semantics of those items about which there was a consensus. A request was made that someone show how bold, italic and paragraph margins would end up looking. It was explained that boldness would probably be related to some weight associated with a font and that italic would be indicated by the value assigned to the slant. Paragraph margins were a little less clear. [At this point there was a slight bit of contention about what was agreed to so far.] There was a call for example scripts, and Jean-Marie has volunteered to gather some. PK