*start*
01712 00024 USt
Date: 3 May 1982 2:39 pm PDT (Monday)
From: Horning.pa
Subject: Class "Inheritance"
To: Interdoc

During a discussion Friday afternoon, Gael Curry remarked that there were two
common kinds of inheritance in documents, but that Interdoc presently only
makes one of them easy; furthermore, he suggested a solution that I think has a
lot of appeal and fits into the Interdoc framework very nicely.

Interdoc makes it very natural for attributes to be inherited via the dominant
hierarchy (indeed, we have suggested that a strong heuristic for choosing the
dominant hierarchy is to look at the inheritance structure). This works very well
for attributes (e.g., margins) that are relevant across many levels of the hierarchy
and/or to a wide variety of node types, but not so well for attributes specific to
particular "classes" of nodes (e.g., line width and style).

The suggestion is very simple: The occurrence of T$ in a node not only adds T
to the set of tags of that node (formerly "marks"), it also includes T's value; i.e.,
it has the effect of T$T% in the current language.

Besides its simplicity, this idea has the following going for it:
-since the same name is used for both the tag and the included value, it should
greatly reduce the amount of name invention required;
-in the typical case, it will shorten the script; and
-it systematizes something that we found ourselves doing in an ad hoc way in
the examples.

The value associated with T can be established using the normal binding (and
quotation) features of Interdoc. Note that it can include bindings and/or content.
For tags defined in the standard, an initial value must be specified, even if it is
just Null.
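
[Editor's sketch: the proposal that T$ should behave like T$T% can be modelled
roughly as below. This is Python, not Interdoc; the dictionary layout, the name
apply_tag, and the sample PARA and NOTE values are all hypothetical and chosen
only for illustration.]

```python
# Hypothetical model: an environment maps tag names to values; a value is a
# dict with optional "bindings" and "content". None stands in for Null.
def apply_tag(node, tag, env):
    """Model the proposed meaning of T$: tag the node AND include T's value."""
    node["tags"].add(tag)                  # the existing effect of T$
    value = env.get(tag)                   # the added effect, i.e. T%
    if value is not None:
        node["bindings"].update(value.get("bindings", {}))
        node["content"].extend(value.get("content", []))
    return node

# Assumed sample environment: PARA carries default bindings, NOTE is just Null.
env = {
    "PARA": {"bindings": {"leftMargin": 1, "font": "Times"}, "content": []},
    "NOTE": None,
}
node = {"tags": set(), "bindings": {}, "content": []}
apply_tag(node, "PARA", env)
apply_tag(node, "NOTE", env)
```

[After both calls the node carries both tags, but only PARA has contributed
bindings, since NOTE's value is Null.]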

Jim H.

*start*
02140 00024 USt
Date: 3 May 1982 12:10 pm PDT (Monday)
From: Horning.pa
Subject: Re: Document Model - framing
In-reply-to: gcurry.ES's message of 26-Apr-82 15:55:33 PDT (Monday)
To: gcurry.ES
cc: Interdoc

[I think this has been superseded by our discussion Friday; I transmit it in its
partially-completed form just for the record. -- Jim H.]

Gael,

Your message does a pretty good job of identifying issues that must be addressed.
The following is an attempt to provide my "going in" answers (where I have
them). It is intended in the same spirit of focussing the discussion (but I will not
qualify each suggestion with all the appropriate caveats).

"InterDoc is silent on rendering":

This standard card was developed in my absence; I'm glad to see some discussion
of what it does and does not mean, since I've been a bit uncomfortable with a
strong interpretation of it.

A prime requirement of Interdoc is that it must be able to transmit ANY
content/form x structure/value information that an editor considers to be part of
a document. For example, Bravo has two kinds of line breaks: those that result
from explicit CRs, and those that result from the next word not fitting on the
current line. I assume that a Bravo-like editor would transcribe the former into a
script, but not the latter. Any conforming editor that "understands" TEXT$ must
render these CRs as line breaks. However, since the second kind of breaks are
not intrinsic to the document, and not transcribed into the script, Interdoc must
be silent on how to compute implicit line breaks.

I think we have a couple of notions (rendition and presentation) slightly
confused. Scripts are something we interchange, documents are (the user's
abstract model of) the data structure manipulated by the editor. Masters are
sufficiently tightly bound to produce a hardcopy or screen representation. I'd
prefer to use "rendering" for the script-to-document, and "presentation" for the
document-to-master (or screen, or page), translation.

\/		Script						/\
 |  Render					Transcribe	 |
\/		Document					/\
 |  Present					Infer		 |
\/		Master (display, page)			/\



*start*
01127 00024 USt
Date: 3 May 1982 2:22 pm PDT (Monday)
From: Horning.pa
Subject: Terminology
To: Interdoc

Nobody seems very pleased with Interdoc (or InterDoc) as the name for the
standard we are working on. The only suggestion I have heard that aroused some
enthusiasm was Interscript. I have been mildly negative on this because I didn't
want to change "script" and wasn't too pleased with the prospect of repeatedly
using phrases like "an Interscript script."

However, I have hit on the following idea: Call the standard Interscript, and
DEFINE script to be "the representation of an editable document in the Interscript
language." Then we would always just say "script," rather than "an Interscript
script."

Comments, objections?

We also realized on Friday that there had been some confusion about the use of
the term "rendition." "Import/Export" was suggested as a replacement for
"Render/Transcribe," but that leads to connotation problems for Mesa
programmers. Would you believe
"Recover/Discover"?
"Invoke/Evoke"?
"Impress/Express"?
"Replicate/Transcribe"?
"Construe/Transcribe"?
[I was afraid not!]

Jim H.

*start*
00550 00024 USt
Date: 3 May 1982 9:26 pm PDT (Monday)
From: Mitchell.PA
Subject: Re: Terminology
In-reply-to: Horning's message of 3 May 1982 2:22 pm PDT (Monday)
To: Horning
cc: Interdoc

I think "Interscript" is a fine replacement for "Interdoc".  I propose that we call
the process of compiling a script into some document format "internalizing" (I)
the script, the process of producing a script from an internal document format
"externalizing" (E) the document, and the composition function E(I(script))
"transcription" (T).  

Jim M.  

*start*
01047 00024 USt
Mail-from: Arpanet host SUMEX-AIM rcvd at 4-MAY-82 0810-PDT
Mail-from: SU-NET host SU-SHASTA rcvd at 4-May-82 0805-PDT
Date: Tuesday,  4 May 1982 08:07-PDT
To: Interdoc at PARC-MAXC
Subject: Re: Terminology
In-reply-to: Your message of 3 May 1982 9:26 pm PDT (Monday).
From: Brian Reid <reid@Shasta at Sumex-Aim>

"Interscript" is certainly better than "Interdoc", though for some reason I
find the "rscr" hard to say.

I am not particularly thrilled with any of the words for the
operations. "Internalizing" and "Externalizing" have the least number
of semantic side effects, but I grew up in Washington DC and those
words really trigger my Bogometer. Too many syllables, and too
obviously artificial. It's ok if we don't need to use them much or
write them in manuals much.

I always thought of the function E(I(script)) as being called "editing".
I don't understand why it needs to be called something else. Agreed
that there can exist null edits, that simply ingest something and then
spit it out again.

Hmmph. 
Brian
*start*
01364 00024 USt
Date: 4 May 1982 11:31 PDT
From: Mitchell at PARC-MAXC
Subject: Re: Terminology
In-reply-to: reid@Shasta's message of Tuesday,  4 May 1982 08:07-PDT
To: Brian Reid <reid@Shasta at Sumex-Aim>
cc: Interdoc

Brian,

I don't know what a Bogometer is, but I quote the following definitions from the
American Heritage Dictionary:

internalize tr.v. 1. to make internal

externalize tr.v. 1. To make external; give external existence to.

These may be nouns that have been verbed, but I recall them being acceptable
words when I was in elementary school participating in spelling contests (and
that's a LONG time ago!).

E(I(script)) is nothing more than that.  Let's assume that (a particular session of)
editing can be described by a function Edit: then E(Edit(I(script))) includes
editing.  I didn't include Edit in the definition of "transcribe" because the
composition script'=E(I(script)) is interesting itself if we would like to say
something about the "equivalence" of script' and script.  It seemed natural to me
to want to say what properties of a script are preserved under transcription, so
that's why I called it that.

Remember: "the best is the enemy of the good" (A. Perlis).

Jim M.

P.S. "Hmmph" is not in the American Heritage dictionary, so I assume you meant

humph interj.  Used to express doubt, displeasure, or contempt.

*start*
01110 00024 USt
Date: 4 May 1982 15:20 CDT
From: Johnston.DLOS at PARC-MAXC
Subject: Re: Terminology
In-reply-to: Mitchell's message of 4 May 1982 11:31 PDT
To: Mitchell.PA
cc: Brian Reid <reid@Shasta at Sumex-Aim>, Interdoc.pa

Jim,

Accuracy at the expense of comprehensibility results in a degradation of net
communication.  I personally don't have any particular problem with Internalize
and Externalize except that while strictly accurate they may be, they are
somewhat symbolic (obscure) also, which is a problem the entire industry
already suffers from greatly.  Of course, they may be better (more
understandable) than transcribe and render.  What about encode and decode? 
These are somewhat understood in this context already, although your ideas as to
accuracy would be welcome.  I know it doesn't suggest that the script is a
"script," but I feel they're reasonable terms for the action taking place.

Following your comments about E(I(script)), I must agree with you, particularly
since equivalences of scripts are in question.  No editing is necessarily implied in
this function.

Rick


*start*
00906 00024 USt
Mail-from: Arpanet host SUMEX-AIM rcvd at 4-MAY-82 1343-PDT
Mail-from: SU-NET host SU-SHASTA rcvd at 4-May-82 1342-PDT
Date: Tuesday,  4 May 1982 13:44-PDT
To: Interdoc.PA at PARC-MAXC
Subject: Re: Terminology
In-reply-to: Your message of 4 May 1982 15:20 CDT.
From: Brian Reid <reid@Shasta at Sumex-Aim>

Mumble. I still consider E(I(script)) to be a degenerate special case
of E(edit(I(script))), and I fail to see the need for two names, implying
two separate concepts. The important properties are:
 (1) Idempotent equality:
	 E(I(E(I(script)))) == E(I(script))
 (2) Process equivalence:
	E(I(script)) equiv script
I claim that equality is a stronger notion than equivalence, and that
both of these properties have to be met by a conforming editor.
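
[Editor's sketch: the two properties above can be checked mechanically on a toy
model. Here a "script" is just a string and the internal form is its list of
whitespace-separated tokens; this stand-in for I and E is an assumption made
for illustration and is far simpler than any real editor.]

```python
def internalize(script: str) -> list:
    """I: toy internal form, the script's whitespace-separated tokens."""
    return script.split()

def externalize(document: list) -> str:
    """E: write the internal form back out with canonical single spacing."""
    return " ".join(document)

def transcribe(script: str) -> str:
    """T = E(I(script))."""
    return externalize(internalize(script))

def equivalent(a: str, b: str) -> bool:
    """Assumed notion of equivalence: same internal form."""
    return internalize(a) == internalize(b)

script = "{ PARA$   <Hello>\n  italic←T <world> }"
once = transcribe(script)
twice = transcribe(once)
```

[In this toy, once == twice holds (property 1) and once is equivalent to, but
not textually equal to, the original script (property 2).]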

I suspect that I will have difficulty teaching my verbal reflexes to
say "Internalize" instead of "process into internal form".
*start*
02148 00024 USt
Date:  5-May-82 19:16:03 PDT (Wednesday)
From: Ayers.pa
Subject: Minutes of Interdoc Meeting of 30 April
To: Interdoc.pa

We began with a general discussion of form vs content.  Mitchell suggested that this was a spectrum, but with very little not at one end or the other; he suggested tabs and a few other format effectors as the middle.  This led into a tabs discussion and we had to back out after a bit ...

Scott suggested "Look Capitalized" as a touchstone; this proved fruitful.  Is it distinct from font choice?  People thought so.  Note that it can't be the same as a replacement of character codes, because it is a style.  After much discussion, we agreed that it is a legit look, akin to italic.

Are attributes back to "one damn thing after another"?  [This was sparked by the preceding.]  Can we parameterize?  Brian suggested that it would help paragraph attributes to be able to say "First n lines style foo%".

Note that there is another touchstone here: if you would like the paragraph "first line italic" but Interdoc doesn't support that OR your editor doesn't let you say it, then you will end up saying "first nine words italic", which may look ok now, but is really quite a different thing.

A sub-group: Brian and Scott to report back on parameterization, especially as it relates to paragraph properties.

Some worry on what ARE form and content.  Note that these two words can be applied either to things in the Interdoc SCRIPT or to things in the DOCUMENT that the script describes.  The latter is more the user's view; they are not necessarily the same.

Brian suggested that the document might be considered to contain "textual content", "structural content", "private data", and "styles".

Gael suggested that we need to come up with some names.  We agreed that a "property" was a thing like "bold ← TRUE", that a "reference to a style" was a thing like "emphasis%" in an ordinary node, and that a "style" was a thing like "emphasis := something".  It was suggested that the message system be used to suggest/vote on other words, including possible new words for "accept" and "emit" scripts.

Bob 
*start*
00972 00024 USt
Date: 13-May-82 17:40:40 PDT (Thursday)
From: Ayers.PA
Subject: Interscript Meeting Friday 14 May 9:00 Bayhill 100G
To: Interdoc (should we change the dl name?)

Nomenclature: we have embraced the following words:

  Interscript (replaces Interdoc).  And a "script" IS an Interscript script,
    so we don't have to use the latter phrase.

  Externalize and Internalize

  Tag (e.g. PARA$)

  Layer (i.e. "layer zero" and "layer one")

Last Friday's meeting discussed the document model .. perhaps prematurely.  I won't recap the (partly baked) discussion because we expect that it will be a little better baked tomorrow.

There were some comments about the rarity of persistent (:=) assignment.  We agreed that if there are just one or two particular needs for this construct [e.g. running figure numbers] then we would be better off by particularizing those few needs and eliminating persistent assignment from the general scheme of things.

Bob
*start*
01204 00024 USt
Date: 14 May 1982 12:32 pm PDT (Friday)
From: Mitchell.PA
Subject: semantics of Interscript without environments
To: Horning
cc: Mitchell

Here are the contents of my whiteboard.  Comments and questions follow them.

-------------------
X: (reduced) exp -- context

RR: x * e -> (reduced) exp	-- with node structure
R: x * e -> (reduced) exp
L: x * id -> exp		-- the Lookup function

C(R(x,e)) projects to contents
Lab(R(x,e)) projects to labels

RR(x, e) = append(x, R(x, e))
RR({item*}, e) = { item* R({item*}, e) }

R(x, Nil) = Nil
R(x, item item*) = R(RR(x, item), item*)
R(x, label) = label
R(x, {item*}) = {R(x, item*)}

R(x, id "←" rhs) = id←R(x, rhs)
R(x, id.name "←" rhs) = id←{inner(id%) name←R(x, rhs)}

R(x, literal) = literal
R(x, id) = R(x, L(x, id))

R(x, name "%") = name%

L(Nil, id) = Universal(id)
L({item* id' "←" rhs}, id) = IF id=id' THEN rhs ELSE L({item*}, id) -- eval ids!!
L({item* nonbindingItem}, id) = L({item*}, id)
L(x, id.name) = L(L(x, id), name)

L({item* name' "%"}, name) = L({item* R(item*, name')}, name)

C, Lab similarly
-------------------

I think we forgot to make L(x, id) or R(x, id) evaluate ids whose values are
quoted expressions.
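
[Editor's sketch: the first three L rules on the whiteboard amount to a
right-to-left scan of the context for the most recent binding. The Python below
models only those rules; the tuple encoding of bindings is hypothetical, the
uppercase form of Universal(id) is an assumption, and qualified names and
name% items are deliberately left out.]

```python
def universal(name):
    # Assumption: Universal(id) is modelled as the uppercase form of the id.
    return name.upper()

def lookup(context, name):
    """L(x, id) for unqualified ids.

    context is a list of items; a binding id <- rhs is modelled as the tuple
    ("bind", id, rhs) and every other item is treated as non-binding.
    """
    for item in reversed(context):          # the rightmost binding wins
        if isinstance(item, tuple) and item[0] == "bind" and item[1] == name:
            return item[2]
    return universal(name)                  # L(Nil, id) = Universal(id)

ctx = ["<text>", ("bind", "font", "Times"), ("bind", "font", "Helvetica"), "<more>"]
found = lookup(ctx, "font")
missing = lookup(ctx, "margin")
```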

*start*
02343 00024 USt
Date: 20-May-82 16:03:15 PDT (Thursday)
From: Ayers.PA
Subject: Meeting at Bayhill 100G at 9:00 21 May
To: InterDoc


Jim Mitchell and Jim Horning have been drafted for a two-three week project in CSL/Cedar.  I suggest that at this week's meeting we (whoever shows) try to work out how our various prototype implementations (820, Tajo, Star) are going to be developed and how they can lean on one another for parsers etc.


Notes from the meeting of 14 May:


We discussed the "document model" as it has been advanced so far.

In particular, we discussed the "extrinsic" vs "intrinsic" characteristics of text.  The model separates these by making the extrinsic (i.e. the metrics of the text's container) things be defined in a (say) "LAYOUTSTYLE$" node while the intrinsic (e.g. font) things remain in the (say) "PARA$".  The two nodes can be connected by both being children of a new node, say "MARKUP$".

Several people had difficulties with this.  Their main point was that the "extrinsic" things are in fact few: say the set width and possibly left margin of a paragraph.  Initial-line-indent, for instance is not a geometrical metric because it depends on the point size of the font in use.

The "document model"lers agreed to try baking this a little more.


We also discussed what it means to "understand" a tag [Editor's note: this continues from an earlier discussion sparked by Horning -- several people thought it was fair to claim to "understand", say, a PARA$ without allowing the user to see or edit some attribute of a paragraph -- after all, if the editor correctly remembers the font, whether he lets the user see or change it is an editor issue.  Horning pointed out that an editor, by that reasoning, could claim to understand most any node, just stating that it happens to display and allow edits to none of the attributes.]:

To Understand a Tag:

  (Human Implementor) 1. Knows set of relevant attributes and contents
                      2. Knows all invariants that must be maintained

  (editor) 1. Is able to provide some rendition of node.
           2. Allows insert/delete of direct subnode.

To Implement a Tag:

  (Editor) 1. Understands tag
           2. Able to render it.
           3. Able to change it.

To Fully Implement a Tag:

  (Editor) Implement all attributes and content.

*start*
04976 00024 USt
Date: 9 May 1982 12:34 am PDT (Sunday)
From: Mitchell.PA
To: Horning.PA,GCurry.ES,McGregor.PA
Subject: Precis of Friday's meeting on Document Models
Categories: Save
cc: Interdoc.PA,Mitchell.PA

Friday afternoon, Jim Horning, Gael Curry, and I met to talk about "document
models".  I think we made some progress in understanding a few things in that
discussion, so here is a summary of it.

Gael suggested that we consider the "multiple coverings" idea in more detail.  We
soon mapped this notion into the following idea:

Consider the list C of actual contents of a document obtained by enumerating the
leaves of its abstract tree in left-to-right order.  Now for every kind of hierarchy
that makes sense for a document, one can have a tree that represents that
hierarchy and whose leaves are C  (In Gael's terminology, each of these trees is
a covering).  We can use multi-colored brackets in place of the single kind used
in Interdoc (or Lisp) to linearize (sorry, Brian) these multiple (chromatic?) trees
with their common, shared leaf structure.  Wonderful generality has been
attained, so let's list the various hierarchies of a document to see how often we
get to use it.

Certainly those two old favorites Chapter-Section-Paragraph (CSP - logical
structure), and Containers-Lines (CL - geometric structure), are in the list.  How
about the various aspects of fonts (italic, bold, etc.) since the document is
structured into regions (is covered) by what is or is not italic, is or is not
bold, etc? Well, this isn't such a good set of hierarchies for two reasons:

(1)	The hierarchies are all uninteresting (they each just describe contiguous
	regions and have no more depth), which does not allow us to use them
	for scoping and thereby gain some economy of expression.  Besides, we
	can readily indicate such regions in current Interdoc without resorting to
	extra hierarchies, e.g.
		{ <...> italic←T <...> italic←F <...> }

(2)	Moreover, what scoping we would like to have for font information has a
	strong tendency to follow the CSP hierarchy anyway for stylistic reasons.
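
[Editor's sketch: point (1) says that flat toggles such as italic←T and
italic←F already mark out contiguous regions. The Python below recovers those
regions from a flat item list; the encoding of a toggle as an ("italic", bool)
tuple is hypothetical and is not Interdoc syntax.]

```python
def italic_runs(items):
    """Return (text, is_italic) pairs from a flat list of text and toggles."""
    italic = False                      # assumed default before any toggle
    runs = []
    for item in items:
        if isinstance(item, tuple):     # ("italic", True/False) toggle
            _, italic = item
        else:
            runs.append((item, italic))
    return runs

runs = italic_runs(
    ["plain", ("italic", True), "slanted", ("italic", False), "plain again"]
)
```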

So, font attributes don't seem like good candidates for being hierarchies in their
own rights.  We generated one more candidate hierarchy, "Star-like fields", but
decided that they, too, tended to follow the CSP hierarchy.  Thus we were left
with our original two hierarchies, CSP and CL.    

Then it was noticed that a good deal of the CL hierarchy actually follows the
CSP hierarchy as well, e.g., boundaries for paragraphs, frames for figures.  There
still seems to be some geometry that doesn't follow the CSP structure, e.g., page
margins.  So we are now down to 1.5 interesting colors in our chromatic tree. Not
much payoff for the complexity.

Then we discussed the notion of pages acting like frames and some of the Star
plans for having linked frames that text is formatted through (i.e., if there is text
still to be displayed when the first frame is full, continue displaying in the
second, etc.).  We found we could describe this by having a node tagged as
FRAMEDTEXT$ with an attribute that describes the frame structure and a list of
nodes to be displayed using that structure; e.g.,
	{FRAMEDTEXT$
		frames ← ([frame | h←4 w←4] [frame | h←2 w←5] ... )
		list-of-nodes-to-be-framed
	}
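
[Editor's sketch: the linked-frame behaviour described above (fill the first
frame, then continue in the second, and so on) can be modelled as below. It
assumes, purely for illustration, that a frame's h is its capacity in whole
lines; the message gives h and w without units, and w is ignored here.]

```python
def flow(lines, frame_heights):
    """Pour lines through linked frames in order.

    Returns (frames, leftover): the lines placed in each frame, plus any
    lines that did not fit in the last frame.
    """
    frames = []
    start = 0
    for h in frame_heights:
        frames.append(lines[start:start + h])
        start += h
    return frames, lines[start:]

# Two frames with assumed capacities 4 and 2, as in h←4 and h←2 above.
frames, leftover = flow([f"line {i}" for i in range(1, 8)], [4, 2])
```

[With seven lines and capacities 4 and 2, one line is left over for whatever
frame follows.]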

This seemed like such a good idea that we thought we ought to use it to describe
the entire document if it is to be viewed as having some page structure; e.g.,
	{PAGEDTEXT$
		pageLayout ← ([page | ...] [page | ...] ... )
		list-of-nodes-to-be-framed
	}

This model doesn't attempt to say where the page breaks are (just like specifying
the margins of a paragraph doesn't say where the actual line breaks are), but we
could probably handle that using links to say where all the page breaks are; e.g.,
	{PAGEDTEXT$  pb@! -- declare the link class
		pageLayout ← ([page | ...] [page | ...] ... )
		{PAGEBREAKS$ pb@} -- contents is set of places with label pb!
		{<...> pb! <...>}
			 ...
		}
	}

I am not sure that Gael and Jim H. agree, but I believe that the above says that
the notion of chromatic trees is not necessary or desirable for Interdoc and also
shows how to resolve the apparent conflict between the CSP and CL hierarchies.

This insight about document hierarchies led us to look harder at the uses for
persistent bindings (":=") in scripts, viz., chapter, section, page, figure, etc.
numbers and page headings.  We now believe that we can make the rule that all
persistent variables can be global to a document, which is a great simplifying
assumption.  There is no need to declare persistent variables; the only difference
between them and local variables is that one uses ":=" instead of "←" when
assigning them.  No local variable can have the same name as a persistent
variable (since they are essentially global), and persistent variables disappear
from environment values (records), where they have always behaved a little
peculiarly.
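
[Editor's sketch: one way to picture the rule is a single document-wide table
for ":=" variables alongside a stack of local scopes for "←" variables, with a
name allowed in only one of the two. The class and method names below are
hypothetical; this is a Python model of the rule as stated, not of any
implementation.]

```python
class Scopes:
    def __init__(self):
        self.persistent = {}        # document-global, assigned with ":="
        self.local_stack = [{}]     # innermost scope last, assigned with "←"

    def push(self):
        self.local_stack.append({})

    def pop(self):
        self.local_stack.pop()

    def assign_local(self, name, value):
        if name in self.persistent:
            raise NameError(f"{name} is already a persistent variable")
        self.local_stack[-1][name] = value

    def assign_persistent(self, name, value):
        if any(name in scope for scope in self.local_stack):
            raise NameError(f"{name} is already a local variable")
        self.persistent[name] = value

    def lookup(self, name):
        if name in self.persistent:
            return self.persistent[name]
        for scope in reversed(self.local_stack):
            if name in scope:
                return scope[name]
        return None

s = Scopes()
s.assign_persistent("figureNumber", 1)
s.push()
s.assign_local("font", "Times")
s.assign_persistent("figureNumber", s.lookup("figureNumber") + 1)
s.pop()
```

[After the inner scope is popped, font is gone but figureNumber keeps its
updated value.]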

Jim M.
*start*
01694 00024 USt
Date: 10-May-82 10:55:22 PDT (Monday)
From: Ayers.PA
To: Mitchell.PA
Subject: Re: Precis of Friday's meeting on Document Models
Categories: Save
In-Reply-To: Mitchell's message of 9 May 1982 12:34 am PDT (Sunday)
cc: Horning.PA,GCurry.ES,McGregor.PA,Interdoc.PA

I find the FRAMEDTEXT and PAGEDTEXT notions to be very intriguing.  The parallel drawn by "This model doesn't attempt to say where the page breaks are (just like specifying the margins of a paragraph doesn't say where the actual line breaks are)" is especially provocative.  I hope we hear more about this at the next meeting.

A note on the page break/line break parallel drawn in the above quote:

  The cases indeed seem very similar.  Upon reflection, I see one difference
  that I would like to hear discussed.  Case one: a galley of lines-in-paragraphs
  case two: a set of nodes-in-pages.

  In the first case, if you change margins, you do so at a paragraph end, and 
  there is a partial line in the old style before the typography switches to the
  new margins.  That is, the line-length switch does not occur at an arbitrary
  place in the text, but at a particular stylized place.

  In the second case, however, the text is broken at an arbitrary line, with no
  (user defined) typographic stylization marking the switch.

Put another way: If you have the Curry text-flows-between-linked-blocks view, in the lines-within-paragraph-galley case characters do NOT in fact flow from one line to another line OF DIFFERENT FORMATTING CHARACTERISTICS.  But in the second case, characters DO flow from one page/block to another page/block of different formatting characteristics.

Is this note germane?

Bob
*start*
02128 00024 USt
Date: 11 May 1982 12:48 pm PDT (Tuesday)
From: Horning.PA
To: Mitchell.PA,Ayers.PA
Subject: Re: Precis of Friday's meeting on Document Models
Categories: Save
In-Reply-To: Mitchell's and Ayers' messages of 10-May-82 (Monday)
cc: Interdoc.PA

I think Jim's message captured the essence of the discussion (stripped of the
interesting but irrelevant side-issues).

The thing that surprised me most was his placement of the geometric structure in
the environment. It seems to me more natural to let FRAMEDTEXT$ and
PAGEDTEXT$ each have two content nodes, one for each contained hierarchy.
The basic insight seemed to be that it is better to place these hierarchies
side-by-side (perhaps with a few links) than to attempt superimposition by
intermingling. Then an editor that knows about only one can edit it freely. (One
can easily imagine separate editors for page layout and for galley preparation.)

Of course, within any style that forces certain correspondences between the two
hierarchies (e.g., each chapter starts on a new page), the parallel structures need
only be carried up to the known common level. (E.g., each CHAPTER$ node has
a geometric and a content node. But a BOOK$ node has only a sequence of
CHAPTER$ nodes.)

Since links point to nodes, it may be necessary to create dummy nodes to be the
target of the links that tie the two structures together. We should be careful not
to confuse "clues" that avoid the need for recomputation with user-specified
information ("force a page break here").

I'm not sure I completely understood Bob's question. The widows-and-orphans
issue indicates that placement of nodes in frames should not be completely
arbitrary, while a paragraph that runs from one page to another may change
line-lengths in the process. Surely the point in both cases is that individual
editors may have varying levels of aspiration, not that Interscript should
standardize these things. By more explicitly decoupling the geometric and
content information in a document, we give high-capability editors considerable
freedom in arranging the most satisfactory merger.

Jim H.

*start*
16832 00024 USt
Date: 12 May 1982 4:55 pm PDT (Wednesday)
From: Mitchell.PA
To: Interdoc.PA
Subject: Interdoc syntax and semantics
Categories: Save
Reply-To: Mitchell.PA


-------------------

The syntax has been brought up to date and the semantics have been updated to add some new features

(1) the meaning of a tag (formerly mark) includes evaluating the tag name as well so that its default bindings can be obtained simply by writing the tag (e.g., PARA$ places the tag PARA on the node and evaluates PARA%). 

(2) the notion of a scope (the "unit" that owns  an environment) has been added. 

-------------------


GRAMMAR

script    	::= versionId node
versionId	::= "Interscript/Interchange/1.0 "
content	::= term | node
term    	::= primary | primary op term
op		::= "+" | "-" | "*" | "/"
primary	::= literal | invocation | indirection | application | selection | vector
literal    	::= Boolean | integer | intSequence | real | string | universal
universal	::= ucID ( "." ucID )*
name    	::= id ( "." id )*
invocation	::= name
indirection	::= name "%"
application	::= ( name | universal ) "[" scope* "]"
selection	::= "(" term "|" item* "|" item* ")"
vector	::= "(" scope* ")"
node    	::= "{" nodeItem* "}"
nodeItem	::= label | scope
scope		::= binding* content content*
binding	::= name mode rhs
mode    	::= "←" | "=" | ":="
rhs    	::= content | op term | "'" item* "'" | "[" [ primary ] "|" binding* "]"
item    	::= label | binding | content
label    	::= tag | link
tag    	::= universal "$"
link    	::= id "@!" | name "@" | name "!"

NOTATION FOR ENVIRONMENTS

Environments bind identifiers to expressions, in various modes ("=", ":", ":=", "←"):
     Null denotes the "empty" environment
     [E | id m e] means "E with id mode m bound to e"
     locBinding(id, E) denotes the binding mode of id in E
         locBinding(id, Null) = None
         locBinding(id, [E | id' m e]) =
                 if id=id' then m else locBinding(id, E)
     locVal(id, E) denotes the value locally bound to id in E
         locVal(id, Null) = Nil = ""
         locVal(id, [E | id' m e]) = if id=id' then e else locVal(id, E)
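
[Editor's sketch: the two recursive definitions above translate almost directly
into code. In the Python below an environment [E | id m e] is modelled as the
tuple (E, id, m, e) and Null as Python's None; that encoding and the helper
name extend are assumptions made for illustration.]

```python
NULL = None            # the "empty" environment
NO_BINDING = "None"    # the binding mode reported for an unbound id
NIL = ""               # Nil = ""

def extend(env, ident, mode, expr):
    """[E | id m e]: E with id bound to e in mode m."""
    return (env, ident, mode, expr)

def loc_binding(ident, env):
    while env is not NULL:
        outer, bound_id, mode, _ = env
        if ident == bound_id:
            return mode
        env = outer
    return NO_BINDING

def loc_val(ident, env):
    while env is not NULL:
        outer, bound_id, _, expr = env
        if ident == bound_id:
            return expr
        env = outer
    return NIL

E = extend(extend(NULL, "font", "←", "Times"), "font", "=", "Helvetica")
```

[The most recently added binding is found first, so the "=" binding of font
shadows the earlier "←" one.]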

SEMANTIC FUNCTIONS

R and B are intended to propagate the effects of the environment into an expression.

R: expression, environment --> expression                    -- Reduction
   R is used for evaluating right-hand sides: identifiers, expressions, etc.

B: expression, environment --> environment              -- Bindings
   B indicates the effect a binding has on an environment.  B and R are
   mutually recursive functions (e.g.,  the evaluation of an expression may
   cause some bindings to occur as well)

The following five functions all apply to expressions independent of environment and are intended to be used on the result of reducing an expression in an environment.

C: expression --> expression                                -- Contents
   C is basically used to indicate which evaluated expressions become part of the
   content of a node 

The following four semantic functions occur less frequently in any substantive
way in the semantics below.  You might wish to skip them until they occur in a
nontrivial manner in the semantics.

T: expression --> expression                                -- Tags
   T indicates when an identifier is to be included in the tag set for a node

L: expression --> expression                                -- Links
   L indicates link declarations

LF: expression --> expression                                -- Links From
   LF indicates a link to the set of nodes having associated target links

LT: expression --> expression                               -- Links To
   LT indicates that the node is to be included in the target set of all the names
   which are prefixes of the name to which the expression should evaluate

PRESENTATION BY FEATURE

[E is used to represent the value of the environment in which the feature
occurs.]

script ::= versionId node
     R = C = R<node>(EXTERNAL)
     B=EXTERNAL
     T = L = LF = LT = Nil
-- a script is evaluated in the pre-existing EXTERNAL environment common to all Interscript/Interchange/1.0 scripts

term ::= primary op term
op ::= "+" | "-" | "*" | "/"
     R = C = R<primary>(E) op R<term>(E)
     B = E
     T = L = LF = LT = Nil
-- Both the primary and the term must reduce to numbers; the arithmetic
operators are evaluated right-to-left (a la APL, without precedence) and bind less
tightly than application.
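
[Editor's sketch: right-to-left evaluation without precedence follows directly
from the rule term ::= primary op term. The Python below assumes the term has
already been split into a flat list of numbers and operator strings; the
function name eval_term is hypothetical.]

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_term(tokens):
    """Evaluate [primary, op, primary, op, ...] right-to-left, no precedence."""
    value = tokens[0]
    if len(tokens) == 1:
        return value
    # primary op term: the whole remainder is reduced first
    return OPS[tokens[1]](value, eval_term(tokens[2:]))

a = eval_term([2, "*", 3, "+", 4])     # 2 * (3 + 4)
b = eval_term([10, "-", 4, "-", 3])    # 10 - (4 - 3)
```

[Under conventional precedence and left-to-right grouping these would be 10 and
3; under the rule above they are 14 and 9.]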

primary ::= literal
literal ::= Boolean | integer | intSequence | real | string | universal
     R = C = literal
     B = E
     T = L = LF = LT = Nil
-- The basic contents of a document.

universal ::= ucID
     R = C = ucID
     B = E
     T = L = LF = LT = Nil
-- universals (all upper case) are presumed to be directly meaningful, and are not looked up in the environment.

universal ::= universal "." ucID
     R = C = universal "." ucID
     B = E
     T = L = LF = LT = Nil
-- a qualified universal also just stands for itself

invocation ::= id
     R = R<valOf(id, E)>(E)
     B = B<valOf(id, E)>(E)
    where
     valOf(id, E) = CASE
         whereBound(id, E) = Null	=> MakeUniversal(id)
         whereBound(id, E) = Nil		=> Nil
         True					=> locVal(id, whereBound(id, E))
    and
     whereBound(id, E) = CASE                    -- Gets innermost binding
         locBinding(id, E) ~= None	=> E
         locBinding("Outer", E) ~= None	=> whereBound(id, locVal("Outer", E))
         E=EXTERNAL			=> Null
         True					=> Nil
-- MakeUniversal(id) produces the universal corresponding to id (in the current version its uppercase equivalent)
-- Both attributes and definitions are looked up in the current environment;
depending on the current binding of id, this may produce values and/or
bindings; if the binding's rhs was quoted, the expression is evaluated at the
point of invocation.
-- When id is referred to and locBinding(id, E)=None, then the value is sought recursively in locVal("Outer", E).  The outermost environment, EXTERNAL, binds each id to a universal which is the uppercase version of the id.  Otherwise, the value of the id is assumed to be Nil.
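
[Editor's sketch: the whereBound and valOf cases can be traced with a small
Python model. Environments are modelled as dictionaries whose "Outer" entry
links to the enclosing environment, and EXTERNAL is an empty dictionary here;
both choices are assumptions for illustration, and binding modes and the
re-reduction of the value by R are omitted.]

```python
EXTERNAL = {}              # assumed empty; stands for the outermost environment
NULL = object()            # "fell off the end at EXTERNAL"
NOT_FOUND = object()       # "ran out of Outer links elsewhere" (the Nil case)
NIL = ""

def make_universal(ident):
    return ident.upper()

def where_bound(ident, env):
    """Return the innermost environment that binds ident."""
    if ident in env:
        return env
    if "Outer" in env:
        return where_bound(ident, env["Outer"])
    if env is EXTERNAL:
        return NULL
    return NOT_FOUND

def val_of(ident, env):
    where = where_bound(ident, env)
    if where is NULL:
        return make_universal(ident)
    if where is NOT_FOUND:
        return NIL
    return where[ident]

outer = {"Outer": EXTERNAL, "font": "Times"}
inner = {"Outer": outer, "size": 10}
detached = {"x": 1}        # no Outer link and not EXTERNAL
```

[Here font is found one level out, an id bound nowhere becomes its uppercase
universal, and a lookup in an environment with no route to EXTERNAL gives Nil.]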

invocation ::= name "." id 
     R = R<valOf(id, R<name>(E))>(E)
     B = B<valOf(id, R<name>(E))>(E)
-- Qualified names are treated as "nested" environments.

indirection ::= name "%" 
     R = R<valOf(id, R<name>(E))>(E)
     B = B<valOf(id, R<name>(E))>(E)
-- Indirection combines the facility for invocation plus recording the fact that the expansion resulted from evaluating a particular name (recording the indirection is not yet included in these semantics).

application ::= name "[" scope* "]"
     R = apply(name, R<scope*>(E), E)
     B = E
where
   apply(name, value*, E) =
       CASE R<name>(E) OF
          "EQUAL"	=> value1 = value2
          "GREATER"=> value1 > value2
          . . .
          "SUBSCRIPT"=> value1[value2]	-- value1: sequence, value2: int
          "CONTENTS"=> "(" C<inner(value1)> ")"
          "TAGS"	=> "(" T<inner(value1)> ")"    -- ?? this doesn't seem right
          "LINKS"	=> "(" L<inner(value1)> ")"
          "SOURCES"	=> "(" LF<inner(value1)> ")"
          "TARGETS"=> "(" LT<inner(value1)> ")"
          ELSE	=> R<name>([[Null | "Outer" "=" E] | "Value" "=" value*])
and where
   inner("{" value* "}") = value*
-- If the name does not evaluate to one of the standard external function
names, the current environment is augmented with a binding of the value of the
argument list to the identifier Value, and the value is the result of the
invocation in that environment; this allows function definition within the
language.
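
A rough Python sketch of this dispatch, covering only EQUAL and GREATER.  Everything here is an assumption made for illustration: the environment is a flat dict, a user-defined function is modeled as a Python callable that receives the augmented environment, and an unbound name evaluates to its uppercase universal.

```python
# Only two of the standard external functions are modeled here.
BUILTINS = {
    "EQUAL": lambda a, b: a == b,
    "GREATER": lambda a, b: a > b,
}


def apply(name, args, env):
    """Apply name to args: a standard function, or a definition found in env."""
    # An unbound name stands for its universal (assumed: uppercase of the id).
    target = env.get(name, name.upper())
    if isinstance(target, str) and target in BUILTINS:
        return BUILTINS[target](*args)
    if callable(target):
        # Augment the environment with Value bound to the argument list,
        # then evaluate the definition in that environment.
        return target({"Outer": env, "Value": list(args)})
    return target


# A hypothetical user-defined function that doubles each argument.
env = {"double": lambda e: [2 * v for v in e["Value"]]}
```

With this model, apply("double", [1, 2], env) runs the definition with Value bound to [1, 2], while apply("equal", [3, 3], env) goes to the built-in.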

selection ::= "(" term "|" nodeItem1* "|" nodeItem2* ")"
     R = if R<term>(E) then R<nodeItem1*>(E) else R<nodeItem2*>(E)
     B = if R<term>(E) then B<nodeItem1*>(E) else B<nodeItem2*>(E)
-- The notation for selections (conditionals) is borrowed from Algol 68:
     ( <test> | <true part> | <false part> )
This is consistent with our principles of using balanced brackets for compound
constructions and avoiding syntactically reserved words; the true part and false
part may each contain an arbitrary number of nodeItems (including none). 

vector ::= "(" scope* ")"
     R = C = "(" R<scope*>(E) ")"
     B = B<scope*>(E)
     T = L = LF = LT = Nil
-- Parentheses group a sequence of values as a single, vector value; bindings in the sequence of scopes affect the environment of scopes to the right in the containing node, but labels are disallowed.  Parentheses may also be used to override the right-to-left evaluation of arithmetic operators; an operand sequence must reduce to a single numeric value. 

node ::= "{" nodeItem* "}"
     R = C = "{" R<"Sub$" nodeItem*>([Null | "Outer" "=" E]) "}"
     B = locVal("Outer", (B<"Sub" nodeItem*>([Null | "Outer" "=" E])))
     T = L = LF = LT = Nil
-- Nodes have nested environments and affect the containing environment only
through global (:=) bindings.  The nodeItems of a node are implicitly prefixed with the id Sub, which may be bound to any information intended to be common to all subnodes in a scope.

nodeItem* ::= ""
     R = C = T = L = LF = LT = Nil
     B = E
-- The empty sequence of items has no value and no effect; this is the basis for
the following recursive definition.


nodeItem* ::= binding* content1 content* 
     R = R<content*>(R<content1>(B<binding*>(E)))
     B = B<content*>(B<content1>(B<binding*>(E)))
     C = C<content*>(C<content1>(B<binding*>(E)))
   For F in {T, L, LF, LT}:
     F = F<label> F<nodeItem*>

nodeItem* ::= label nodeItem*
     R = R<label>(E) R<nodeItem*>(B<label>(E))
     B = B<nodeItem*>(B<label>(E))
     C = Nil
   For F in {T, L, LF, LT}:
     F = F<label> F<nodeItem*>
-- In general, the value of a sequence of nodeItems is just the sequence of nodeItem values; binding items affect the environment of items to their right; Nil does not change the length of a result sequence.

nodeItem* ::= scope nodeItem*
   For F in {R, B, C, T, L, LF, LT}
     F = F<scope>(E) F<nodeItem*>(B<scope>(E))
   For F in {C, T, L, LF, LT}:
     F = F<scope> F<nodeItem*>
-- In general, the value of a sequence of nodeItems is just the sequence of nodeItem values; binding items affect the environment of items to their right; Nil does not change the length of a result sequence.

item* ::= ""
     R = C = T = L = LF = LT = Nil
     B = E
-- The empty sequence of items has no value and no effect; this is the basis for
the following recursive definition.

item* ::= item1 item*
     R = R<item1>(E) R<item*>(B<item1>(E))
     B = B<item*>(B<item1>(E))
    For F in {C, T, L, LF, LT}:
     F = F<item1> F<item*>
-- In general, the value of a sequence of items is just the sequence of item
values; binding items affect the environment of items to their right; Nil does not
change the length of a result sequence.

binding ::= name mode rhs   -- how can we change this to create micro-scopes??
     R = Nil
     B = bind(name, mode, R<rhs>(E), E)
    where
     bind(id, mode, value, E) = CASE
         bindingOf(id, E) = "="	=> E            -- Can't rebind constants
         mode = ":="			=> assign(id, value, E)
         True				=> [E | id mode value]
     bind(id "." name, mode, value, E) =
         [E | id bindingOf(id, E) bind(name, mode, value, valOf(id, E))]
     bindingOf(id, E) = locBinding(id, whereBound(id, E))
     assign(id, value, E) = CASE
         locBinding(id, E) = ":"	=> [E | id ":" value]
         bindingOf(id, E) = ":"	=> bind("Outer." id, ":=", value, E)
         True				=> E             -- Can only assign to vars
-- This adds a single binding to E; bindings have no other "side effects" and no
value.

-- Each environment, E, initially contains only its "inherited" environment
(bound to the id Outer).  Most bindings take place directly in E.  To allow for
"persistent" bindings, the value of a bind(id, ":=", val, E) will change E by
rebinding id in the "innermost" environment (following the chain of Outers) in
which it is bound, if that binding has the binding ":" (Var).  Identifiers bound
with binding "=" (Const) may not be rebound in inner environments.
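
The three binding modes can be illustrated with a small Python sketch.  It is not the functional definition given above: it mutates environments in place, ignores qualified names, and the helper names (new_env, value_of) are invented for the example.

```python
def new_env(outer=None):
    # Each environment initially contains only its inherited environment.
    return {"outer": outer, "bindings": {}}


def where_bound(ident, env):
    """Innermost environment binding ident, following the chain of outers."""
    while env is not None:
        if ident in env["bindings"]:
            return env
        env = env["outer"]
    return None


def bind(ident, mode, value, env):
    owner = where_bound(ident, env)
    current = owner["bindings"][ident][0] if owner is not None else None
    if current == "=":
        return  # can't rebind constants
    if mode == ":=":
        if current == ":":
            # Persistent binding: update the innermost Var binding in place.
            owner["bindings"][ident] = (":", value)
        return  # otherwise no effect: can only assign to vars
    env["bindings"][ident] = (mode, value)  # ordinary local binding


def value_of(ident, env):
    owner = where_bound(ident, env)
    return owner["bindings"][ident][1] if owner is not None else None


root = new_env()
bind("count", ":", 0, root)   # Var
bind("pi", "=", 3, root)      # Const
node = new_env(root)
bind("count", ":=", 1, node)  # changes root's binding, not node's
bind("pi", ":", 4, node)      # ignored: pi is a constant
```

After these calls node still has no local bindings; the assignment to count has propagated to root, and pi is unchanged.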

binding ::= name mode op term
     R = Nil
     B = bind(name, mode, R<name op term>(E), E)
-- This is just a convenient piece of syntactic sugar for the common case of
updating a binding.

rhs ::= "'" item* "'"
     R = item*
-- If the rhs of a binding is surrounded by single quotes, it will be evaluated in
the environments where the name is invoked, rather than the environment in
which the binding is made.
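
The difference between a quoted and an unquoted rhs can be shown with a small Python sketch.  The representation is assumed for illustration only: each binding is a (kind, payload) pair, and a quoted rhs is modeled as a function of the invoking environment.

```python
def invoke(ident, env):
    """Evaluate ident in env; quoted bindings are evaluated at this point."""
    kind, payload = env[ident]
    if kind == "quoted":
        return payload(env)  # evaluated where the name is invoked
    return payload           # already evaluated where the binding was made


outer = {"size": ("value", 10)}
# Unquoted: the rhs is evaluated now, while size is 10.
outer["wide"] = ("value", 2 * invoke("size", outer))
# Quoted: the rhs is kept as an expression and evaluated on each invocation.
outer["wide_q"] = ("quoted", lambda env: 2 * invoke("size", env))

# An inner environment (flattened here for brevity) that rebinds size.
inner = dict(outer, size=("value", 12))
```

Invoked from inner, wide still reflects the size in force when it was bound, whereas wide_q picks up the new size.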


rhs ::= "[|" binding* "]"
     R = [B<binding*>([Null | "Outer" "=" E]) | "Outer" "=" Null]
-- This creates a new environment value that may be used much like a record.

rhs ::= "[" [ item* ] "|" binding* "]"
     R =[B<binding*>([R<item*>(E) | "Outer" "=" E]) | "Outer" "=" Null]
-- This creates a new environment value that is an extension of an existing one.


tag ::= universal "$"
     R = R<"default" "." universal "%">(E)
     B = B<"default" "." universal "%">(E)
     C = C<"default" "." universal "%">(E)
     T = universal
     L = LF = LT = Nil
-- This gives the containing node the property denoted by the tag named by the universal and also evaluates the indirection "default.universal%".

link ::= id "@!"
     R = id "@!"
     B = E
     L = id
     C = T = LF = LT = Nil
-- This defines the scope of the set of links whose "main" component is id.

-- A label N! on a node makes that node a "target" of the link N (and its
prefixes); a label N@ makes it a "source."  The "main" identifier of a link must
be declared (using id@!) at the root of a subtree containing all its sources and
targets.  The link represents a set of directed arcs, one from each of its sources to
each of its targets.  Multiple target labels make a node the target of multiple
links.  A target label that appears only on a single node places it in a singleton
set, i.e., identifies it uniquely.

link ::= name "@"
     R = name "@"    -- ?? why isn't R=Nil?
     B = E
     LF = name
     C = T = L = LT = Nil
-- This identifies the containing node as a "source" of the link name.

link ::= name "!"
     R = name "!"    -- ?? why isn't R=Nil?
     B = E
     LT = prefixes(name)
     C = T = L = LF = Nil
    where
     prefixes(id) = id
     prefixes(name "." id) = name "." id prefixes(name)
-- This identifies the containing node as a "target" of each of the links that is a
prefix of name.
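
The prefixes function is easy to state in Python.  This sketch assumes a name is written as a dot-separated string.

```python
def prefixes(name):
    """Return name followed by each of its shorter dotted prefixes."""
    parts = name.split(".")
    return [".".join(parts[:n]) for n in range(len(parts), 0, -1)]
```

So a node labeled a.b.c! is a target of the links a.b.c, a.b, and a.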



NOTES

Each environment, E, initially contains only its "inherited" environment (bound
to the id Outer).  Most bindings take place directly in E.  To allow for
"persistent" bindings, the value of a bind(id, ":=", val, E) will change E by
rebinding id in the "innermost" environment (following the chain of Outers) in
which it is bound, if that binding has the binding ":" (Var).  Identifiers bound
with binding "=" (Const) may not be rebound in inner environments.

If the rhs of a binding is surrounded by single quotes, it will be evaluated in
the environments where the name is invoked, rather than the environment in
which the binding is made.

When an id is referred to and locBinding(id, E)=None, then the value is sought
recursively in locVal("Outer").  The (implicit) "outermost" environment binds
each id to the "universal" name formed by using the uppercase version of each character of id.

Nodes are delimited by braces.  The contents of each node are implicitly
prefixed by Sub, which will generally be bound in the containing environment
to a quoted expression performing some bindings, and perhaps supplying some
labels (tags and links).

Parentheses are used to delimit sequence values.  Square brackets are used to
delimit the argument list of an operator application and to denote environment
constructors, which behave much like records.

Expressions involving the four infix ops (+, -, *, /) are evaluated right-to-left
(a la APL); since we expect expressions to be short, we have not imposed
precedence rules.
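
A minimal Python sketch of right-to-left evaluation without precedence.  The flat token list is an assumed input format; parenthesized subexpressions are not handled.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_right_to_left(tokens):
    """tokens alternates operands and operators, e.g. [2, "*", 3, "+", 4]."""
    result = tokens[-1]
    # Walk the operators from the rightmost to the leftmost; each one takes
    # the operand on its left and everything already evaluated on its right.
    for i in range(len(tokens) - 2, 0, -2):
        result = OPS[tokens[i]](tokens[i - 1], result)
    return result
```

Under this rule 2 * 3 + 4 is 2 * (3 + 4), and 10 - 4 - 3 is 10 - (4 - 3).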

The notation for selections (conditionals) is borrowed from Algol 68:
     ( <test> | <true part> | <false part> )
This is consistent with our principles of using balanced brackets for compound
constructions and avoiding syntactically reserved words; the true part and false
part may each contain an arbitrary number of items (including none). 


A label N! on a node makes that node a "target" of the link N (and its prefixes);
a label N@ makes it a "source."  The "main" identifier of a link must be declared
(using id@!) at the root of a subtree containing all its sources and targets.  The
link represents a set of directed arcs, one from each of its sources to each of its
targets.  Multiple target labels make a node the target of multiple links.  A target
label that appears only on a single node places it in a singleton set, i.e.,
identifies it uniquely.
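
The set of arcs that a link represents can be computed as in the following Python sketch.  The input format is invented for the example: each node id maps to the link names appearing in its source (N@) and target (N!) labels, and the declaring id@! label is not modeled.

```python
def prefixes(name):
    """Return name followed by each of its shorter dotted prefixes."""
    parts = name.split(".")
    return [".".join(parts[:n]) for n in range(len(parts), 0, -1)]


def link_arcs(nodes):
    """Map each link that has sources to its set of (source, target) arcs."""
    sources, targets = {}, {}
    for node_id, labels in nodes.items():
        for name in labels.get("sources", ()):
            sources.setdefault(name, set()).add(node_id)
        for name in labels.get("targets", ()):
            # A target label N! makes the node a target of N and its prefixes.
            for link in prefixes(name):
                targets.setdefault(link, set()).add(node_id)
    return {
        link: {(s, t) for s in sources[link] for t in targets.get(link, ())}
        for link in sources
    }


nodes = {
    "n1": {"sources": ["fig"]},
    "n2": {"targets": ["fig.one"]},
    "n3": {"targets": ["fig"]},
    "n4": {"sources": ["fig.one"]},
}
arcs = link_arcs(nodes)
```

Node n2 is a target of both fig.one and fig, so the link fig has arcs from n1 to n2 and to n3, while fig.one has the single arc from n4 to n2.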

-------------------

*start*
01154 00024 USt
Date: 14 May 1982 2:58 pm PDT (Friday)
From: Mitchell.PA
To: Horning.PA
Subject: semantics of Interscript without environments
Categories: Save
cc: Mitchell.PA

Here are the contents of my whiteboard.  Comments and questions follow them.

-------------------
X: (reduced) exp -- context

RR: x * e -> (reduced) exp	-- with node structure
R: x * e -> (reduced) exp
L: x * id -> exp		-- the Lookup function

C(R(x,e)) projects to contents
Lab(R(x,e)) projects to labels

RR(x, e) = append(x, R(x, e))
RR({item*}, e) = { item* R({item*}, e) }

R(x, Nil) = Nil
R(x, item item*) = R(RR(x, item), item*)
R(x, label) = label
R(x, "'" item "'") = item
R(x, {item*}) = {R(x, item*)}

R(x, id "←" rhs) = id←R(x, rhs)
R(x, id.name "←" rhs) = id←{inner(id%) name←R(x, rhs)}

R(x, literal) = literal
R(x, id) = R(x, L(x, id))

R(x, name "%") = name%

L(Nil, id) = Universal(id)
L({item* id' "←" rhs}, id) = IF id=id' THEN rhs ELSE L({item*}, id) -- eval ids!!
L({item* nonbindingItem}, id) = L({item*}, id)
L(x, id.name) = L(L(x, id), name)

L({item* name' "%"}, name) = L({item* R(item*, name')}, name)

C, Lab similarly
-------------------

*start*
01322 00024 USt
Date: 14-May-82 17:40:56 PDT
From: Horning.PA
To: Mitchell.PA
Subject: Re: semantics of Interscript without environments
Categories: Save
cc: Horning.PA

I think that we want to call R and L something else; maybe Normalize and Value?
We can drop the curlies around a context.
There is no consistency in quoting terminals.

X: (normalized) exp			-- context

NC: x, e -> (reduced) item*      -- the Normalization into context function
N: x, e -> (reduced) exp	-- the Normalization function
V: x, id -> exp                -- the Lookup function

C(N(x, e)) projects to contents
L(N(x, e)) projects to labels

NC(item*, e) =  item* N(item*, e) 

N(x, Nil) = Nil
N(x, item item*) = N(NC(x, item), item*)
N(x, label) = label
N(x, {item*}) = {N(x, item*)}

N(x, id "←" e) = id "←" N(x, e)
N(x, id "←" "'" item* "'") = id "←" item*
N(x, id.name "←" rhs) = id "←" {inner(id "%") N(x, name "←" rhs)}

N(x, literal) = literal
N(x, id) = N(x, V(x, id))

N(x, name "%") = name "%"

V(Nil, id) = Universal(id)
V(item* id' "←" rhs, id) = IF id=id' THEN rhs ELSE V(item*, id) -- eval ids!!
V(item* {, id) = V(item*, id)			-- here's where we go to the containing scope
V(item* nonbindingItem, id) = V(item*, id)
V(x, e.id) = V(V(x, e), id)

V(item* name' "%", name) = V(item* N(item*, name'), name)

C, L similarly
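
The unqualified cases of V can be sketched in Python.  The item encoding is hypothetical (tuples tagged "bind", "open", or "label"), Universal(id) is assumed to be the uppercase form of the id, and the qualified-name and indirection cases are left out.

```python
def universal(ident):
    # Assumed: the universal for an id is its uppercase equivalent.
    return ident.upper()


def lookup(context, ident):
    """V(x, id): scan the context from its right end for a binding of id."""
    for item in reversed(context):
        if item[0] == "bind" and item[1] == ident:
            return item[2]
        # An open brace or any other nonbinding item is simply skipped,
        # which is how the search continues into the containing scope.
    return universal(ident)


context = [
    ("bind", "font", "Times"),
    ("open",),                      # the "{" of the node being normalized
    ("bind", "size", 10),
    ("label", "PARA$"),
    ("bind", "font", "Helvetica"),
]
```

The rightmost binding of font wins, a lookup that starts before the inner binding sees the outer one, and an id with no binding at all yields its universal.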
*start*
02363 00024 USt
Date: 20-May-82 16:03:15 PDT (Thursday)
From: Ayers.PA
To: Interdoc.PA
Subject: Meeting at Bayhill 100G at 9:00 21 May
Categories: Save


Jim Mitchell and Jim Horning have been drafted for a two-to-three-week project in CSL/Cedar.  I suggest that at this week's meeting we (whoever shows) try to work out how our various prototype implementations (820, Tajo, Star) are going to be developed and how they can lean on one another for parsers etc.


Notes from the meeting of 14 May:


We discussed the "document model" as it has been advanced so far.

In particular, we discussed the "extrinsic" vs "intrinsic" characteristics of text.  The model separates these by having the extrinsic things (i.e. the metrics of the text's container) be defined in a (say) "LAYOUTSTYLE$" node while the intrinsic things (e.g. font) remain in the (say) "PARA$".  The two nodes can be connected by both being children of a new node, say "MARKUP$".

Several people had difficulties with this.  Their main point was that the "extrinsic" things are in fact few: say the set width and possibly left margin of a paragraph.  Initial-line-indent, for instance, is not a geometrical metric because it depends on the point size of the font in use.

The "document model"lers agreed to try baking this a little more.


We also discussed what it means to "understand" a tag [Editor's note: this continues from an earlier discussion sparked by Horning -- several people thought it was fair to claim to "understand", say, a PARA$ without allowing the user to see or edit some attribute of a paragraph -- after all, if the editor correctly remembers the font, whether he lets the user see or change it is an editor issue.  Horning pointed out that an editor, by that reasoning, could claim to understand most any node, just stating that it happens to display and allow edits to none of the attributes.]:

To Understand a Tag:

  (Human Implementor) 1. Knows set of relevant attributes and contents
                      2. Knows all invariants that must be maintained

  (Editor) 1. Is able to provide some rendition of node.
           2. Allows insert/delete of direct subnode.

To Implement a Tag:

  (Editor) 1. Understands tag
           2. Able to render it.
           3. Able to change it.

To Fully Implement a Tag:

  (Editor) Implement all attributes and content.

*start*
01302 00024 USt
Date: 26-May-82 11:15:26 PDT (Wednesday)
From: gcurry.ES
To: Mitchell.PA
Subject: Document Modeling
Categories: Save
cc: Horning.PA,McGregor.PA,GCurry.ES

Jim,

Shall we agree to postpone the document modeling meetings until your Cedar commitments dwindle a bit?  Or do you have enough time to continue the Friday meetings?  I can continue to work on the Star Document Model (document) until you are less busy; that would be one data point.

An interesting sidelight of last week's InterDoc meeting seemed to be an agreement to consider chained text in InterScript layer 2 (Bob may remember differently).  There is a large class of documents (namely, "running" forms) which are more graphics than text, in which text flows from one layout area to another, and which are important commercial applications for Star.  

I am also thinking back to the labeled coverings model of text and remember your assertion that that model was not as powerful as the one proposed by InterScript.  By that did you mean that, for example, it is awkward to represent multiple levels of section without hierarchy in the (labeled coverings) model?  I am not resurrecting that argument again; I am only considering adopting some form of that model in a standard string format to be proposed.

Gael