*start* 00534 00024 USt
Date: 30-Dec-81 16:05:04 PST (Wednesday)
From: ayers.PA
Subject: InterDoc "Requirements" Document
To: Mitchell, Horning, McGregor, Karlton, GCurry.ES, Reid
Reply-To: Ayers
cc: Ayers, Dalal, Irby, Beeson.ES

A preliminary version of the subject document is out on (Star file system)

    McKinley > InterD Ayers > InterDoc Req'ments Spec

Comments appreciated.

Can the PARCers get Star stuff? If not, please forward the above info to Jeri-Ann <Callander> and ask for a printed copy thru the mail (bletch).

Bob

*start* 02654 00024 USt
Date: 6 Jan. 1982 10:46 pm PST (Wednesday)
From: Mitchell.PA
Subject: How to present the Interdoc transcription semantics - Version 1.0
To: Horning, Guttag, Donahue
cc: Mitchell

Here are the contents of my whiteboard after today's meeting, with extra comments thrown in as my memory dictates and as my stamina allows tonight:

Literals
    Boolean, string, int, hex int, real

{ literal }
    the simplest documents, e.g.,
        {} -- an empty document
        {NIL} -- also an empty document
        {<>} -- the empty string=NIL, so this is also empty
        {<This is the sole content of this document>}
        {TRUE}

{ literal* }
    contents in left-to-right order, e.g.,
        {<This> < is > <the> < sole > <content> < of> < this> < document>}
        {<This> < is > <the> < sole > <content> NIL <of> < this> < document> NIL} -- NILs disappear
        {TRUE TRUE TRUE}
        {2.7 3.2 7.5} -- a coordinate sequence, perhaps

{ (literal | node)* }
    nested nodes are used to represent the dominant hierarchical structure in a document, e.g.,
        { {<This>} {< is >} {<the> < sole > <content>} {< of> {< this> < document>} } }

Marks
    means of attaching one or more properties to a node
    marks provide a node's "aspects", e.g.,
        {$Sentence {$Subject $Pronoun <This>} {$Verb < is >} {$Object $NounPhrase <the> < sole > <content> {$AdjectivalPhrase < of> < this> < document>} } }
        {$Cartesian 2.7 3.2 7.5} -- a coordinate sequence

Nodes and links
    a means of denoting associations among nodes other than those implied by the dominant
hierarchy
    introduce Contents,Links,Marks,Sources,Targets functions
    Examples:
        {$Sentence theSubject@! theObject@! {$Subject $Pronoun <This> theSubject@} {$Verb theSubject! < is > theObject!} {$Object $NounPhrase <the> < sole > <content> theObject@ {$AdjectivalPhrase < of> < this> < document>} } }

Variables in the global environment and nodes with variable invocations
    there is a global, pre-existing environment with certain names defined; these variables can be named in the script to obtain their values as contents
    Examples:
        { ... <Signed this > DATE < in the county of ...> ... }

Simple bindings in the global environment (e.g., LeftMargin := 10)

Declared VARs at the root node only

Nested environments and VARs
    introduce identifier lookup rule
    introduce R, B

Decorated tree as the semantic model

Local bindings and const

Simple expressions: + * /

Conditionals

Environments as values
    qualified names, e.g., a.b ← foo
    invocation

Quoted expressions
    invocation

Node values

Function application
    built-in functions: LT, EQ, ...

*start* 01087 00024 USt
Date: 18-Jan-82 14:31:01 PST (Monday)
From: Ayers.PA
Subject: Interdoc Technical Working Group Status
To: Interdoc
Reply-To: Ayers

I apologize for the delay in getting back to you-all after the meeting of a week ago. I fell a victim of a hack attack (thanks, Brian, for that phrase) and it used up the better part of four days. The nature of the attack will be apparent from the next paragraph.

The current Interdoc "System Requirements" and "Project Description" documents, which were produced in Star (my only editor) are now available as press files on [iris] as <Ayers>starPress>*.press. These two documents are basically the same, and the "Project Description" is a little newer. I would appreciate any comments you care to make.

As I produce more Interdoc-related documents (a Project Plan, with manpower, schedules, etc. is in the works) I will put press copies in [iris]<Ayers>starPress> for your convenience.
A summary of the meeting of last Monday will follow in another message. [I've been warned against composing long Hardy messages.]

Bob

*start* 01721 00024 USt
Date: 18-Jan-82 14:55:24 PST (Monday)
From: Ayers.PA
Subject: Interdoc TWG Meeting of 11jan82 -- Main Thread
To: Interdoc
Reply-To: Ayers

Ayers began the meeting by starting to list "issues" preparatory to discussions of them. As expected, this didn't work, because a brief description of the "what is the dominant hierarchy" issue got us talking about it.

This turned out to be fine, because we reached a basic agreement that a page-oriented physical block model for the dominant hierarchy [DH] isn't appropriate for an interchange vehicle. Some of our points:

>> If you make the DH character/line/block/page, say, then all the world's editors have to know what a "line" is in the face of edits. This is unreasonable considering fonts.widths, hyphenation, etc.

>> The editing community, today, is structured around character and paragraph looks.

>> If you want to send the image-form of a document, you should use Interpress.

Gael said that his only problems with this for Star were about efficiency.

We discussed efficiency considerations a bit. It appears that accelerators [distinguish from a 'hint' -- an accelerator is a value that is redundant and can be reconstructed from the basic data, and, if present, is correct] can provide Star with the means to achieve reasonable efficiency when a Star system accepts an Interdoc script produced by another Star system.

We decided, based on the above, to go with a character/paragraph DH and explore what that means.

Action item: each member should send in (to the dl) his trial list of "marks" in the DH, and we'll see what we've got.

[A list of miscellaneous points will follow in my next message.]

Bob

*start* 01136 00024 USt
Date: 18-Jan-82 15:03:29 PST (Monday)
From: Ayers.PA
Subject: Interdoc TWG Meeting of 11jan82 -- Other Points
To: Interdoc
Reply-To: Ayers

1.
Accelerators can be pruned (when you make a change) by either a) actually deleting them or b) re-time-stamping the Interdoc script so they will be seen to be obsolete. What are the tradeoffs?

2. If you make a local change, what is the region of the script where you have to invalidate accelerators?

3. "Page styles" are hidden at a high level. Exactly how?

4. Phil reminds us that we should develop a low-level editor as well as a Star editor in order to test our standard. He suggests (and volunteers for?) an 820 text editor.

5. Do we want a "well-formedness" checker for Interdoc scripts?

6. We need unique names -- e.g. for private node names. Do we use a heirarchical naming authority a la Interpress?

7. We believe in the current [Jim&Jim] arrangements for applying character looks to characters within text .. block structure suffices, we don't need overlapping runs etc.

Additional points? Send them to me and I'll put out a collection.

Bob

*start* 00617 00024 USt
Date: 18 Jan. 1982 5:10 pm PST (Monday)
From: Horning.pa
Subject: Re: Interdoc TWG Meeting of 11jan82 -- Other Points
In-reply-to: Your message of 18-Jan-82 15:03:29 PST (Monday)
To: Ayers
cc: Interdoc

["heirarchical" should be "hierarchical"]

8. Ayers would like to see a convincing example that "styles really work."

9. We decided not to make provision for private encodings of scripts.

10. Mitchell will work on a tutorial intended to make the formal semantics more palatable.

11. The issues of "meta-properties" and subclassing reared their heads again (with no further resolution).

*start* 00286 00024 USt
Date: 19-Jan-82 11:44:15 PST (Tuesday)
From: Ayers.PA
Subject: New(er) InterDoc Project Description can be found ..
To: InterDoc

.. in the Star file drawer "InterD Ayers" on McKinley and as [iris]<Ayers>starPress>InterdocProjectDescription.press as usual.

*start* 00721 00024 USt
Date: 19 Jan. 1982 1:21 pm PST (Tuesday)
From: Horning.pa
Subject: Re: New(er) InterDoc Project Description can be found ..
In-reply-to: Ayers' message of 19-Jan-82 11:44:15 PST (Tuesday)
To: Ayers
cc: InterDoc

Bob,

Finally found my copy--filed, of course, under G. Do you suppose that you could arrange courtesy accounts on [iris] for the members of InterDoc?

Also, the text justification on Clover is TERRIBLE. I trust that this is an artifact of the "hack attack" and not characteristic of Star documents?

The content looks pretty good.

Section 1.2 "many more such InterDoc editors" -> "many more such InterDoc-compatible editors"
Section 2.2 "e.g. the" -> "e.g., the" [twice]

Jim H.

*start* 02898 00024 USt
Date: 8 Jan. 1982 3:39 pm PST (Friday)
From: Mitchell.PA
Subject: How to present the Interdoc transcription semantics - Version 1.0
To: Horning, Guttag, Donahue
cc: Mitchell

Here are the contents of my whiteboard after today's meeting, with extra comments thrown in as my memory dictates and as my stamina allows tonight:

Literals
    Boolean, string, int, hex int, real

{ literal }
    the simplest documents, e.g.,
        {} -- an empty document
        {NIL} -- also an empty document
        {<>} -- the empty string=NIL, so this is also empty
        {<This is the sole content of this document>}
        {TRUE}

{ literal* }
    contents in left-to-right order, e.g.,
        {<This> < is > <the> < sole > <content> < of> < this> < document>}
        {<This> < is > <the> < sole > <content> NIL <of> < this> < document> NIL} -- NILs disappear
        {TRUE TRUE TRUE}
        {2.7 3.2 7.5} -- a coordinate sequence, perhaps

{ (literal | node)* }
    nested nodes are used to represent the dominant hierarchical structure in a document, e.g.,
        { {<This>} {< is >} {<the> < sole > <content>} {< of> {< this> < document>} } }

Marks
    means of attaching one or more properties to a node
    marks provide a node's "aspects", e.g.,
        {$Sentence {$Subject $Pronoun <This>} {$Verb < is >} {$Object $NounPhrase <the> < sole > <content> {$AdjectivalPhrase < of> < this> < document>} } }
        {$Cartesian 2.7 3.2 7.5} -- a coordinate sequence

Nodes and links
    a means of denoting associations among nodes other than
those implied by the dominant hierarchy
    a means of marking places in a script (e.g., page boundaries, indexable items, references to figures, etc.)
    introduce Contents,Links,Marks,Sources,Targets functions
    Examples:
        {$Sentence theSubject@! theObject@! {$Subject $Pronoun <This> theSubject@} {$Verb theSubject! < is > theObject!} {$Object $NounPhrase <the> < sole > <content> theObject@ {$AdjectivalPhrase < of> < this> < document>} } }

Variables in the global environment and nodes with variable invocations
    there is a global, pre-existing environment with certain names defined; these variables can be named in the script to obtain their values as contents
    Examples:
        { ... <Signed this > DATE < in the county of ...> ... }

Simple bindings in the global environment (e.g., LeftMargin := 10)
    affects contents to its right or until LeftMargin changed again; content value is NIL

Declared VARs at the root node only
    can introduce new variables by name: initialValue

Nested environments and VARs
    introduce identifier lookup rule
    introduce R, B

Decorated tree as the semantic model

Local bindings and const

Simple expressions: + * /

Conditionals

Environments as values
    qualified names, e.g., a.b ← foo
    invocation

Quoted expressions
    invocation

Node values

Function application
    built-in functions: LT, EQ, ...

*start* 01209 00024 USt
Date: 28-Jan-82 11:19:15 PST (Thursday)
From: Ayers.PA
Subject: Paragraph Properties Discussion
To: InterDoc

Assume that we define the "para$" node which corresponds to the Star/Bravo notion of the paragraph. Now this node has associated with it many "properties". What are these properties in InterDoc?

1. The property can be a value directly associated with the para$ node e.g.
    {para$ ... hangingIndent←6 ...}

2. The property can be represented as an imbedded node e.g.
    {para$ ... {indents$ hanging←6} ...}
or even
    {para$ ...
{hangingIndent$ 6} ...}

Claim: the way things currently sit, method one requires a simple editor to "understand" indentation before it can correct spelling in the paragraph's text, while method two allows the simple editor to correct spelling without knowing about indenting.

The latter is clearly desirable, since, based on preliminary surveys, we're going to have a lot of fairly-obscure paragraph properties.

Is there a way we can preserve the straight-forwardness of method one and yet allow for simple editing? It is "clear" that most paragraph properties can be safely ignored by the simple editor. How can we best capture this fact?

Bob

*start* 00279 00024 USt
Date: 28 Jan. 1982 11:27 am PST (Thursday)
From: McGregor.PA
Subject: Re: Paragraph Properties Discussion
In-reply-to: Ayers' message of 28-Jan-82 11:19:15 PST (Thursday)
To: Ayers
cc: InterDoc

Why, this sounds like a job for meta-properties...

Scott.

*start* 01541 00024 USt
Date: 4 Feb. 1982 9:53 am PST (Thursday)
From: Horning.pa
Subject: Re: Paragraph Properties Discussion
In-reply-to: Ayers' message of 28-Jan-82 11:19:15 PST (Thursday)
To: Ayers
cc: InterDoc

Bob,

I haven't been ignoring the question, I've been thinking!

The safety rules given on page 26 of "Towards an Interchange Standard . . ." distinguish between "properties" (denoted by marks) and "attributes" (denoted by bindings). They do not require that you understand all relevant attributes to edit a node, only that you understand all its properties. Presumably they should be augmented to say that you only change the bindings of attributes that you understand.
The question of attributes becomes an issue when moving a node from one environment to another, which leads to the most complex safety rule:

    It's OK to copy a node if you understand ALL properties of its new parent, no labels are moved outside their scope, and the two environments have the same bindings for all attributes that you don't either understand, or know can't be relevant anywhere in the node or its subnodes.

Thus, I claim that a low-capability editor could edit the content using either your style 1 or style 2. What it might have to do in style 1 is check when moving a para$ node that the hangingIndent attribute wasn't changed between the two environments (or insert an explicit binding--it doesn't need to understand the meaning of the attribute to copy its value).

I don't think we need meta-attributes yet.

Jim H.

*start* 01145 00024 USt
Date: 4-Feb-82 14:19:53 PST (Thursday)
From: Ayers.PA
Subject: Re: "Attributes" vs "Properties"
In-reply-to: Horning's message of 4 Feb. 1982 9:53 am PST (Thursday)
To: Horning
cc: InterDoc

Hmmm. Let's look at this "attributes" vs "properties" issue some more. I have the feeling that I'm being led somewhere I don't want to go.

    You don't have to "understand all relevant attributes to edit a node, only ... understand all its properties"

In some sense, this DEFINES what attributes are. [The syntax does that too, of course, but I'm only speaking semantically].

Consider the "attribute" 'fogIndex' which I declare is the readability index of the text in a paragraph node. Now I can create this "attribute" in the syntax, and can assign fogIndex←17 somewhere, but it isn't 'really' an "attribute" because if an editor doesn't understand the semantics of this "attribute" it can't safely edit the text in a paragraph and maintain the invariant.

So what's wrong here?

1. You do have to 'understand' the attributes.
2. Attributes cannot be used to force invariants on node contents.
3. Something else.

Bob

*start* 00661 00024 USt
Date: 5 Feb.
1982 2:59 pm PST (Friday)
From: Horning.pa
Subject: Re: "Attributes" vs "Properties"
In-reply-to: Ayers' message of 4-Feb-82 14:19:53 PST (Thursday)
To: Ayers
cc: InterDoc

Bob,

    2. Attributes cannot be used to force invariants on node contents.

If attributes are not (very nearly) independent variables, then there is no hope of changing anything without understanding its interactions with everything else. Unless you can give me an algorithm for taking a content and rendering it so that its Fog Index is 17, then I claim that fogIndex←17 had better not be a binding of a relevant attribute (maybe of a hint?).

Jim H.

*start* 00845 00024 USt
Date: 25-Jan-82 15:06:17 PST (Monday)
From: Ayers.PA
Subject: InterDoc script vs Printing directions
To: InterDoc

Given: we have some page-formatting info in the InterDoc script. I imagine the sort of info that's attached to the Star PFC 'character.'

Claim one: That stuff should be collected in (one or more) 'PFC' Nodes in the script. It must be possible to imbed a PFC Node within a Paragraph Node.

Claim two: At some point, rendering information stops being info that belongs in an InterDoc script. To prove this, I offer a spectrum of rendering data items:

    page margins and columns
    page size (8.5x14)
    binding offsets
    three-hole-ness
    paper stock color and weight
    duplex printing
    'printedby' name

Does anyone think that ALL these things should be specify-able in an InterDoc script?

Bob

*start* 02416 00024 USt
Date: 25 Jan. 1982 4:20 pm PST (Monday)
From: Horning.pa
Subject: Technical Notes
To: Interdoc

Just a couple of thoughts since last Thursday:

1) The requirements on a "conforming" simple editor are slightly stronger than the ones Jim M. mentioned Thursday. Consider
    {Section$ {Para$ <T1>}{Fig$ xxx {Text$ <Caption>} yyy}{Para$ <T2>}}
and an editor that understood all the marks but Fig. Now it does not suffice for it to remember that "{Fig$ xxx" goes in front of the caption and "yyy}" follows it.
It must also remember that the two are related--specifically that they open and close the node in which the caption is embedded. It would not do to move the latter in front of the former, for example. Thus, even simple editors will have to "understand" nested structures.

2) The reason for preferring tree structures to more general directed graph structures is that we have a nice linear notation for trees (in fact, several!). We can resolve a few dilemmas by noting that--as long as the "fringe" is the same-- we can use a simple linear notation to impose SEVERAL tree structures on the same fringe. (There is a whole branch of graph theory devoted to "parenthesis bracket structures," but that would take us too far afield.) We merely need some way of "coloring" structural delimiters, so that looking at each one separately (and temporarily ignoring the others) gives us one of the structures. Of course, there is no reason why the various trees can't be constructed/interpreted concurrently.

Uses:

-This avoids the need to give quite such a distinguished status to the Dominant Hierarchy. There is no reason not to record BOTH the content structure and the geometric structure of a document.

-This avoids the need to create artificial nodes to represent the intersections of overlapping ranges of attributes such as "italic" and "size=12" (the sort of thing Bravo represents by run codes). In fact, we might even want to use a Polish notation for secondary hierarchies ("look italic 17" could mean that the next 17 nodes in the "look hierarchy" have the italic property).

Problems:

-This raises the complexity of the base language that even the simplest editor must be able to process.

-It provides little assistance in going between editors that each have a single Dominant Hierarchy, but not the same one.

Clearly, there's room for some more research in these areas.

Jim H.
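
The "coloring" idea in item 2 above can be sketched in a few lines. This is only an illustration under assumptions that are not in the notes: each color is given its own bracket shape, every other character counts as one fringe leaf, and the name leaf_colors is made up for the example. It is written in Python rather than in any Interdoc notation.

```python
# Hypothetical colors: one bracket shape per secondary hierarchy.
PAIRS = {"(": ")", "[": "]", "{": "}"}
OPEN_FOR = {close: open_ for open_, close in PAIRS.items()}


def leaf_colors(script):
    """Return [(leaf, colors_open_at_that_leaf)] for every fringe character.

    Each color keeps its own depth counter, so ranges of different colors
    may overlap freely; only delimiters of the SAME color must balance.
    """
    depth = {open_: 0 for open_ in PAIRS}
    leaves = []
    for ch in script:
        if ch in PAIRS:                      # opening delimiter of some color
            depth[ch] += 1
        elif ch in OPEN_FOR:                 # closing delimiter of some color
            color = OPEN_FOR[ch]
            if depth[color] == 0:
                raise ValueError(f"unmatched {ch!r}")
            depth[color] -= 1
        else:                                # a leaf of the shared fringe
            open_now = frozenset(c for c, d in depth.items() if d > 0)
            leaves.append((ch, open_now))
    unclosed = [c for c, d in depth.items() if d > 0]
    if unclosed:
        raise ValueError(f"unclosed {unclosed}")
    return leaves


if __name__ == "__main__":
    # "(" ... ")" and "[" ... "]" overlap on the leaves c and d.
    for leaf, colors in leaf_colors("a(b[cd)e]f"):
        print(leaf, sorted(colors))
```

In the script a(b[cd)e]f the two ranges overlap on c and d, yet each color is balanced when read by itself, so no artificial node is needed to represent the intersection.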
*start* 00861 00024 USt
Date: 26-Jan-82 15:44:54 PST (Tuesday)
From: Ayers.PA
Subject: Re: Technical Notes
In-reply-to: Horning's message of 25 Jan. 1982 4:20 pm PST (Monday)
To: Horning
cc: Interdoc

I missed something.

I always viewed the "content" of a node as including the nodes that were directly imbedded, tho not their content. With that view, it is clear that the non-fig-understanding editor can't modify
    {Fig$ xxx {Text$ <Caption>} yyy}
to be
    {Fig$ {Text$ <Caption>} xxx yyy}
[or whatever; I don't exactly understand your straw-man edit] since that is juggling (modifying) the content of the Fig$ even if the xxx and yyy are nodes themselves.

I've always thought that the non-fig-understanding editor could (only) edit the content of imbedded guys that it understood .. e.g. alter the original to
    {Fig$ xxx {Text$ <cat dog>} yyy}

Bob

*start* 00633 00024 USt
Date: 26-Jan-82 15:52:25 PST (Tuesday)
From: Ayers.PA
Subject: Re: Technical Notes
In-reply-to: Horning's message of 25 Jan. 1982 4:20 pm PST (Monday)
To: Horning
cc: Interdoc

Jim, in an effort to understand your "some way of "coloring" structural delimiters" approach, I created the following (a la bravo) example. It claims to represent the sentence "The quick brown fox jumped." where "quick brown" is italic and "brown fox" is bold. [Thus "brown" is bold italic.] Is it fair? (I use three shapes of paren for the 'colors.')

    {Text$ <The > (Italic$ <quick > [Bold$ <brown>) < fox> ] < jumped.> }

Bob

*start* 01390 00024 USt
Date: 26 Jan. 1982 4:06 pm PST (Tuesday)
From: Horning.pa
Subject: Re: Technical Notes
In-reply-to: Ayers' message of 26-Jan-82 15:44:54 PST (Tuesday)
To: Ayers
cc: Interdoc

Bob,

Maybe I was the one who missed something. What I thought Jim Mitchell was suggesting Thursday was:

1) Obviously, any conforming editor must be able to keep around unchanged some chunks of information that it doesn't understand.
2) For the sanity of the human doing the edit, it might be a good idea to indicate such chunks embedded in text that he can edit by some kind of smudge.

3) As long as these chunks get moved around with their smudges, the human can move things around pretty freely without worrying too much about the consequences.

In my example, the human might see
    T1 Caption T2
Now, it's OK for him to perform edits that lead to
    T1abc Caption T2def
or
    T1 Dopgion T2
But we might expect trouble from
    tion T2 T1 Cap

I agree that the rules about what nodes can and cannot safely be edited cover this case; what I had hoped to come out of the rules Jim was suggesting was some way that a dumb editor didn't need to know anything but "leave the chunks alone, dummy." But I now think it has to know about at least the node structure in the chunks, and ensure that (the net effect of) a series of edits doesn't violate it.
    T1 {{ Caption }} T2

Jim H.

*start* 00282 00024 USt
Date: 26 Jan. 1982 4:07 pm PST (Tuesday)
From: Horning.pa
Subject: Re: Technical Notes - "coloring"
In-reply-to: Ayers' message of 26-Jan-82 15:52:25 PST (Tuesday)
To: Ayers
cc: Interdoc

Bob,

That's the idea. I don't guarantee that it's a winner.

Jim H.
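
The conclusion in the 4:06 pm message above, that even a dumb editor has to respect the node structure inside the chunks it carries along, amounts to a balance check over the chunks in their edited order. A minimal Python sketch follows, assuming that chunks are kept as plain strings, that only the braces { and } matter, and that no braces occur inside string literals within a chunk; chunks_still_nest is a hypothetical name, not part of any Interdoc proposal.

```python
def chunks_still_nest(chunks):
    """Check that opaque chunks, in their edited order, still nest properly.

    `chunks` is the list of not-understood chunk strings as they appear,
    left to right, after a series of edits. The editable text between
    them is irrelevant to the check and is simply left out.
    """
    depth = 0
    for chunk in chunks:
        for ch in chunk:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth < 0:        # a node is closed before it was opened
                    return False
    return depth == 0                # every opened node was closed


if __name__ == "__main__":
    opening = "{Fig$ xxx {Text$ "    # the chunk in front of the caption
    closing = "} yyy}"               # the chunk that follows the caption
    print(chunks_still_nest([opening, closing]))   # chunks left in order
    print(chunks_still_nest([closing, opening]))   # closing chunk moved first
```

Only the net effect of a series of edits needs to pass the check, so an editor could run it once when the script is written back out rather than after every keystroke.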