*start* 01814 00024 USt
Date: 2-Aug-82 18:25:03 PDT
From: Horning.pa
Subject: Lampson on Interscript
To: Interscript
cc: Lampson, Horning.pa

This is to record a few notes of a conversation I had with Butler last week about the rate of progress on Interscript and the reasons therefor. He can correct me if I misrepresent him.

Comparing our progress with Interpress history, Butler is somewhat dubious that we will have a standard ready for adoption within a year.

Butler compared our recent strategy to tree search, depth-first, hardest-subproblem-first. He thinks we should instead be using breadth-first. I.e., complete all the parts of a standard for simple interchange, then flesh that out to deal with harder problems, like geometric model conversion.

When I explained our recent discussion on how much of the document model to formalize, Butler supported a position close to the "standardize only on the spelling of the identifiers" end of the spectrum. He recognized that this might lead to Star and Tioga giving different renderings to TopLeading. ("They're going to anyhow, regardless of what the standard says.") He did suggest one improvement: Each conforming editor's documentation must specify its interpretation of each of the standard attributes. (Its effect on display and rendition, at least.)

He is of the "round trip" school that feels it is more important for a document to come back to the original editor uncompromised than that it be well approximated by a foreign editor. He suggested that any standard approximations necessary to move from one document model to another be script-to-script transformations (e.g., converting a script with glue to one with leading or vice versa), rather than part of what the standard says about internalization and externalization.

Jim H.
*start* 00471 00024 USt
Date: 3-Aug-82 11:35:41 PDT (Tuesday)
From: deLaBeaujardiere.pa
Subject: Re: Lampson on Interscript
In-reply-to: Horning's message of 2-Aug-82 18:25:03 PDT
To: Horning
cc: Interscript, Lampson

1. "He is of the "round trip" school"... Three cheers!

2. "...any standard approximations necessary to move from one document model to another be script-to-script transformations..." Could you explain this to me a little more?

Thanks.

Jean-Marie

*start* 01133 00024 USt
Date: 7 Aug. 1982 2:12 pm PDT (Saturday)
From: Horning.pa
Subject: Re: Lampson on Interscript
In-reply-to: deLaBeaujardiere's message of 3-Aug-82 11:35:41 PDT (Tuesday)
To: deLaBeaujardiere
cc: Interscript

Jean-Marie,

"...any standard approximations"

Suppose that we have a script (perhaps written by Tioga) that has "topGlue" (a three-component vector), rather than "topLeading" (a single number). The standard approximation of topLeading in terms of topGlue is to extract the first component of the vector. The standard approximation of topGlue in terms of topLeading is to create the vector [topLeading, 0, 0].

Since these are operations on attributes appearing in scripts, we could imagine a program quite separate from both Star and Tioga that transformed between scripts most appropriate for each.

However, if Star got a script that had NOT been so transformed, it would merely preserve topGlue (since it doesn't understand it) and default topLeading (since the script doesn't specify it). Then topGlue would get back to Tioga intact (for the unmodified portions, anyhow).

Does that help?

Jim H.

*start* 01098 00024 USt
Date: 2-Aug-82 19:11:52 PDT (Monday)
From: Ayers.PA
Subject: Re: Lampson on Interscript
In-reply-to: Horning's message of 2-Aug-82 18:25:03 PDT
To: Horning
cc: Lampson, Interscript

I am getting nervous too. Particularly in the "document model" area.
I think that Butler's second and third points [use breadth-first; standardize only spelling] support the idea, floated last month, that the document model gang should produce a "straw man" proposal early on.

I've been cheered by the galley/layout and content/geometry dichotomies that have been described, but the absence of a concrete proposal to knock around [due to Cedar/WIG/vacation commitments?] continues to worry me.

My current scheduling claims to Irby and SDD [via my June and July status reports] are this:

1. Interscript is in good enough shape by year-end so it is reasonable that Star and the 820 (and Tajo and Tioga) could begin to use it, realizing that it isn't yet a standard. [Much as the 8044 used Interpress]

2. A standard on text and "document model" layout next June.

Bob

*start* 00439 00024 USt
Date: 2-Aug-82 10:21:54 PDT (Monday)
From: Ayers.PA
Subject: I need a word for ..
To: Interscript

.. the thing defined by a "{}" pair.

We used to call the {...} thing a "node" but we didn't like that. I tried calling it a document "construct" (as in "a script consists of a hierarchical arrangement of document constructs, such as paragraphs and barcharts ..") but that doesn't seem to be working well.

Ideas?

*start* 00239 00024 USt
Date: 2 Aug. 1982 11:10 am PDT (Monday)
From: Horning.pa
Subject: Re: I need a word for ..
In-reply-to: Ayers' message of 2-Aug-82 10:21:54 PDT (Monday)
To: Ayers
cc: Interscript

Unit? Block? Chunk? Scope? Group?

*start* 00370 00024 USt
Date: 2-Aug-82 10:49:50 PDT (Monday)
From: GCurry.ES
Subject: Re: I need a word for ..
In-reply-to: Ayers.PA's message of 2-Aug-82 10:21:54 PDT (Monday)
To: Ayers.PA
cc: Interscript.PA

How about "scope" for the general category "{}" and "domain" or "object" or "node" or "construct" for those scopes that correspond to document PARTS?

Gael

*start* 00786 00024 USt
Date: 3-Aug-82 11:49:51 PDT (Tuesday)
From: Ayers.PA
Subject: More Words ...
To: Interscript

In writing up the "Concepts and Facilities", I have a problem with our current usage of the word "content" wrt nodes.

We have been considering that a node has, between the "{" and the "}", several things: flags, content, and assignments, where "content" includes both raw data (e.g. "<string>") and subnodes.

This turns out to be an awkward usage of "content". I would like to define:

content: all the stuff between the "{" and the "}"

It includes:

  data: the simple mentions of variables and constants
  subnodes
  tags

[If a distinction is necessary, the above applies after abbreviations and styles are expanded out.]

Comments? Problems?

Bob

*start* 00360 00024 USt
Date: 3 Aug. 1982 3:42 pm PDT (Tuesday)
From: Horning.pa
Subject: Re: More Words ...
In-reply-to: Ayers' message of 3-Aug-82 11:49:51 PDT (Tuesday)
To: Ayers
cc: Interscript

We need words for all those things. However, "data" has a more inclusive connotation, at least if not qualified, e.g., "direct data," "immediate data."

Jim H.

*start* 01095 00024 USt
Date: 8-Aug-82 14:04:30 PDT (Sunday)
From: Ayers.pa
Subject: Let's do away with the "SUB" construct for tagging subnodes
To: Interscript

The syntax currently defines that you can assign to "SUB":

  SUB ← "TEXT$ value ← 2"

and the meaning is that, within the scope of the assignment, each "{" is effectively suffixed with the value assigned to "SUB".

This was viewed as an effective way of tagging a whole series of similar nodes without stating the tag over and over.

I claim that we can do away with this item of syntax. Reasoning:

1. It is confusing, since you have to look back at "SUB" to find out what is happening.

2. It creates this special identifier "SUB" and we don't want special things in the syntax.

3. Any usage like the above, to save characters in the script, can be achieved (almost as compactly) via abbreviation:

  z ← "TEXT$ value ← 2" ..... { z .....
with a one-letter abbreviation name, this is about as compact as the SUB usage, and avoids a special syntactic construct.

Bob

*start* 00786 00024 USt
Date: 9 Aug. 1982 10:07 am PDT (Monday)
From: Horning.pa
Subject: Re: Let's do away with the "SUB" construct for tagging subnodes
In-reply-to: Ayers' message of 8-Aug-82 14:04:30 PDT (Sunday)
To: Ayers
cc: Interscript

Bob,

I agree with all your objections. Nonetheless . . .

It would be very nice to be able to define nodes all of whose direct subnodes were of a certain form. (This is the only way I can think of not to provide a special "sugar" for a string node, each of whose elements is a character node. Relative to a character, even an abbreviation can be expensive.)

Let's think about it before discarding.

Jim H.

P.S. I am very encouraged to see someone going over the language carefully enough to be coming up with this sort of suggestion.

*start* 00585 00024 USt
Date: 9 Aug. 1982 10:11 am PDT (Monday)
From: Mitchell.PA
Subject: Re: Let's do away with the "SUB" construct for tagging subnodes
In-reply-to: Ayers' message of 8-Aug-82 14:04:30 PDT (Sunday)
To: Ayers
cc: Interscript

I believe you are probably right about SUB. It basically "came for free" in the semantics, so we put it in, but I have had problems similar to the ones you mention in that I really often wanted to place two or three things in a subnode (e.g., at the beginning and at the end, or between other items).

I vote we eliminate SUB.

Jim M.

*start* 00389 00024 USt
Date: 11-Aug-82 17:27:30 PDT (Wednesday)
From: Karlton.PA
Subject: unknown universal names
To: InterScript

What should be done with universalIds that occur in a script that are not defined by the Interscript standard? Is the script considered malformed? Should they be treated as standard ids? Should the node in which they occur be considered not understood?

PK

*start* 00373 00024 USt
Date: 11 Aug. 1982 5:59 pm PDT (Wednesday)
From: Mitchell.PA
Subject: Re: unknown universal names
In-reply-to: Karlton's message of 11-Aug-82 17:27:30 PDT (Wednesday)
To: Karlton
cc: InterScript

UniversalIds are like atoms in Lisp and are their own values (yes, ids are first-class values in InterScript). They need not be defined anywhere.

Jim

*start* 00376 00024 US
Date: 26-Aug-82 12:31:19 PDT (Thursday)
From: Karlton.PA
Subject: naming fonts
To: Interscript

A proposal to the document modelling group:

Name fonts using strings rather than universals. This would make it easier for clients to do whatever AI needs to be done to do font substitution.

  "font.style ← <TimesRoman>" not "font.style ← TIMESROMAN"

PK

*start* 02288 00024 USt
Date: 30-Aug-82 11:20:57 PDT (Monday)
From: Ayers.PA
Subject: Why "margins" must be defined as "relative" not "absolute"
To: Interscript

At the Friday meeting, Jim Mitchell and I entertained the others with a debate about relative vs absolute margins -- e.g. paragraph margins. The debate was sparked and moderated by Phil Karlton.

First a summary [mine] of the debate, and then the reasoning for why "margins" must be defined as "relative" not "absolute".

JimM: We think of margins as relative and so they should be relative. The paragraph might say something like "leftIndent ← 10", by which it means "10" relative to its parent.

BobA: But suppose that the paragraph said, instead, "globalLeft ← globalLeft + 10" This would achieve the same effect, and have the advantage that the implementor of an editor wouldn't need any "new" rules to compute the absolute left-margin-on-the-page.

JimM: But the "globalLeft ← globalLeft + 10" obscures the user's intent.
The user thinks of the margin as relative, a property sheet needs to show him the "10", and it's an AI problem to recover the "10" from some arbitrary assignment to "globalLeft".

BobA: OK, but let's keep the advantage I mentioned above by having the script say "localIndent ← 10 globalLeft ← globalLeft + localIndent" where the second assignment can be an abbreviation.

JimM: Seems convoluted and ugly. Is it saying that the second assignment is mandatory? If so, why have it present? If not, then the localIndent assignment is a fraud of sorts.

[Both: At this point we rediscovered that, if we put all the layout issues inside the "Box" node, as we've discussed, much of this discussion is simplified, since the semantics of "globalLeft ← globalLeft + localIndent" and its actual presence in the script, are concentrated in one place, and that "special casing", therefore, is less unpleasant.]

After the meeting, I drove up to Yosemite for the weekend. Somewhere around Manteca, I realized the following:

Ayers' position in the above debate, namely that assignments to the "globalLeft" are superior because they tell the editor, directly, where the absolute left edge of the paragraph is, is quite bogus. For consider the paragraph split across both columns of a two-column page.

Bob

*start* 00537 00024 USt
Date: 31-Aug-82 16:02:24 PDT (Tuesday)
From: Ayers.PA
Subject: Forward Compatibility on Node Types
To: Interscript

I propose that we allow for smooth extensibility of our standard node types.

If we name our paragraph node a PARA-0$, then we can, later, define a PARA-1$ which incorporates everything in a PARA-0$ with some new goodies. Then an editor which "understands" a PARA-1$ node can easily know that it also "understands" all PARA-0$ nodes just by considering that they are the same as a PARA-1$.

Bob

*start* 00398 00024 USt
Date: 31 Aug. 1982 5:37 pm PDT (Tuesday)
From: Mitchell.PA
Subject: Re: Forward Compatibility on Node Types
In-reply-to: Ayers' message of 31-Aug-82 16:02:24 PDT (Tuesday)
To: Ayers
cc: Interscript

If you just call them PARA0$ and PARA1$ the current language will even allow it. It sounds like a fine idea (but I immediately have visions of PARA43$. Why is that?).

Jim

*start* 00601 00024 USt
Date: 31-Aug-82 17:55:39 PDT (Tuesday)
From: Ayers.PA
Subject: Re: Forward Compatibility on Node Types
In-reply-to: Mitchell's message of 31 Aug. 1982 5:37 pm PDT (Tuesday)
To: Mitchell
cc: Interscript

Yes, I understand that the language allows it. What I was trying to suggest was that we institutionalize, via the semantics in the standard, that PARA23$ MUST define a strict superset of PARA22$. This would allow editors to implement back-compatibility without a bound-in list of the flavor "NEWPARA$ is a superset of PARA81$ is a superset of PARA$ ..."

That's all.

Bob

*start* 00802 00024 USt
Date: 31-Aug-82 17:58:04 PDT (Tuesday)
From: Karlton.PA
Subject: Re: Forward Compatibility on Node Types
In-reply-to: Mitchell's message of 31 Aug. 1982 5:37 pm PDT (Tuesday)
To: Mitchell
cc: Ayers, Interscript

Jim,

According to the grammar I got from you last

  universal ::= ucID
  ucID ::= upperCase* delimiter
  upperCase ::= hexChar | "Q" | ... | "Z"
  hexChar ::= "A" | "B" | ... | "P"

This implies that PARA0$ is not legal. (I must admit I have not seen the August document yet.)

I am somewhat confused about this discussion. Why isn't the universal interpreted in terms of the version number of the standard? If a later version adds semantics to some node type, why doesn't the editor writer fix up his understanding of that node at the time he updates his tool?
PK

*start* 01507 00024 USt
Date: 31-Aug-82 18:55:18 PDT (Tuesday)
From: Ayers.PA
Subject: Motivation: Forward Compatibility on Node Types
In-reply-to: Karlton's message of 31-Aug-82 17:58:04 PDT (Tuesday)
To: Karlton
cc: Interscript

Suppose that, after the Interscript standard is approved, we see the need to add a new semantic (say "color") to the PARA$. We have three choices:

1. Update to a new INCOMPATIBLE version (Interscript83) of the standard which has "color" in a PARA$. Old editors (that understood Interscript82) would not be able to accept any Interscript83 documents. New editors that wanted to continue to process Interscript82 would contain special compatibility code.

2. Update to a new COMPATIBLE version of the standard by introducing a new node, the PARAX$, which is like a PARA$ but with "color." Old editors can accept new scripts, but will not be able to edit PARAX$s. New editors will contain special code reminding them that they can, luckily, display and edit PARA$s with the same module that plays with PARAX$s.

3. Arrange things so we can update to a new compatible version of the standard with a PARA-1$, which is like a PARA-0$ but with "color." Old editors can accept new scripts, but will not be able to edit PARA-1$s. New editors can accept and edit both, since they "automatically" "know" that they can treat the PARA-1$ like a PARA-0$.

Arrangement three seems the most bountiful.

Bob

*start* 00847 00024 USt
Date: 8-Aug-82 13:39:52 PDT (Sunday)
From: Ayers.pa
Subject: Node Type that is Subset of more General Node's Functioning
To: Horning, Mitchell
cc: Interscript

I have a defined node type GENERAL$. I have in mind another node type, a pure subset of the capabilities of a GENERAL$.

It would be useful if I could, in a script, call out a node of this more restricted type so that it was clear that it was the pure subset; that would let anyone who could handle a GENERAL$ handle it without further investigation.

Declaring a { GENERAL$ PURESUBSET$ ...
} does something useful, but not the above. Is there anything in the syntax that would let me do the above?

[I can, of course, do it in the semantics by just stating, in the Standard, "a PURESUBSET$ node is a strict subset of the functioning of a GENERAL$ node"]

Bob

*start* 00520 00024 USt
Date: 9 Aug. 1982 10:02 am PDT (Monday)
From: Horning.pa
Subject: Re: Node Type that is Subset of more General Node's Functioning
In-reply-to: Ayers' message of 8-Aug-82 13:39:52 PDT (Sunday)
To: Ayers
cc: Interscript

Bob,

You can also do it by having PURESUBSET contain GENERAL$ in the quoted expression of its binding. If you want to do this at the level of the standard, then you probably have to state "a PURESUBSET$ node is a strict subset of the functioning of a GENERAL$ node".

Jim H.

*start* 01105 00024 USt
Date: 9-Aug-82 9:23:08 PDT (Monday)
From: stepak.pa
Subject: Page Format Info in Script
To: Interscript.pa
cc: stepak.pa
Reply-to: stepak.pa

This is a request for ideas concerning the representation of the Star page format information in a script. One alternative might be to create a new node in the script each time a page format character (PFC) is encountered in the Star document:

  {PARA$
    {CONTENT$
      <This is text>
      {PAGE$ pageHeight ← 11 pageLeftMargin ← 2 .........}
      <More text>}
    {LAYOUT$
      {STYLE$ ...}
      {BOXES$ ...} }}

An issue that we have to be concerned about is the fact that the values defined in the 'PAGE$' node do not take effect until the top of the next physical page (in Star, at least), though the node itself appears in the script where the actual PFC was. If this script was internalized by an editor that had the capability of changing the page margins within a single page we have to think of what we really want to happen. (In Star the page margins are constant for each page...), etc. etc.

Any feedback on these issues would be appreciated.

Jane

*start* 00598 00024 USt
Date: 9 Aug. 1982 10:20 am PDT (Monday)
From: Horning.pa
Subject: Re: Page Format Info in Script
In-reply-to: Your message of 9-Aug-82 9:23:08 PDT (Monday)
To: stepak
cc: Interscript

Jane,

I think you're on the right track. Star clearly needs to be able to keep this collection of attributes with the PFC, and reinternalize them from there. Star will associate the margin with the "right" page, and other editors with different interpretations of page margins will either not see the node, or will apply their own interpretation. (Cf. discussion of topLeading.)

Jim H.

*start* 02707 00024 USt
Date: 19-Aug-82 13:35:59 PDT (Thursday)
From: Karlton.PA
Subject: Reference encoding of Interscript
To: Interscript.pa

I want to make the following proposals for the reference encoding of Interscript 83. I have drawn from the grammar that Mitchell extracted from a document he was working on, the Interpress Standard and the draft of the Interscript 83 standard that Ayers is currently working on.

1) Add the token "ENDSCRIPT" to follow the last "}" of the script. This will make it easier for parsers to do their final reduction and to honor the following from Bob's draft: "Bits which might appear to 'follow' the body are not part of the script."

2) Use the low-horizontal-line character [137C, ISO 646 5/15, "←"] for "leftArrow" assignment. This is the character which prints as a left arrow in most of the Xerox fonts used by PARC and in the development environment.

3) Definition of strings is taken from the Interpress standard (section 2.7, page 25) with the obvious modification that the escape character also has to be represented inside an escape. I am including a slightly revised copy of the paragraph from that document for the benefit of those that don't have a copy.

A string is encoded by surrounding the characters with '<' and '>', thus '<Hello!>'. If the string contains an element outside the allowed range (i.e.
if the value is less than 32 or greater than 126) or the value 62 (ISO code for '>') or the value 61 (ISO code for '=', the escape character) that element must be represented as a number inside an ESCAPE, which is signalled by placing '=' (the escape character) in the string. Parsing returns to normal ONLY if a second '=' is found. Within the escape sequence is a sequence of character numbers: integers followed by a space, or radix 16 pairs. (Note that radix 16 pairs only can yield character numbers in the range [0..255].) The two-character encoding of small integers is designed to make escape sequences compact. Thus '<Hello!>', '<Hel=GMGP=!>', '<Hello=33 >' (note the space), and '<=EIGFGMGMGPCB>' are all equivalent.

One issue that I would like to see addressed: Should we restrict character numbers to be in the range [0..65535]? Probably not; but then can we say we have Interscript conforming editors if they will not handle scripts that have characters out of this range? Can we get away with restricting character numbers so that no number that has all 1's in some octet is permitted?

I plan on changing my parser to obey the rules I have proposed. I think it would be a good idea to get the reference encoding for Interscript tied down fairly soon, so if you have any counter proposals, please speak up.

PK

*start* 02751 00024 USt
Date: 11-Aug-82 12:48:57 PDT (Wednesday)
From: Karlton.PA
Subject: defining some identifiers
To: Interscript
Reply-To: Karlton

I have begun using the following names. Hopefully, the document modellers will soon be fleshing out the rest including their semantics.

  font.face.weight  (may get assigned a value from {... | NORMAL | BOLD | ... })
  font.face.slant   (may get assigned a value from {... | NORMAL | ITALIC | ...})
  margin.left       left paragraph margin (measured in micas)
  margin.right      right paragraph margin

We also need to standardize how box dimensions are measured: are they relative to box's parent's dimensions, etc?
I have chosen the following scheme:

Imagine one box contained in another. (The following diagram will look best when viewed with a fixed pitch font.) The coordinates given are those relative to some absolute plane.

(bx0, by0)                                     (bx1, by0)
+-------------------------------------------------------+
|                                                       |
|                           B                           |
|                                                       |
|     (ax0, ay0)                          (ax1, ay0)    |
|       +--------------------------------------+        |
|       |                                      |        |
|       |                                      |        |
|       |                  A                   |        |
|       |                                      |        |
|       |                                      |        |
|       +--------------------------------------+        |
|     (ax0, ay1)                          (ax1, ay1)    |
|                                                       |
|                           B                           |
|                                                       |
+-------------------------------------------------------+
(bx0, by1)                                     (bx1, by1)

Then the margins of A are

  left   = ax0 - bx0  (measured relative to the left edge of B)
  right  = ax1 - bx0  (measured relative to the left edge of B)
  top    = ay0 - by0  (measured relative to the top edge of B)
  bottom = ay1 - by0  (measured relative to the top edge of B)

The coordinates of the bottom right corner are given relative to the top left corner of the containing box rather than from the bottom right corner of the containing box or the top left corner of the box in question.

PK

*start* 00345 00024 USt
Date: 19-Aug-82 12:08:09 PDT (Thursday)
From: Karlton.PA
Subject: Reals and Integers in Interscript
To: Horning.pa, Mitchell.pa
cc: Interscript.pa

Why are the literals of type real and integer considered differently in the grammar? Is it that numbers are actually real unless they are being used inside of a string?

PK

*start* 01180 00024 USt
Date: 7-Aug-82 18:57:48 PDT (Saturday)
From: Ayers.PA
Subject: Boxes and Paragraph Looks
To: Interscript

I suggest the following:

1. We define boxes and their interaction with galley content so that we can say

   a. here is a sequence of boxes to be filled by the content
   b. fill the first, then the second, etc.
   c. if you run out of content, discard the remaining boxes
   d. if you run out of boxes, keep generating and using a copy of the last box in the list

2. We specify a paragraph's layout by specifying a paragraph-box and also a set of line-boxes. The paragraph's text content fills the line boxes according to the above arrangement.

3. Now we can specify a simple rectangular paragraph box; the first-line indent is handled naturally by just making the first line-box different.

4. Now we can specify both Bravo-style leading and Star-style line-heights in a natural fashion. Bravo-style is to have a line box which says "height = tall enough for content" and a box-to-box separation of the paragraph leading. Star-style is to have fixed-height line boxes "height = paragraph line height" and no gaps between line boxes.

Bob
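
[Editorial sketch, not part of the original message.] Rules 1a through 1d of the box proposal above amount to a small filling loop, and point 3 (first-line indent via a different first line-box) falls out of it. The Python below is only an illustration under stated assumptions: the names Box, capacity and fill_boxes are hypothetical and do not come from the Interscript drafts, and content is measured in characters for simplicity, where a real editor would measure typeset width.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical line box that holds at most `capacity` characters."""
    capacity: int


def fill_boxes(content: str, boxes: List[Box]) -> List[Tuple[Box, str]]:
    """Pour `content` into `boxes` in order, following rules a-d."""
    if not boxes:
        raise ValueError("at least one box is needed so the last one can be copied")
    if any(box.capacity <= 0 for box in boxes):
        raise ValueError("every box must be able to hold some content")

    filled: List[Tuple[Box, str]] = []
    pos = 0
    index = 0
    while pos < len(content):
        if index < len(boxes):
            # Rule b: fill the first box, then the second, and so on.
            box = boxes[index]
        else:
            # Rule d: out of boxes, so generate a copy of the last one.
            box = replace(boxes[-1])
        filled.append((box, content[pos:pos + box.capacity]))
        pos += box.capacity
        index += 1
    # Rule c: boxes never reached by the loop are discarded (not returned).
    return filled


if __name__ == "__main__":
    # A shorter first box models a first-line indent (point 3 above).
    for box, text in fill_boxes("abcdefghijklmnopqrstuvwxyz", [Box(5), Box(8)]):
        print(box.capacity, repr(text))
```

With a first box of capacity 5 and a second of capacity 8, a 26-character paragraph uses the two listed boxes plus two generated copies of the second; a 3-character paragraph uses only the first box and the second is dropped.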