*start* 00554 00024 USt
Date: 1-Feb-82 14:35:52 PST (Monday)
From: Ayers.PA
Subject: Codification of InterDoc "level zero" and "level one"
To: Mitchell, Horning
Reply-To: Ayers
cc: Ayers

How do you feel about the state of the basic InterDoc syntax?

My current feeling is that if I just let things happen, the TWG will get a refined document from you that is more-or-less suitable to be copied into a chapter of a full InterDoc proposal. Is this so?

Or would you-all appreciate goodly amounts of feedback, discussion, and documentation support?

Bob

*start* 00661 00024 USt
Date: 2 Feb. 1982 10:23 am PST (Tuesday)
From: Horning.pa
Subject: Re: Codification of InterDoc "level zero" and "level one"
In-reply-to: Your message of 1-Feb-82 14:35:52 PST (Monday)
To: Ayers
cc: Horning, Mitchell

Bob,

I think that there are several rough edges in the syntax, that we deliberately didn't put much time into polishing, on the grounds that they could be polished indefinitely. I expect that the attempt to produce a more effective exposition of the semantics will lead to a certain amount of conceptual clarification.

I.e., it's not too late for feedback, but if none comes, the process will still proceed.

Jim H.

*start* 02210 00024 USt
Date: 8 Feb. 1982 9:11 am PST (Monday)
From: Mitchell.PA
Subject: Re: Paragraph Properties Discussion
In-reply-to: Ayers' message of 28-Jan-82 11:19:15 PST (Thursday)
To: Ayers
cc: InterDoc

I have been ignoring this discussion for the past week because of other responsibilities, but now I can jump back in. This message comes in two parts. The first comments on your two proposals, and the second tries to answer the question you raised.

A. Your proposals are

1. The property can be a value directly associated with the para$ node e.g. {para$ ... hangingIndent_6 ...}, or

2. The property can be represented as an imbedded node e.g. {para$ ... {indents$ hanging_6} ...} or even {para$ ...
{hangingIndent$ 6} ...}

I don't think (2) is a reasonable script because it attempts to associate an attribute with a node above it in the abstract tree (even if you did it this way, "hanging" would have to be declared at the para$ level as, e.g., hanging:0 and then be modified in the indents$ subnode as hanging:=6). I believe that (1) is the method of choice.

B. How to avoid having a simple editor having to understand indentation before being allowed to alter the text of the paragraph.

First, I agree with Jim H. that the current safety rules specify only that an editor must understand all the marks on a node to be able to edit it safely, not that it has to deal with all the relevant attributes in the best possible way in order to do so.

Secondly, it seems to me that a paragraph's relevant attributes ought to specify only those things that have to do with the paragraph as a container (e.g., margins, leading, keeps, etc.) or with node-wide properties of the contents (e.g., default character attributes or tab settings). Then altering the textual contents is independent of understanding, i.e., implementing, these attributes in all their glory.

This sort of separation between form and content is important. It has been my unstated goal to put information about form in attributes and about content in content (I hope that last phrase is not without content). The glue that holds form and content together is the mark(s) on a node.

Comments appreciated.

Jim M.

*start* 00484 00024 USt
Date: 8 Feb. 1982 9:21 am PST (Monday)
From: Mitchell.PA
Subject: Re: InterDoc script vs Printing directions
In-reply-to: Ayers' message of 25-Jan-82 15:06:17 PST (Monday)
To: Ayers
cc: InterDoc

At first blush I don't see why all your suggestions of rendering information should not be specifiable in an Interdoc script. Whether or not we want to bind such things as 'printedby' name that early is a separate question (but I hope the answer is no).

Jim M.
*start* 00966 00024 USt
Date: 8-Feb-82 15:43:32 PST (Monday)
From: Ayers.PA
Subject: Re: Paragraph Properties Discussion
In-reply-to: Mitchell's message of 8 Feb. 1982 9:11 am PST (Monday)
To: Mitchell
cc: InterDoc

You say that

"it seems to me that a paragraph's relevant attributes ought to specify only those things that have to do with the paragraph as a container (e.g., margins, leading, keeps, etc.) or with node-wide properties of the contents (e.g., default character attributes or tab settings). Then altering the textual contents is independent of understanding, i.e., implementing, these attributes in all their glory."

And I totally agree with your statement.

But (.. there's always a 'but')

Phrases like "things that have to do with the paragraph as a container" are all perfectly reasonable, but they certainly are not precise in any computer-science sense.

Can I get Mitchell/Horning to say what an attribute is in computer-science-ese?

Bob

*start* 00993 00024 USt
Date: 15-Feb-82 18:54:04 PST (Monday)
From: Ayers.PA
Subject: Re: First cut at defining some marks and attributes
In-reply-to: Mitchell's message of 15 Feb. 1982 6:04 pm PST (Monday)
To: Mitchell
cc: Interdoc

I appreciate your jumping in with a first cut. Let's all keep the ball rolling.

Two immediate-feedback comments:

1. I do not understand the need for "relative" margins. I thought that we could say "margin.left _ margin.left + 10" and that this facility in the language was precisely suited for things like indents.

2. I don't follow the tab design. It is unclear to me what the 'records' are. The simple view would be that there is a sequence of tab-records within the paragraph [that is attributes bound at this time] and that the presence of an ASCII "9" imbedded in text [e.g. 97 98 9 99 100] triggers the tabbing action. Is this what you intend? (I have been wondering whether the model of the ASCII "9" within the text is appropriate or not.)

*start* 03100 00024 USt
Date: 16 Feb.
1982 10:42 am PST (Tuesday)
From: Mitchell.PA
Subject: Re: First cut at defining some marks and attributes
In-reply-to: Ayers' message of 15-Feb-82 18:54:04 PST (Monday)
To: Ayers
cc: Mitchell, Interdoc

1. I do not understand the need for "relative" margins. I thought that we could say "margin.left _ margin.left + 10" and that this facility in the language was precisely suited for things like indents.

Ans: You are right that we can say something like margins.left_+10, which would increment it by 10 units. The reason for making "relative" (oh for a font or two more in Laurel!) a part of margins is to allow an editor such as Star, which does present them to users as relative, to make it explicit that they be kept that way. We could specify margins.relative as being False if it is not even a component of the margins value, thus making the absolute case somewhat simpler, but I expect that every conforming editor would still need to take care of the case when it is defined, so nothing would be gained.

2. I don't follow the tab design. It is unclear to me what the 'records' are. The simple view would be that there is a sequence of tab-records within the paragraph [that is attributes bound at this time] and that the presence of an ASCII "9" imbedded in text [e.g. 97 98 9 99 100] triggers the tabbing action. Is this what you intend? (I have been wondering whether the model of the ASCII "9" within the text is appropriate or not.)

Ans: A record is a value which is a self-contained environment. By "self-contained" I mean that its Outer component has the value NIL, unlike the environments associated with nodes in the dominant hierarchy where Outer is (semantically at least) a copy of the environment of a node's parent. It is Outer that provides the means whereby an identifier that has no binding in the environment of a given node, N, is looked up successively in the environments attached to the nodes on the path to the root of the tree from N.
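[Editorial illustration, not part of the original message.] The lookup rule just described can be sketched in a few lines of Python. This is only a rough model under stated assumptions: the class name Environment, the method lookup, and the attribute names leftMargin and leading are all hypothetical and are not Interdoc syntax. It shows an identifier being resolved along the Outer chain, and a record as an environment whose Outer is NIL (None here).

```python
from typing import Any, Dict, Optional


class Environment:
    """Hypothetical model: a set of bindings plus an optional Outer environment."""

    def __init__(self, bindings: Dict[str, Any],
                 outer: Optional["Environment"] = None) -> None:
        self.bindings = dict(bindings)
        self.outer = outer  # None plays the role of Outer = NIL

    def lookup(self, name: str) -> Any:
        # Try this environment first, then each Outer in turn, up to the root.
        env: Optional[Environment] = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.outer
        raise KeyError(name)


# Environment of the root node and of a paragraph node beneath it.
root = Environment({"leftMargin": 0, "leading": 2})
para = Environment({"leftMargin": 10}, outer=root)

# A record is self-contained: it has no Outer, so nothing is inherited.
record = Environment({"position": 10})
```

In this model para.lookup("leading") is answered by the root environment, while record.lookup("leading") fails, because a record never consults the environments of enclosing nodes.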
Interdoc uses environment values as Mesa does records. You can think of a definition such as

    tab = [ | position_0 alignment_FlushLeft -- IN (FlushLeft Centered FlushRight Decimal) ]

as the definition of a prototype, which can be used in binding other values as a starting point; e.g.,

    tab1 _ [tab| position_10]

binds an environment value to tab1; that value has two components, "position" with value 10 and "alignment" with value FlushLeft. Thus, one could bind to a component, ParaTabs, a sequence of such values, e.g.,

    ParaTabs_([tab|position_10] [tab|position_20] [tab|position_30]).

I don't have an opinion as to how one should indicate where tabs go in some text. Your convention seems fine to me and consistent with how lower-capability editors most probably view them. We should probably make this specific by discussing it somewhere in association with TEXT.

By the way, my strawman design doesn't provide for the kind of tabs that are just spaced at equal intervals, nor for the kind that behave as if they were just multiple spaces.

Jim M.

*start* 01568 00024 USt
Date: 16 Feb. 1982 4:42 pm PST (Tuesday)
From: Horning.pa
Subject: Re: First cut at defining some marks and attributes
In-reply-to: Mitchell's message of 16 Feb. 1982 10:42 am PST (Tuesday)
To: Mitchell
cc: Interdoc

Jim,

Most of your suggestions looked pretty reasonable to me.

I may be even more confused than Bob about "relative." I had much the same reaction he did, and your clarification didn't help. Maybe the following is a part of the trouble. I have been assuming that margin values were "attributes," to be bound in environments, not values to be included in contents. If that is the case, I don't know what you mean by

    The reason for making "relative" a part of margins is to allow an editor such as Star, which does present them to users as relative, to make it explicit that they be kept that way.

I also have some trouble knowing how tabs should REALLY be represented.
There seem to be too many interesting and useful cases. Of course, every editor has already made some decision, but what is Star to make of tabs in an 860 document, and conversely?

Some tabs, it seems clear, should be treated as characters in the text (move to the next tab stop that follows the position of the preceding character) and others should be treated as containing the position (put the next character in column 5). These are really very different concepts, now that I come to think of it. Maybe we have let the use of a single input key for both concepts mislead us. Suppose the standard defined both and gave them different names?

Jim H.

*start* 10664 00024 USt
Date: 17 Feb. 1982 11:52 am EST (Wednesday)
From: Zack.wbst
Subject: Font Standards Discussion
To: PrintStds^.es, ComStandards^.wbst, NetStandards^.wbst, DocStandards^.pa, IPdesign^.pa, TelepressDesign^.es, InterDoc.pa, Reilly.es, Ellis.pa, MMartin.es, Hamerly.wbst, Leibowitz.wbst, Wayman.wbst, Marshall.wbst, Damouth.wbst, Baroody.wbst, Harrington.wbst, (Mike Townsend A2-26)ESMail.es, FontStandards^.wbst
Reply-To: Zack

A new distribution list named FontStandards^.wbst has been created for handling discussion of font name, metric and format standards. If you care to participate, add yourself using Maintain. If you are not interested in font standards, read no further. This is a long message.

The remainder of this message is to solicit comments on some font naming issues relevant to a proposal I am preparing for the Print Standards Subcommittee. The proposal deals with font name and metric attributes, and is an extended version of an earlier proposal which dealt with metrics only. That proposal, dated January 19, 1982, was named Font and Character Metrics and is stored on [ERIE]FontMetricsProposal.press.

A font name is, by definition, a unique identifier for a font, but the kind of information contained in a name depends on its usage.
The name must contain enough information to distinguish a font from all others in the same environment, but does not necessarily carry information describing the font or how to use it. Such information resides in font metrics. Both the name and the metrics are attributes of a font.

There seem to be six distinct (?) places in which font names are important. They are:

1. for type design and classical typography. A name like "Bodoni Book Bold Condensed Outline" uniquely specifies a typeface, and, along with point size, uniquely specifies the font. Note that "Bold" and "Condensed" are simply part of the typeface designation, and convey only qualitative information about weight and setwidth. For example, Bodoni Light may be "heavier" than Bodoni Book Bold.

2. for digital font library maintenance. Information such as rotation, target printer name or resolution, revision level or date, character set and customer name must be provided as well as typeface and size.

3. for document creator use. Typeface, orientation (rotation) and size are needed.

4. for inclusion in an Interpress master. Only the typeface is needed (size and rotation are set by the current transformation). Different typefaces may be required depending on the angular subtense or viewing size (e.g. title vs body vs footnote), however.

5. for use within an Interpress printer, particularly for font substitution. An Interpress master asks for a font by typeface, with size and rotation specified by the current transformation. From this the Interpress interpreter must select an appropriate font. There may be one font available for each point size and rotation, or one font may be scalable to serve for several sizes and rotations. The identification of the proper font must involve accessing the easyTransformation vector of the fontMetrics vector for each candidate font. This is a case in which the user supplies insufficient information to find the right font on the basis of name alone.
A reasonable way to handle this is to build a lookup table into the printing software mapping an Interpress-style font name plus a transformation specification into a specific predetermined internal font name. The internal name need not include anything but a pointer to a file.

6. for access via Clearinghouse. If a printer needs to look to the net for a font it needs but does not have on hand, it must specify typeface, size, orientation and the model printer or at least its resolution, and perhaps character set. A serious problem is that at least the typeface, size and orientation must all be encoded in 40 characters. Clearly typographic names such as in (1) above are not suitable. Moreover, the size specification may have to be a range of sizes. There are several possible solutions, however:

A. The typeface name could be broken up in a predetermined way as in current 5700 and Alto-world software (Family name plus MRR = medium roman regular, BIC = bold italic condensed, etc.). This throws away some flexibility, requiring that Oracle Bold Italic be identical to Oracle Italic Bold, for example. It also makes handling special descriptors like "Extrabold", "Outline" and "Shadow" difficult, and may require abbreviation of long names.

B. The forty characters could be treated as a 320-bit number and a unique number could be assigned to every font as it is created. This may not be as silly as it sounds. Font comparison and substitution would be impractical, but for simple lookup this scheme offers plenty of unique identifiers.

C. The first 35 characters or so could be used to specify the principal (family) name of the typeface, such as Helvetica. The 40 bits of the remaining five characters could be used to specify common secondary words such as Bold, Outline, Demi-bold, Display, etc. These terms would have to be independent (although not all combinations would make sense), so only 40 could be accommodated in this way.
Any others would have to be included in the 35-character principal name. Of course, if 48 common terms were identified, then 6 characters could be reserved for this use. As in (A), the order of terms is lost.

D. Numbers could be assigned to style, weight, slant and setwidth parameters and encoded somehow in the name. This has been proposed to facilitate font substitution. My opinion is that assigning these numbers would be too arbitrary and ambiguous to be useful. Font substitution is discussed in more detail below.

E., F. -- Other alternatives? Votes?

Now, for a standard it would be nice if there were a single uniform name that we could apply to all transactions involving a font, but that seems difficult in view of the differences among 1 (Typography), 2 (Library), 4 (Interpress) and 6 (Clearinghouse). Probably 3 and 4 can be considered equivalent. Likewise for 5 and 6, especially if the printer model name is encoded in a different part of the Clearinghouse name. It is probable that at Xerox we won't need to handle type-1-named fonts electronically (as soon as a font is digitized, it is given a type-2 name). Thus we are left with three distinct types of names: Library (2), Interpress (4) and Clearinghouse (6).

These three types of names differ mainly in the amount of auxiliary information they contain in addition to the Typeface identifier (Library names contain all sorts of things, Clearinghouse names contain size, rotation and printer name, and Interpress names contain nothing extra).

Now -- for a working proposal, suppose we move all this auxiliary name information into the fontMetrics vector and rename it fontAttributes. Specifically, the new entries in fontAttributes would be:

a. Typographical name -- the full name as in (1) above.

b. Orientation(s) (rotation)

c. Viewing size(s) -- equivalent point size at 40 cm viewing distance

d. Revision level, date, history, etc.

e. Customer name (if applicable)

f.
Character set name (if applicable) -- redundant with codesNS (see Metrics proposal)

g. Security key(s) (if applicable)

h. Target printer model or models

i. Resolution in X direction (perhaps in lieu of printer identification)

j. Resolution in Y direction

The information in b, c, i, and j is partly redundant with easyTransformations, and may not need to be listed here separately.

The font name (hereafter referred to as Network name) could then be reduced for Library, Interpress and Clearinghouse use to something like 6 - A, B, C, or D above. Note that the Interpress name would then be only a string instead of a hierarchical name vector as presented in Interpress 82.

Now, FONT SUBSTITUTION. With such information moved out of the fontName and into fontAttributes, a font substitution algorithm must be allowed to access fontAttributes. The important attributes are:

i) Network name, particularly if key words like Bold and Italic are separately coded as in 6 - A or C above

ii) Typographical name

iii) Orientation

iv) Size

v) Character set

vi) DIN style classification -- This is discussed in Font and Character Metrics. My opinion is that it is much more robust than simple classifiers of ornateness or emphasis or serif-ness. More important, it was established by and is accepted by the real typographers and type designers of the world.

A simple substitution algorithm based on these parameters ought to be provided at most printers. The 5700 algorithm based on rotation, size, posture, weight, setwidth and family name may be a reasonable starting point. We need to add a check on whether the desired character(s) are present, a search for similar styles based on the DIN classification and a check for special name parts like Antique and Shadow.

I think we should also allow the Interpress master to supply its own substitution algorithm. This means the master must have access to the list of available fonts, both local and via Clearinghouse.
It would also be nice to allow the master to contain fonts.

Last topic -- FONT SECURITY. There are at least two types of security problems. The first is copying of our fonts. The simplest way I have heard about to prevent this is to include in each font file the network name or address of the printer(s) allowed to use the font and then check this in the print software. Of course this is only a little protection, but it makes the effort required to steal a font significantly greater than it is now.

The second type of font security we should think about is use of particular fonts, such as signatures, by unauthorized users. My opinion is that this is best handled by print software.

This is the end of my comments. What is missing? What won't work? What is OK? How well does the 5700 font substitution algorithm work? Is it feasible to supply fonts from Clearinghouse and/or from Interpress masters? What is the right way to handle size and orientation information, especially in view of the Interpress easyTransformations? How should viewingSize be encoded? Can we really name all character sets? Is codesNS reasonable? Does fontAttributes contain enough information for library control? Do we need to erase some of the fontAttributes in the fonts we send to customers? How about font security?

For those of you who have read this far, don't forget about FontStandards^.wbst.

-- Greg

*start* 00380 00024 USt
Date: 17-Feb-82 15:20:32 PST (Wednesday)
From: Ayers.PA
Subject: On Environments and Records and ...
To: InterDoc
Reply-To: Ayers

After the recent conversation between Mitchell/Ayers/Horning, I feel that we could use a little chalk-talk from Jim Mitchell on how he views environments, attributes and the like.

Anyone agree? Jim, are you available?

Bob

*start* 00475 00024 US
Date: 17 Feb. 1982 4:34 pm PST (Wednesday)
From: Mitchell.PA
Subject: Re: On Environments and Records and ...
In-reply-to: Your message of 17-Feb-82 15:20:32 PST (Wednesday)
To: Ayers
cc: Mitchell, Karlton, Horning

I am available. Some of the current confusion may well be overcome by the palatable semantics stuff I am working on (and which Phil asked about). In any case, if you would like some chalk-talk about this stuff, name a time.

Jim M.

*start* 07660 00024 USt
Date: 15 Feb. 1982 6:04 pm PST (Monday)
From: Mitchell.PA
Subject: First cut at defining some marks and attributes
To: Interdoc.PA

The following is my first try at an answer to that age-old question, "What is a paragraph?" It is only half, not fully baked, but it might be a reasonable place to begin. Your comments and questions are earnestly solicited.

Jim M.

------------------------------------------------------------------------
Conventions used:

material in  brackets are comments or questions.

the term record is used to describe a value which is an environment with Outer=NIL

------------------------------------------------------------------------
Primitive nodes:

A string value is syntactic sugar for a node of the form {CHAR$ c0 c1 ... cN} where the ith character (i in [0..N]) of the original string constant has been mapped into a number, ci.

Examples: is the same as {CHAR$ 97 98 99 100 101 102 103}

------------------------------------------------------------------------
Text nodes:

Mark: TEXT

Contents: the evaluated contents of a TEXT node can only be a sequence of strings of text (or, equivalently, a sequence of CHAR nodes). It cannot contain any other kind of nested subnodes.

Examples: {TEXT$ } {TEXT$