ThesisOutline.Tioga
Last edited by Rick Beach, December 26, 1984 9:56:16 am PST
Setting Tables and
Illustrations with Style
CONTENTS
1 Introduction
¶2 Document Composition
¶3 Graphical Style
¶4 Tabular Composition
¶5 New Framework for Tabular Composition
¶6 Future Directions
¶7 Glossary
¶8 References
¶9 Index
--
1 Introduction to Document Composition
1 INTRODUCTION TO DOCUMENT COMPOSITION
CONTENTS
¶1.1 What this thesis contributes?
¶1.1.1 What is a document?
¶1.1.2 Difficult Problems in Document Production
¶1.1.3 Concept of Document Style
¶1.1.4 Graphical Style and Table Formatting
¶1.2 Roadmap to this thesis
--
¶1.1 What this thesis contributes?
this thesis deals with the problems of complex document formatting
electronic environment for composing, formatting, and producing documents
problems in particular with illustrations and tables
refines the concept of style to manage the complexities of this task
define what I mean by style early
¶1.1.1 What is a document?
a document communicates information; author collects the information and organizes its presentation; various relationships with the reader; documents that inform, that persuade, that argue, that entertain
historically: documents were written works, produced in handwritten form by scribes, nonmovable type?, early printing with movable type; lots of illustration and ornamentation; hand-drawn or engraved images (i.e. illustrations haven't been mechanized)
traditionally in the Graphic Arts: documents are manuscripts that contain all the text, illustrations, reference material; process is designed for a writer to guide the reader in his exploration of knowledge; tables of contents and indices are the major means for reader to explore on his own; electronic tools to produce documents (this is the area this thesis explores)
electronic document: a corpus of knowledge, communication by words and images, technology exists for the reader to guide his own exploration of knowledge; active documents that respond to queries posed by the reader; fulfilling the Engelbart vision of augmented human intellect (this is the area future research will realize) [Engelbart, Nelson, van Dam]
all documents require production and presentation of information; textual, mathematical, tabular, illustrative material; this thesis concentrates on tools that simplify the complexity of the document presentation problem while ensuring a full gamut of choices; assumes an integrated environment
this thesis surveys document composition techniques:
both traditional graphic arts techniques and electronic composition tools
to better understand the traditional process and where difficulties remain
to determine where we are in solving the problems in document composition
¶1.1.2 Difficult Problems Remain in Document Production
some difficult problems remain in document production
from personal experience of the author producing scholarly books and articles
these problems exist despite the presence and use of electronic aids
good solutions exist
for collecting and editing text, even large volumes of text
for ensuring accuracy when incorporating computer programs or numeric data
typographers are unfamiliar with monospaced fonts with special computer symbols
especially the treatment of quote marks: opening and closing quotes may be gratuitously inserted instead of the ASCII quote mark
treatment of white space in a monospaced font may be crucial; wordspaces are typically more narrow than a monospaced letter width
accuracy in small details is crucial to the successful compilation and
for ensuring some editorial consistency through global search and replace techniques
for example reference citations, the UNIX refer package, various macro packages
for handling editorial revisions of drafts or subsequent editions (one book on introductory computing went from WATFIV-S to PASCAL to FORTRAN77 using the same manuscript files)
for handling families of documents with similar design and layout
for typography of text, using typographic fonts available on graphic arts typesetters; we know how to hyphenate if sufficient resources available
for including some forms of math, illustrations and tables (with considerable work and it is not always easy or possible), and other notation systems, for example Benesh [reference to Benesh?]
difficulties remain
mathematical notation is still hard
lack of WYSIWYG, eqn is close but loses the math content because of this, lack of interactive editing tools in many systems forces manuscript author/editor to use an abstract language; always need to accommodate new notations invented by authors; cannot incorporate math uniformly in all places in a document due to lack of uniform document structure (stream oriented processors require us to save math as a stream): tables of math, section headings, table of contents, index;
tables are awkward
each table tends to be designed as a unique formatting problem; document style does not apply to tables as easily as for text style; many special requirements not easily handled; large volume tables are awkward or consume significant resources; simple tables are straightforward: spreadsheets or mathematical tables
illustrations either done by hand or crudely by computer
quality of graphic arts standards rarely met; wide range of illustration requirements: sketches, engineering drawings, design art, shaded or continuous tone images, graphical highlights (lines on text or math)
lack of uniform techniques for integrating these into documents, mainly because of an inadequate document structure (presently most often it is a text stream representation and the representations are not recursive to permit math in illustrations in tables in all possible permutations)
pagination solved for simple cases and generic text documents; difficult to accommodate perturbed solutions for special cases; footnotes still awkward; placement of figures awkward
batch processing and the "pregnant pause"
reproduction quality output is delayed until the right moment when everything is just right
draft cycles reprocess much of what has not changed (thereby wasting resources and introducing delays in the process); this is necessary to get the cross references, index entries, page numbers, etc. for the whole collection of text being processed [Reid, Brader]
drafts are better quality than with traditional production techniques yet yield no confidence that the final output will eventually appear (evidence the need to comfort wary publishers hoping that you will meet their deadline)
reliance on adequate quality output devices (laser printers or typesetters) with sufficient capacity to handle the completed chapter or document in toto; surge of load when the document is about to deliver; lack of parallelism in the process since sequential order of formatting and output
yet we still need cut and paste at the last minute; illustrations we can not process by output device (photographs or reductions); corrections after the main processing cycles are complete [Brader]; how to feed the corrections automatically into any future versions of the document
¶1.1.3 Concept of Document Style
observe where we have been successful with electronic aids for document production
model - streams of ASCII characters and editors, e.g. runoff et al.
control - separating form from content
integration - tools using the same model
the concept of style controls and specifies a mechanism to address the complexity of document production
styles deal with the issues of appearance, aesthetics, and understandability in documents
"The processes of book design may be classified as editorial planning (in which the text may be locally re-arranged if necessary in preparation for composition, for the benefit of author and reader), visual planning (which determines the appearance of the printed image), and technical planning (which is concerned with the structure of the book and the methods of its manufacture). The editorial and visual aspects of the design derive most of their effectiveness from technical planning. Success in one process or in one aspect alone is never enough; failure at one is more than enough." [Hugh Williamson, Methods of Book Design, p 353]
these are the issues of the efficient communication of information
"To lay down rules of style would be easy enough — we need only consider how things were done yesterday, or how they are done today, or how we prefer to do them ourselves, and to elevate these practices or preferences to the status of dogma" [Hugh Williamson, Methods of Book Design, p 2]
we are interested in furthering the efficiency of that communication:
"The practice of typography, if it be followed faithfully, is hard work — full of detail, full of petty restrictions, full of drudgery, and not greatly rewarded as men now count rewards. There are times when we need to bring to it all the history and art and feeling that we have, to make it bearable. But in the light of history, and of art, and of knowledge, and of man's achievement, it is as interesting a work as exists — a broad and humanizing employment which can indeed be followed merely as a trade, but which if perfected into an art, or even broadened into a profession, will perpetually open new horizons to our eyes and new opportunities to our hands." [Daniel Berkeley Updike, Printing Types, 1922, quoted by Hugh Williamson, Methods of Book Design, p 4]
the concept of style will be central in this thesis to extending the control on the appearance of illustrations and tables
¶1.1.4 Graphical Style and Table Formatting
observe that the table formatting problem extends down to math and up to page layout
eqn uses boxes model
provides a concentrated focus of the issues
two-dimensional layout
alignment
many degrees of freedom for typographic parameters
the thesis presents an experiment in providing style for illustrations and the results
presents an experiment with table layout using an underlying grid system and a constraint satisfaction mechanism
these grid systems are the same as those used by graphic designers to lay out book designs
grids provide a discipline for creative expression; sort of like bringing "order out of chaos"
outlines further research in interactive document formatting
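A minimal sketch of the grid-plus-constraints idea noted above, not the thesis prototype. It assumes a table whose cells each occupy a run of grid columns and need a minimum width. The names `Cell` and `solve_grid_lines` are hypothetical, and only horizontal placement is handled. Each cell contributes one constraint between two vertical grid lines, and a single left-to-right pass finds the tightest grid that satisfies them all.

```python
from typing import List, Tuple

# A cell spans grid columns first..last (inclusive) and needs `width` units.
# This representation is an assumption made for illustration.
Cell = Tuple[int, int, float]


def solve_grid_lines(n_cols: int, cells: List[Cell]) -> List[float]:
    """Return x positions for the n_cols + 1 vertical grid lines.

    Each cell imposes x[last + 1] - x[first] >= width. Every constraint
    runs from a lower-numbered line to a higher-numbered one, so visiting
    the lines left to right yields the smallest satisfying positions.
    """
    x = [0.0] * (n_cols + 1)
    for line in range(1, n_cols + 1):
        # A grid line never sits to the left of its predecessor.
        x[line] = x[line - 1]
        for first, last, width in cells:
            if last + 1 == line:
                x[line] = max(x[line], x[first] + width)
    return x


# Three columns; the third cell is a heading spanning columns 0 and 1.
cells = [(0, 0, 30), (1, 1, 20), (0, 1, 70), (2, 2, 10)]
print(solve_grid_lines(3, cells))  # [0.0, 30.0, 70.0, 80.0]
```

The spanning heading pushes the second grid line out to 70 even though the two cells beneath it need only 50 between them. A general constraint solver would also handle alignment and equal-width requirements, which this pass does not attempt.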
--
¶1.2 Roadmap to this thesis
The plan
review the state of computer composition systems
outline some remaining difficulties of mathematics, tables, and illustrations
attack the problem of style in illustrations
observe that layout remains in both illustrations and tables
attack the table formatting problem
suggest some future directions for this research
Chapter ¶2 Document Composition
present a survey of traditional graphic arts production process
how books get published
roles involved in producing a book
who has done or built typesetting systems before
early typesetting systems
document compilers
integrated document systems
document models that reflect these purposes
document compilers accept streams of codes and manuscript text
integrated document systems require structured documents
object-oriented programming styles and structures that apply to document models
we are left with questions:
how to control the complexity of document production?
how to tackle the difficult problem of laying out 2-dimensional information
style mechanisms are present in the systems described so far
they abstract the specification of typographic detail
how far can we push this idea of style?
Chapter ¶3 Graphical Style
extend style mechanism to illustrations
same separation of form vs content
present the material published in "Graphical Style ...", Computer Graphics, v17, n3, 1983
however, it's not sufficient
we need constraints to make the system useful
observe that constraints deal with the layout problem
where is there a concentrated layout problem we could tackle?
Chapter ¶4 Tabular Composition
this is a hard problem
show early solutions and their limitations
no style, few typographic features
identify and survey lots of typographic features required for tables
perhaps prioritize these features
examine current solutions in tbl, TEX, Star, and some new integrated document systems
Chapter ¶5 New Framework for Tabular Composition
introduce the notions of grid design systems
graphic designers have used them for years
typesetting systems use the dimensions they provide
tbl topology based on such an implicit grid
introduce the use of a constraint solver for tables
demonstrate the prototype for laying out complex tables
is this enough: do I need to also solve the table breaking problem? Kelly's comment: "no, but mention it" and "how hard is it? complexity results?"
Chapter ¶6 Future Directions
recall the observation that tables are page layout problems in miniature
show how the page layout problems are attacked by these techniques
make the observation that math notation is a table in miniature
show how math notation would benefit from constraint satisfaction applied to the layout
propose just such a math typography project for Tioga; explain how this would incorporate a symbolic algebra system such as VAXIMA or MAPLE
Chapter ¶7 Glossary
provide this glossary to explain the terms used by typographic specialists when producing documents
assumes a computer scientist background, so it won't include CS terms
--
¶2 Document Composition
2 DOCUMENT COMPOSITION
CONTENTS
¶2.1 Survey of Document Production Techniques
¶2.1.1 How do books get produced?
¶2.1.2 Traditional graphic arts techniques
¶2.1.3 Roles involved in producing a book
¶2.2 Concept of Style
¶2.3 Early typesetting systems that used computers
¶2.4 Document compilers
¶2.4.1 troff
¶2.4.2 Scribe
¶2.4.3 TEX
¶2.5 Integrated composition systems
Etude (became Interleaf)
Janus (became IBM product)
Star
Tioga
WYSIWYG or WYSIAlmostWYG or WYSIAllThatYG (anything else is too hard)
How to control all this complexity?
¶2.6 Document models and views of documents (mumble ...)
¶2.7 How to control complexity?
--
¶2.1 Survey of Document Production Techniques
this section will survey how books get produced
describe briefly the traditional graphic arts techniques
outline the roles involved in producing a scholarly book
write this section for a computer scientist, explain why it's appropriate to pause and reflect on traditional roles
¶2.1.1 How do books get produced?
an interesting project is documented in One Book/Five Ways
five university presses compared publishing procedures
University of Chicago Press, MIT Press, University of North Carolina, University of Texas Press, and University of Toronto Press
Chicago Manual of Style
MIT editorial style guidelines
editing style
copy editing
design dummies
page samples
reference One Book/Five Ways, and Scholarly Publishing
parallelism in the process
manuscript from the author gets split (copied): production editor, copy editor, designer (for layout and illustrations)
What are the difficult parts?
e.g. tables, math, illustrations, administration, back matter (index, reference appendix), front matter (tables of contents, title pages)
"Mathematics is known in the trade as difficult, or penalty, copy because it is slower, more difficult and more expensive to set in type than any other kind of copy normally occurring in books and journals." [Manual of Style, Chicago, 1969, p 295] [something similar appears in the EQN paper]
"A good composing room can translate almost any tabular copy in a reasonably clear and presentable example of tabular composition" [Williamson, Methods of Book Design, p 160]
How do these parts get produced easily?
skill of the practitioner
procedural steps among specialists
How do designers specify and control this process?
design guidelines communicate between designer and compositor
rules for authors
reference various style books: Manual of Style, AMS guide to authors, APA, McGraw-Hill, Handbook to Scholars, Words into Type
Include at least one horror story from the GPO as an example of how terrible it can be . . .
--
¶2.1.2 Traditional graphic arts techniques
craft skills in the traditional graphic arts; learned by apprenticeship
worked with metal, subtle lines, small sizes,
skills of the copy editor who marked up the manuscript:
identify consistent parts of the manuscript;
enforced "house rules" for writing;
established the "logical" to "physical" mapping for the manuscript
skills of the compositor who set the type:
mechanics of hot metal;
the many adjustments possible with furniture, spacebands, pi sorts;
who deals with mathematics, table setting, accented letters, find more neat things the compositor used;
the goal of aesthetics and the absence of economics
take some excerpts from Phillips tome
find some "History of Type" references
find "Goodbye Gutenberg" reference, perhaps AVFilm at watdcs will know
--
¶2.1.3 Roles involved in producing a book
Anthropomorphism benefits an understanding of the process
reference Dyment paper, Booth & Gentleman paper on anthropomorphic design, paint paper on design of interactive paint system
defining roles provides understanding of the problem domain
design of computer roles has to accomplish at least those functions, although an electronic or automated environment may provide "insurmountable opportunities"
should tools be described here or somewhere else?
would a picture make this any clearer? or more confusing?
author of the manuscript
creates original manuscript and artwork sketches
cycles manuscript to typist for drafts
submits manuscript to acquisitions editor or journal editor for publication decision
deals with production editor for publication process
makes use of writers workbench [reference?] for editorial changes, structured editor for moving text around
may provide electronic copy of the document to publisher
typist
prepares draft manuscript for author using a typewriter or word processor
acquisition editor or journal editor
acquires new manuscripts from authors, solicits opinions of reviewers to determine if the manuscript should be published
reviewer or referee
provides opinions to the editor; market review; comprehensive review
reviews content, not usually format or style
reviewer for book publisher; referee for journal article
may use typeset quality (but ephemeral) to capture the complex notation
may use electronic means to transport report
production editor
supervises the production process that turns a manuscript into a published form
deals with author about the manuscript; deals with variety of suppliers
may use database to track parts of the process (process control)
graphic designer
1) provides design scheme for the entire book after manuscript is available
2) provides custom designs for difficult situations not covered in the overall book design, such as specific tables, illustrations, or difficult portions of the manuscript
copy editor
1) ensures manuscript meets publisher's "house style" for grammar, spelling, citations, references, illustrations, tables, headings, lists of items, etc. ad nauseam
2) marks up manuscript into logical parts for coder and keyboarding
can make good use of global edits to enforce styles
e.g. enforcing elided digits (900-1, 1066-68, 912-13, etc)
lexicons for foreign language words
searching for cross references; split views; indexing tools
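A sketch of how a global edit could enforce the elided-digit convention mentioned above. The rule is inferred only from the three examples given (900-1, 1066-68, 912-13): drop the leading digits the two numbers share, keep at least two trailing digits, then drop a leading zero. That rule is an assumption; a real house style would state its own, and the function name `elide_range` is hypothetical.

```python
def elide_range(first: int, last: int) -> str:
    """Format an inclusive number range with elided digits (assumed rule)."""
    a, b = str(first), str(last)
    # Ranges that change length or run backwards are left unabbreviated.
    if len(a) != len(b) or last <= first:
        return f"{a}-{b}"
    # Count the leading digits the two numbers have in common.
    shared = 0
    while shared < len(a) and a[shared] == b[shared]:
        shared += 1
    # Keep the changed digits, but never fewer than two where available.
    keep = min(max(len(b) - shared, 2), len(b))
    tail = b[-keep:]
    # A leading zero adds nothing: prefer "900-1" to "900-01".
    tail = tail.lstrip("0") or "0"
    return f"{a}-{tail}"


for pair in [(900, 901), (1066, 1068), (912, 913), (498, 532)]:
    print(elide_range(*pair))
# 900-1
# 1066-68
# 912-13
# 498-532
```

A copy editor's tool would apply such a function to every range matched by a search pattern and flag any result that differs from the manuscript.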
indexer
prepares index entries, sorts them, and prepares index manuscript
frequently not the author, although author may revise index terms
works from proofs of the composed pages to determine the page numbers for each index entry
late in the stage traditionally
electronically can capture location of the references
index tool complexities [reference Winograd and Paxton's TUGboat article]
illustrator, draftsman, graphic artist
creates original artwork from the author's ideas or sketches
draftsman may trace sketches to produce final artwork
graphic artist may do more fine art
illustrator may do more technical art
electronic tools for business graphics; requirements to accommodate creativity; inclusion of engineering drawings annotated
graphic designer may do this job, or establish guidelines on size, reduction factors, typography, shading textures, etc.
keyboarder, coder, inputter
transcribes the marked up original manuscript to create a machine-readable manuscript that can be automatically composed or typeset
typesetter / compositor
takes the machine-readable version of the manuscript and produces type
paste-up artist
takes the type, mechanical art, photographs, etc., cuts out the parts that are to be placed on each page, and pastes them on page forms
graphic designer may do some or all of this job, especially if the manuscript requires frequent design decisions
stripper
produces film negatives from the page forms
strips in negatives of photographs and line art if special photographic processing was necessary for them
strips in pages into imposition format for printing plates, such as printing two-up, four-up, or eight-up
produces device dependent pages in printable format
similar goal of the device independent imaging models Interscript [reference] and PostScript [Seybold report? Alice has ordered copy]
printer
makes plates from the negatives and runs the printing press
several passes from separate plates if color
color separations from outside source working from slide transparency
binder
takes printed pages, folds them if larger than a single page, collates the folds into sequence, and binds them into a cover designed by graphic designer or jacket designer
--
¶2.2 Concept of Style
style: a dictionary definition
style is a way to communicate between the stages in the above processes
what a designer communicates to the illustrator [Scientific American does it iteratively and subjectively]; or designer to a compositor; or a copy editor to a keyboarder
what is traditional style?
style manuals
publisher's "house style"
what makes a book look like something from a particular publisher?
university press style manuals [Oxford, Chicago, Toronto]
style is a way of doing things
a system
procedural guidelines
disciplined usage
sources of style
graphic designer
provides the logical structure specified in terms of physical composition parameters
publisher
guidelines to authors
sample pages from compositor
compositor
composition craftsmen determine final parameters according to iterative learning process (sic!) with publisher critiquing the results
editor
style applies to grammar, numbering, cross references, illustration content
Difference between preparer/author and designer
prepare using style system or guidelines
design provides disciplined guidelines, an identity scheme, consistency
with sufficient flexibility to handle the range of design problems in a document: illustrations, textual things, etc.
style affects both substance and appearance
separation [Scribe] of typographic design is only slice of style
editing grammar usage is a style issue, Canadian vs American spelling is a style issue; both affect substance
can I use the lexical, syntactic and semantic notions to describe the various style attributes? Kelly's comment: when this is properly done, the substance is not affected, i.e. it remains the same, but in a different form.
arrangement of authors names in a bibliography: is this an appearance issue only?
perhaps I need to be much more precise in thinking about what style is and how to describe it...
Media specific styles
contrary to a simple notion of "view", or device-independence [Shaw]
device and media affect readability, which should imply a different style [this relates to Cargill's notion of views]
Style rules on tables
many parameters supplied by compositor
table entries often broken by compositor
design guidelines on visual highlights
e.g. rules or extra vertical space every 5 rows, separating digits in math tables
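The two visual highlights in the example above are mechanical enough to state as code. This is a sketch only: rows are plain strings, the "finder" is a blank line standing in for a rule or extra vertical space, and the names `add_finders` and `group_digits` are hypothetical.

```python
from typing import List


def add_finders(rows: List[str], every: int = 5, finder: str = "") -> List[str]:
    """Insert a finder (here a blank line) after every `every` body rows."""
    out: List[str] = []
    for i, row in enumerate(rows):
        if i > 0 and i % every == 0:
            out.append(finder)
        out.append(row)
    return out


def group_digits(digits: str, size: int = 5, sep: str = " ") -> str:
    """Separate a long run of digits into groups, as in mathematical tables."""
    return sep.join(digits[i:i + size] for i in range(0, len(digits), size))


rows = [f"row {n}" for n in range(1, 8)]
print(add_finders(rows))
# ['row 1', 'row 2', 'row 3', 'row 4', 'row 5', '', 'row 6', 'row 7']
print(group_digits("3141592653589"))  # 31415 92653 589
```

Treating `every`, `finder`, `size`, and `sep` as parameters is the point of the sketch: they are the kind of values a table style could supply instead of the compositor.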
Provide some glossary terms. Refer to glossary appended to thesis.
Example of lack of style: TOG reviewer complaint about "Artificial Intelligence" citation two places in text and in caption both appearing with different typography — so much for consistent style!
--
¶2.3 Early typesetting systems that used computers
Graphic arts heritage
typographic coding represented the manual actions of a typographer
TTS coding implied a stream of codes and text intermingled [Phillips, Berg, Seybold]
Runoff macro processor
reference review given in [Brader], An Incremental Formatter
another review in [Furuta, Scofield and Shaw] Computing Surveys article
coded manuscript as a stream of characters parsed by a macro processor
macro library applied when called out
conditional execution and state enquiry provides lots of flexibility [Brader]
WYSIWYG
concrete presentation of appearance [Engelbart]
structured documents, objects of several types including graphics
need to see the style and structure explicitly
Note that tables appeared very early [Barnett, NBS]
computers were generating numeric data presented in tabular form
paper tape and mag tape provided computer to typesetter connection
Pagination
macros, compilers like Page-1 [Pierson]
my pagination with TTS coded macro processors, like Proff [Beach]
my pagination with Typeset macros, using a more programming language model [Beach]
early pagination with troff ?
problems with exceptions; Penta's try tables; Seybold's photocomposition tome contains references
--
¶2.4 Document compilers
Compilation implies massive connotations
monolithic documents, document hierarchies, compilers
high level to machine language, debugging, optimization implies approximation
experts in document compilation
refer to years of experience in traditional
Electronic cut and paste still necessary
timeliness of last minute revisions
corrections to the algorithmic results (widows, hyphenations, rivers, things too costly to program)
propagating changes to whole document, e.g. change in chapter title implies contents and running heads
remembering changes if made at different stages (remembering octal patches)
Problem not solved; going to bite off part of it again
why is the problem not easy?
solutions do not accommodate all parts of a document such as illustrations and tables [reference Scribe thesis]
lack of integration means that different tools or languages are needed to code different things
Comment on Style, Math, Illustrations, Tables, Page Layout
coded manuscript implies document compilers
separation of function: design vs content editing
leverage from multiple uses of contents
reference Bell Labs early experience with troff: 3 forms of document as internal memo, tech report, and journal article. Kelly's comment: this is a notion of style, of course!
JACM copyeditor felt threatened when Bell Labs was trying to prepare camera-ready pages. First samples came back all covered with red marks. The copyeditor retired shortly thereafter and the situation improved. Mike Lesk built several macro packages for JACM, CACM, and conference paper formats. Papers were often published three times as an internal memorandum, a technical report and then in a journal. Noted the project for American Physical Review Letters. [Steve Johnson, Bell Labs]
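The leverage from reusing one manuscript in several forms can be sketched as a lookup from logical role to formatting rule. This is not how troff macro packages are implemented; it only illustrates the separation of design from content. The style tables `MEMO` and `JOURNAL`, the roles, and `format_document` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Content is tagged with logical roles only; no typographic detail appears here.
Manuscript = List[Tuple[str, str]]
# A style maps each role to a formatting rule, much as a macro package would.
Style = Dict[str, Callable[[str], str]]

MEMO: Style = {
    "title": lambda text: "SUBJECT: " + text.upper(),
    "para": lambda text: text,
}
JOURNAL: Style = {
    "title": lambda text: text.center(40).rstrip(),
    "para": lambda text: "    " + text,
}


def format_document(doc: Manuscript, style: Style) -> List[str]:
    """Format the same manuscript under whichever style is supplied."""
    return [style[role](text) for role, text in doc]


doc: Manuscript = [("title", "Tables"), ("para", "Body text.")]
print(format_document(doc, MEMO))     # ['SUBJECT: TABLES', 'Body text.']
print(format_document(doc, JOURNAL))  # centred title, indented paragraph
```

Swapping the style table changes every title and paragraph at once and leaves the manuscript untouched, which is the effect of substituting one macro package for another.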
--
¶2.4.1 troff
Document formatting machine language
Style mechanism
Macro packages provide style
slide different macro definitions to create different effects
parameterize the macros
Document tools implemented as preprocessors:
tbl, eqn, pic, ideal, refer
Filter/pipe model, sequential only, one pass only (?)
Creating your own toolets
Layout mechanism
diversions implied multiple streams
recombine diversions as big boxes
pipe model limitations
Implementation limitations influence view of document formatting
collision of number register names, space limitations
author built system to alleviate many of the shortcomings of troff
TYPE, macros, math, tables, pagination
list of users: publishers (PH, Reston, UT Press, UW Press, UW CS Dept, SIAM, SIGGRAPH, Honeywell, UW Solid Mechanics Division)
macro programming language
register names
data structures grow dynamically
¶2.4.2 Scribe
form vs content made explicit, and rather difficult to override separation
document compilation made explicit
global solution to document composition, perhaps involving iterative solutions and symbol table
lack of preprocessors
check with Brian about tbl clone
¶2.4.3 TEX
Knuth made document formatting legitimate computer science topic
global solution in one pass
can do everything although language is complex
penalties are indirect specification of desired results
Check out LaTeX and other preprocessors (macro packages)
--
¶2.5 Integrated composition systems
Etude (became Interleaf)
[MIT Reports, Shaw, Brader]
Janus (became IBM product)
now IBM product
[IBM Systems Journal article]
Star
office documents major focus [Xerox reports on Star, Seybold Report]
integrated several classes of objects
property sheets vs style sheets (attribute specifications, lack of indirection or naming, lack of scoping)
interactive user interface
Tioga
interactive, structured documents, WYSIWYG for display or printer
limited typesetting services (no footnotes, no floating figures)
extensible: user interface, client objects (artwork, photographs, tables)
WYSIWYG or WYSIAlmostWYG or WYSIAllThatYG (anything else is too hard)
Kelly's comment: how do I change style here?
How to control all this complexity?
Style specifications
abstract the attributes and parameterize the algorithms
supply extensible specifications
future rule specifications could provide algorithms
--
¶2.6 Document models and views of documents (mumble ...)
flat vs structured
similar to batch versus integrated dichotomy
tree versus DAG
Engelbart NLS, Nelson Xanadu, van Dam Hypertext
distributed documents
data files
column order, sort row values to provide organization
my query formatter idea
rewriting rules for query matches to provide structure and formatting information
index generator
extract index entries and positions during formatting pass
Scribe symbol table approach
troff index file approach
my index tool for Tioga approach
node properties for index entry, additional properties for location and formatted location
operations on structured documents
replicate columns for finding things/viewing purposes [Phillips, Tabular Composition]
tick mark problem, adding finders (rules or space) every so many rows or entries [Malcolm, tick mark problem]
Cargill's notions of views
multiple views of information stored in one structure [Cargill]
providing redundant information (lister) for finding things (headers, contents, cross reference)
program visualization
Baecker & Marcus typesetting of C programs
other pretty printing examples
--
¶2.7 How to control complexity?
document production
laying out 2-dimensional information
how far can we push style mechanism?
to illustrations?
--
¶3 Graphical Style
3 GRAPHICAL STYLE
CONTENTS
3.1 Problems
3.2 Examples
3.3 TiogaArtwork
3.4 Results
3.5 Questions
--
[Make sure that I mention that this is mainly a previously published paper co-authored with Maureen Stone. Perhaps need to describe the situation that prompted the research in an introduction.]
3.1 Problems
revisit TOG reviews
changes in media: display — print — project
consistency among a set of figures for varying sources
"Every item in the book gains in appeal to the reader's eye from its relationship with all the other items. Something of a family resemblance, an appearance of being a set of pictures rather than a collection from disparate sets, may confer this advantage on the illustrations of any edition." [Hugh Williamson, Methods of Book Design, p 256]
"When the author's contract stipulates that he is to supply illustration copy, he may choose to draw it himself or get it drawn by somebody else whose main qualification for the task is that he will make no charge for it, or next to none. The resulting material may be clear enough to explain its meaning but incapable of adequate reproduction or too irregular in drawing to appear in a well-produced book." [Hugh Williamson, Methods of Book Design, p 258]
extended life of pictures
editing operations on illustrations similar to formatted documents
push on the style to control illustrations
Revisit section 1 discussion of illustrations, house style
Separation of form & content for illustrations
extend the document model to accommodate illustrations
--
3.2 Examples
PICTURE [Beatty, et al.]
PIC & IDEAL troff preprocessors
DRAW
no style
Griffin
explicit style
no indirection
STAR
idiomatic graphics, no extensions by graphic designer
ISSCO/TELAGRAF
business graphics package
subroutine package and design language
where is style in all this?
JUNO/GOB
constrained illustrators
interesting for the table formatting part of the thesis later on
no style in JUNO/GOB
--
3.3 TiogaArtwork
extend Tioga document model to illustrations
styles come along with that document model, extend style attributes to display primitives [GKS workstation attributes]
prototype text interaction only
3.4 Results
styles okay if tools for managing them were better {property sheets, style tool}
property sheet interface needed
layout/constraints necessary to permit changing geometry due to style changes or aspect ratio
detail suppression when scale changes or aspect ratio
3.5 Questions
How do you achieve consistency?
What tools make a graphic artist more effective?
Note: tools change the way a job is done
"if the only tool you have is a hammer, the whole world tends to look like a nail" [Abraham Maslow]
Media requirements:
color, line weights, endings, intersections, shadows
. . . all the small details that become style parameters
--
¶4 Tabular Composition
4 TABULAR COMPOSITION
CONTENTS
4.1 What is a table?
4.2 Early table formatting systems
4.3 Why are tables hard?
4.4 Previous approaches
¶4.4.1 The Typewriter Tab Stop Model
¶4.4.2 tbl
¶4.4.3 TEX
¶4.4.4 TABLE
4.5 Samples of tables formatted
--
4.1 What is a table?
table is an arrangement of information
why are we interested?
move from illustrations and style issues into layout issues
concentrated layout problem is in formatting tables
table is a two-dimensional arrangement of rectangular pieces of information
math is 2-dimensional layout with alignment similar to tables
page layout is also 2-dimensional, like large scale table
the table is a form of communication
large amount of information in a small space
use some graphic arrangement, rules and leaders decorations to control the reading of tables
"Tables offer authors and editors a useful means of presenting large amounts of detailed information in small space. A simple table . . . can often give information that would require several paragraphs to present textually and can do so with greater clarity. Tabular presentation is not simply the best but usually the only way that large quantities of individual, similar facts can be arranged." [A Manual of Style, Chicago, 1969, p 273]
"ta-ble n. 13. An orderly written, typed, or printed display of data, especially a rectangular array exhibiting one or more characteristics of designated entities or categories. 14. An abbreviated list, as of contents; a synopsis." [Am. Her. Dictionary]
table typography is a hard problem to get right
there are lots of typographic factors
"The principles of table making involve matters of taste, convention, typography, aesthetics, and honesty, in addition to the principles of quantification." [Tabular Presentation, p 497]
"Tabular setting has proved both the easiest and the most difficult form of composition to bring under computer control. Because tabular setting is mainly for numeric data, it might seem strange that there should be any difficulty in providing computer-generated drive-tapes for photoset tables." [Arthur H. Phillips, Handbook of Computer-Aided Composition, p 189]
more than just computed data
tables appear in technical publications
"While many tables of physical and scientific data are being compiled by computer, there is still a requirement to include these data in technical publications because they are considered of interest to the reader who may not have access to the generating algorithms even if he is a computer user. The publication of such data in printed form may also be considered necessary to establish the status of the author! It would appear that the need for tabular composition in general bookwork will continue for some time." [Arthur Phillips, Tabular Composition, Seybold Report, August 1979, v8, n23, pg. 23-15]
various sources and purposes of table information
Financial spreadsheets from calculator program.
Financial reports.
Program generates voluminous data.
Extracts from a database.
author composes a simple table.
Unfortunately, almost anything can be a table!
--
4.2 Early table formatting systems
early typesetting systems did tables
computer generated the data
transcription problem with keyboarding errors
compatible interfaces existed
led to computers setting type
mentioned briefly in ¶2, Corliss and Bozman, NBS 1962
typeset tables of computer generated data
computed on an IBM 709 and output on mag tape to a Linofilm typesetter
Barnett, 1965 had an early table formatting program, TABPRINT
IBM 709 and Photon 560 driven by paper tape
input fixed format data, columns of data reordered from data input; fixed specifications for each column; decimal alignment by splitting into two columns; advice from personnel experienced in typographic design; mixed keyboard typography with tabular typography as expedient; lack of folding since lack of page composition; blank lines every 5 or 10 lines; symbol manipulation for grouping digits; wish for automatic means
simple table formatting problems led to simple solutions
all the typographic parameters were defined by the program
or simply applied routinely from fixed definitions
"The significance of this early work in tabular composition is that all the typographic parameters were defined by program." [Arthur H. Phillips, Handbook of Computer-Aided Composition, p 195]
simple table formatting problems were the first of two categories of problems:
"But there are really two very different categories of tabular composition: One comprises a book of similar tables in which the values shown can be calculated by program algorithms from the minimum of data input, and the other consists of the tables appearing in technical texts. In the first case the style is similar for many consecutive pages, but in the second case each table, and there are sometimes several tables on the same page, has different column widths, different numbers of columns, and also ranges the entries differently, both vertically and horizontally; in addition, each table may have different complex box headings." [Arthur H. Phillips, Handbook of Computer-Aided Composition, p 189]
want to provide tools for the second category of tables
interactive tools
WYSIWYG
complete typographic control
so what's the problem?
4.3 Why are tables hard?
tables are complicated
"These complications will tend to keep interactive terminals employed for page make-up and with soft-copy proofs on page view terminals." [Arthur Phillips, Tabular Composition, Seybold Report, August 1979, v8, n23, pg. 23-11]
variety makes them hard since there is a lack of economics, mentioned earlier
two-dimensional makes them hard
dealing with areas, graphical relationships
typography sophisticated
lots of info in a small space
lots of customized ways of doing things
awkward because they are big: wide or long
¶4.3.1 Tables are Two-Dimensional
two-dimensions: rows and columns
familiar arrangement for finding information through row and column intersection
layout has to accommodate both in the table design
table width is accumulated column widths
content can flow from one box to another!
especially free form grid designs
structure to a conventional table [GPO reference]
box heads, stub head, column head, spanned headings
rules: head rule, foot rule, boxhead cutoff rule
lines or rows
--------------------
Grid 7 Rows 6 Columns ByRowThenColumn
Box (0,0) (3,1) FlushTop FlushLeft 3 bp 3 bp 6 bp 6 bp
Stub Head
Figure 4-1. THE STRUCTURE OF A TABLE can be
--------------------
table is an arrangement of arbitrary information
text certainly
numbers certainly
also illustrations, pictures, math equations, what else?
¶4.3.2 Typographic Treatment
fine resolution placement
normal books/text uses a relatively long measure
tables use relatively short measure
small spaces, smaller units between them
spreadsheets or any fixed pitch device like a typewriter has it easy since grid units fixed
typesetters have units 1/10th of a point (=1/720 inch)
"Tabular material is always difficult to typeset—much more so than to compose on the typewriter. This is true even though figures have a "monospaced" value. Letters do not, and therefore it is more difficult to align material or even to determine what will fit in a given space . . . The monospaced typewriter—where you can actually visualize what you are setting—is certainly the simplest way for the novice to proceed. And it will not be an easy task for the typesetter to imitate what the typist has done." [Seybold, Fundamentals of Photocomposition, p 14]
alignment choices
horizontal alignment of column information
flush left/right, center
align on character (for example, decimal point, multiplication sign)
implied decimal points
aligning terms in an equation; multiple alignments on operators and =
eqn markup, lineup, and matrix tools are primitive in this regard
vertical alignment
flush top/bottom, center optically
baseline alignment
top/bottom, center, center on top/bottom for even baselines (avoiding scabbards half-line spaces [GPO, pg 189])
alignment within set of entries in row/column
spanned headings across several row/columns
equal space for row/columns independent of content
balancing folded columns to use up whitespace
styles apply to table structures
table environment attributes, like Scribe
different point size, often smaller to attract less attention and to set more information!
row or column distinguished, bold or italic typeface
single table entry distinguished, size, font, bold or italic for emphasis
whitespace treatment
bearoff distances between table entry and rules
gutter distances
intrusions of glosses or footnote reference marks
spreading whitespace when spanned columns
rules and decorations
rules along row/column boundaries
"Ruled tables, for example, are usual in the publications of this press, in part because Monotype composition has always been readily available. For a publisher who is restricted to Linotype, open tables or tables with horizontal rules alone may be the only practical way tabular matter can be arranged." [Manual of Style, Chicago, 1969, p 273] [it would be interesting to compare the 13th edition on this suggestion!]
rule typographic variations
thicknesses: hairline rule, point rules, 2-pt, 3-pt
rules of different weights; medium rule between table heading and table, and below table; fine rule between column headings and columns [Williamson, Methods of Book Design, p 159]
patterns: double rules, thick-thin rules
braces: horizontal and vertical within gutter/bearoff distances
ornaments at corners, rounded corners
background tints, like CACM, example in Figure 4-1 for shading areas
leaders
dots or dashes that lead your eyes
congruence to align repeat patterns
leaders cross column boundaries
footnotes
within table entry
collected at bottom of table
numbering issue for footnotes
treatment with respect to footnotes in the running text of the mss
readability issues
separating every so many lines with space or rules or background tints
grouping digits in long numeric entries
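The character alignment and digit grouping noted above can be sketched for the easy fixed-pitch case. This is a minimal illustration, assuming every character has the same width (the typewriter simplification) and unsigned numeric text; the function names are hypothetical and a real typesetter would work in fractional point units rather than character counts.

```python
def align_on_char(entries, char="."):
    """Pad entries so the alignment character falls in the same column.

    An entry without the character is treated as having an implied
    alignment point at its right end (an implied decimal point).
    """
    split = []
    for text in entries:
        head, sep, tail = text.partition(char)
        split.append((head, sep + tail))
    left = max(len(head) for head, _ in split)
    right = max(len(tail) for _, tail in split)
    return [head.rjust(left) + tail.ljust(right) for head, tail in split]


def group_digits(number_text, group=3, separator=" "):
    """Insert a separator every `group` digits of the integer part.

    Assumes unsigned decimal text such as "1234567.25".
    """
    head, sep, tail = number_text.partition(".")
    chunks = []
    while len(head) > group:
        chunks.insert(0, head[-group:])
        head = head[:-group]
    chunks.insert(0, head)
    return separator.join(chunks) + sep + tail


if __name__ == "__main__":
    for line in align_on_char(["0", ".625", "1023.5"]):
        print("|" + line + "|")
    print(group_digits("1234567.25"))
```

With proportional type the same idea applies, but the padding becomes measured whitespace on each side of the alignment character instead of counted spaces.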
¶4.3.3 Tables are Awkward
large tables
fitting tables to pages
Special treatments to make a table fit.
Rotate headings to typeset vertically to reduce column width. Possibly set text vertically with characters horizontally (vertical stack arrangement).
table orientation normally upright
column headings that are much wider than rest of column turned sideways with descenders to the right [Williamson, Methods of Book Design, p 159]
Reduce size of type within table.
tables in smaller type, although majority in text size and minority in smaller size preferred [Williamson, Methods of Book Design, p 159]
Compress text horizontally to make characters narrower (reduce the set size)
Reduce whitespace bearoff to make table fit.
transpose columns and rows to make table fit
transpose table if table consists of more columns than rows [Williamson, Methods of Book Design, p 159]
Phillips also fantasizes about computer programs to do this
mention concerns of statistical data [Zeisel, Say It With Figures, p 41]
Tables may be larger than a single page:
Wide tables may be printed broadside, rotated 90 degrees (either way), so the long table dimension is along the long paper dimension.
tables wider than page turned sideways so rows read to the right, column headings that are much wider than rest of column turned sideways with descenders to the LEFT otherwise they would appear UPSIDE DOWN! [Williamson, Methods of Book Design, p 159]
recto pages are preferred since a turned book will present the recto page closer to the reader. [Williamson, Methods of Book Design, p 271]
Tables may be laid out as facing pages in a two-page spread.
tables spread across facing pages (opening) need some form of linking; table too big printed as a folded plate! [Williamson, Methods of Book Design, p 159]
Tables may be continued on several subsequent pages.
"It would be asking rather a lot of a page make-up program to insert carried forward and brought-forward totals automatically at a table break, and indeed these were often omitted when tables were made-up by the hand compositor" [Arthur Phillips, Tabular Composition, Seybold Report, August 1979, v8, n23, pg. 23-11] [referring to the introduction of continued lines at points where table columns are broken]
trivial for spreadsheet programs with great understanding of the table structure
simple row entries
financial break lines have penalty function similar to page breaking
split tables may not split within a column or row
perhaps column or row is repeated for readability
columns may be folded to make them fit.
Boxheads (set of column headings) or Stubs (set of row headings) may need to be repeated if table is continued. Continued headings (add the text "continued") may be necessary for such tables.
scrolling horizontally or vertically when displaying them on screens
Style rules on tables
many parameters supplied by compositor
table entries often broken by compositor
English is a lousy programming language [Hill]
publishers have never had to be specific when dealing with people in the graphic arts; now they do, they find it frustrating to provide all that detail, especially when some of their favourite choices are not provided
design guidelines on visual highlights
e.g. rules or extra vertical space every 5 rows, separating digits in math tables
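A guideline such as "a rule or extra space every 5 rows" is simple enough to state as a procedure. The sketch below is only an illustration of that guideline; the function name and the representation of a finder as a placeholder row are assumptions.

```python
def add_finders(rows, every=5, finder=""):
    """Return rows with a finder (rule or blank line) after every `every` rows.

    No finder is added after the final group, so the table does not end
    with a stray rule or blank line.
    """
    result = []
    for index, row in enumerate(rows):
        if index and index % every == 0:
            result.append(finder)
        result.append(row)
    return result


if __name__ == "__main__":
    for line in add_finders([f"row {n}" for n in range(1, 13)]):
        print(line)
```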
--
4.4 Previous approaches
¶4.4.1 The Typewriter Tab Stop Model
fixed pitch characters is a key simplification
permits a coarse grid with complete specification of character positions
spreadsheets take advantage of this with fixed position grids
lends itself to measurements as character counts
typewriter tab stops
escapement mechanism
carriage let fly to next stop
starting point for typing, therefore flush left only
TTY and early computer terminals had tab stops every 8 characters
document formatters support extension of the typewriter tab stop
different semantics of tab characters
Runoff, troff: left right center on tab stops
Scribe: left, right, center between tab stops (curious)
--------------------
Grid 4 Rows 5 Columns ByRowThenColumn ColConstraints
ColConstraint 2*gx2 - 1*gx3 - 1*gx1 = 0
ColConstraint 2*gx3 - 1*gx4 - 1*gx1 = 0
ColConstraint 2*gx4 - 1*gx5 - 1*gx3 = 0
Box (0,0) (1,1) TopBaseline FlushLeft 12 bp 24 bp 6 bp 6 bp
Runoff/troff
Box (0,1) (1,2) TopBaseline FlushLeft 0 bp 12 bp 6 bp 6 bp
left
Box (0,2) (1,3) TopBaseline FlushRight 12 bp 0 bp 6 bp 6 bp
right
Box (0,3) (1,5) TopBaseline Center 6 bp 6 bp 6 bp 6 bp
centre
Box (1,1) (2,2) TopBaseline FlushLeft 0 bp 12 bp 0 bp 6 bp
^
Box (1,2) (2,3) TopBaseline FlushRight 12 bp 0 bp 0 bp 6 bp
^
Box (1,3) (2,5) TopBaseline Center 6 bp 6 bp 0 bp 6 bp
^
Box (2,0) (3,1) TopBaseline FlushLeft 12 bp 24 bp 6 bp 6 bp
Scribe
Box (2,1) (3,2) TopBaseline FlushLeft 0 bp 12 bp 6 bp 6 bp
left
Box (2,2) (3,3) TopBaseline FlushRight 12 bp 0 bp 6 bp 6 bp
right
Box (2,3) (3,4) TopBaseline Center 6 bp 6 bp 6 bp 6 bp
centre
Box (3,1) (4,2) TopBaseline FlushLeft 0 bp 12 bp 0 bp 6 bp
^
Box (3,2) (4,3) TopBaseline FlushRight 12 bp 0 bp 0 bp 6 bp
^
Box (3,3) (4,5) TopBaseline Center 6 bp 6 bp 0 bp 6 bp
^
Figure 4-4. TAB STOPS of table entries
--------------------
awkward to attach formatting attributes to tab stops
font, size, indenting
no line folding, since this implies returning to the previous line for subsequent tabbed material!
especially since tab stops do not define a region but only an origin
tab stops provide only limited functionality and few typographic features
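The two tab stop semantics can be compared in a small fixed-pitch sketch. It assumes columns are counted in characters, reads the Runoff/troff note as "position the text relative to the stop itself", and reads the Scribe note as "centre within the region between two adjacent stops"; both readings and all function names are assumptions made for illustration.

```python
def next_tab_stop(column, interval=8):
    """Next stop strictly to the right of `column`, with stops every `interval`."""
    return (column // interval + 1) * interval


def start_column_on_stop(stop, text, mode):
    """Starting column when text is placed left, right, or centred ON a stop."""
    if mode == "left":
        return stop
    if mode == "right":
        return stop - len(text)
    if mode == "center":
        return stop - len(text) // 2
    raise ValueError(f"unknown mode: {mode}")


def start_column_centred_between(left_stop, right_stop, text):
    """Starting column when text is centred BETWEEN two adjacent stops."""
    return left_stop + (right_stop - left_stop - len(text)) // 2


if __name__ == "__main__":
    print(next_tab_stop(0), next_tab_stop(7), next_tab_stop(8))
    print(start_column_on_stop(16, "centre", "center"))
    print(start_column_centred_between(16, 24, "centre"))
```

Only the second form treats a pair of stops as a region, which is the property the tab stop model otherwise lacks.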
now onto something real
¶4.4.2 tbl
general preprocessor for troff
accepts a table definition language
table topology part
table content part
spanned headings okay, although asymmetric
folded input lines okay although user specifies width
formatting attributes on entries by column
rules around all boxes or explicitly when asked for
vertical and horizontal rules
can set most every layout of table
topology from row/col entry layout
anomaly: col span in topology, row span in data
not symmetric with respect to rows and columns
geometry solver using troff registers
limitations
large tables continued headings
number/string registers 2 chars
no algorithm for folded lines; requires column measure to be supplied
recursion impossible: table within a table, equation — table — equation
¶4.4.3 TEX
macros to provide similar functionality to tbl in LaTEX
chapter 22 of the TeXbook
¶4.4.4 TABLE
[Biggerstaff, et al] front end to tbl
object-oriented representation
study editing complex structures, tables are complex structures
layout of idiomatic graphical structures
editing primitives
cursor movements within structure
workspaces - general purpose sharing of data
selection granularity
table to characters, movement in/out granularity
TABLE lacks logical structure of table, only topology
provides no operations on logical structure
perhaps induced by tbl
data structure alternatives
1) tree; asymmetric
2) matrix: symmetric, sizing must analyze all entries in row/column
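The cost noted for the matrix alternative can be seen in a short sketch: sizing a column means visiting every entry in it, and sizing a row means visiting every entry in that row, but the two directions are handled by the same code. The sketch assumes fixed-pitch text entries with no spanned entries; the names are hypothetical.

```python
def cell_size(cell):
    """Return (width, height) of a text cell in characters and lines."""
    lines = cell.split("\n")
    return max(len(line) for line in lines), len(lines)


def column_widths(matrix):
    """Width of each column: the widest entry anywhere in that column."""
    return [max(cell_size(cell)[0] for cell in column) for column in zip(*matrix)]


def row_heights(matrix):
    """Height of each row: the tallest entry anywhere in that row."""
    return [max(cell_size(cell)[1] for cell in row) for row in matrix]


if __name__ == "__main__":
    table = [["a", "bbb"], ["cc", "d\ne"]]
    print(column_widths(table), row_heights(table))
```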
--
4.5 Samples of tables formatted
the perfect table example (or series of examples)
tbl
include the standard tbl manual examples
TEX, Scribe, Star, Janus
Government printing office style manual
Simultaneous translation as a table
[Tabular Presentations, pg 502]
two-dimensional array of percentages
Grid 7 Rows 7 Columns ByRowThenColumn GridOverlay 0.5 bp 0.0 0.0 0.5 EmptyNodeTemplate RowConstraints ColConstraints
empty
RowConstraint 2.0*gy2 - 1.0*gy1 - 1.0*gy3 = 0
RowConstraint 2.0*gy4 - 1.0*gy3 - 1.0*gy5 = 0
RowConstraint 2.0*gy6 - 1.0*gy5 - 1.0*gy7 = 0
ColConstraint 2.0*gx2 - 1.0*gx1 - 1.0*gx3 = 0
ColConstraint 2.0*gx4 - 1.0*gx3 - 1.0*gx5 = 0
ColConstraint 2.0*gx6 - 1.0*gx5 - 1.0*gx7 = 0
Box (0,0) (1,1) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xxxxx
Box (0,1) (1,3) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xxxxx
Box (0,3) (1,5) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xxxxx
Box (0,5) (1,7) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xxxxx
Box (1,0) (2,1) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xxxxx
Box (1,1) (2,2) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (1,3) (2,4) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (1,5) (2,6) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (2,2) (3,3) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx%)
Box (2,4) (3,5) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx%)
Box (2,6) (3,7) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx%)
Box (3,0) (4,1) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xxxxx
Box (3,1) (4,2) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (3,3) (4,4) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (3,5) (4,6) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (4,2) (5,3) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx%)
Box (4,4) (5,5) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx%)
Box (4,6) (5,7) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx%)
Box (5,0) (6,1) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xxxxx
Box (5,1) (6,2) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (5,3) (6,4) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (5,5) (6,6) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (6,2) (7,3) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx%)
Box (6,4) (7,5) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx%)
Box (6,6) (7,7) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx%)
[Tabular Presentations, pg 505]
spanned table headings
Grid 6 Rows 7 Columns ByRowThenColumn GridOverlay 0.5 bp 0.0 0.0 0.5 EmptyNodeTemplate RowConstraints ColConstraints
empty
RowConstraint 2.0*gy3 - 1.0*gy2 - 1.0*gy4 = 0
RowConstraint 2.0*gy5 - 1.0*gy4 - 1.0*gy6 = 0
ColConstraint 2.0*gx4 - 1.0*gx3 - 1.0*gx5 = 0
ColConstraint 2.0*gx6 - 1.0*gx5 - 1.0*gx7 = 0
Box (0,3) (1,7) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (1,3) (2,5) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (1,5) (2,7) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (2,0) (6,1) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (2,1) (4,2) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (4,1) (6,2) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (2,2) (3,3) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (2,3) (3,4) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx%
Box (2,4) (3,5) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx)
Box (2,5) (3,6) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx%
Box (2,6) (3,7) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx)
Box (3,2) (4,3) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (3,3) (4,4) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx%
Box (3,4) (4,5) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx)
Box (3,5) (4,6) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx%
Box (3,6) (4,7) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx)
Box (4,2) (5,3) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (4,3) (5,4) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx%
Box (4,4) (5,5) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx)
Box (4,5) (5,6) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx%
Box (4,6) (5,7) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx)
Box (5,2) (6,3) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx
Box (5,3) (6,4) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx%
Box (5,4) (6,5) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx)
Box (5,5) (6,6) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
xx%
Box (5,6) (6,7) Center Center 3.0 bp 6.0 bp 3.0 bp 3.0 bp
(xx)
--
¶5 New Framework for Tabular Composition by Computer
5 NEW FRAMEWORK FOR TABULAR COMPOSITION BY COMPUTER
CONTENTS
5.1 The Interactive Table Formatting Problem
¶5.1.1 What we need to do
¶5.1.2 How are we going to do it
5.2 Table Document Structure
5.3 Implementation data structure
5.4 Constraints
5.5 Alignment & breaking algorithms
¶5.5.1 Complexity of table layout algorithms
¶5.5.2 Storage costs of these algorithms
5.6 Some Subproblems
--
¶5.1 The Interactive Table Formatting Problem
This chapter proposes a new framework for formatting tables that is suitable for interactive editors and displayers. The framework integrates an object-oriented approach to document content by extending the document structure model used in the TiogaArtwork prototype in Chapter 3. The new framework supports a wide range of typographic requirements for typesetting tables through extensions to the document style mechanisms. This framework specifies table layouts through the use of grid designs familiar in the graphic arts and through the use of a constraint solver that determines the precise placement of table entries. The framework supports interaction by enabling a WYSIWYG display, pointing selections to structured components within a table, and direct manipulation of the table content and arrangement.
¶5.1.1 What we need to do
To support interactive table formatting, one needs to provide mechanisms for WYSIWYG display of the table, for direct manipulation of the table content and structure, and for pointing at and selecting components of the table, together with efficient algorithms for all of these facilities. Tables may be represented in an electronic document by a specified arrangement of table entries. Each table entry can be an arbitrary document object. Typographic rules and decorations on tables generally are placed between table boxes and can be handled by a mechanism that models the placement and the intersections of boundaries between entries.
Tables are rectangular arrangements of information that possess a symmetrical structure with respect to rows and columns. Operating on a table may require dealing with a row or a column at various times. Thus operations should be symmetrical for rows and columns, making it equally easy to select a row or a column and to perform a movement or formatting operation. Selection hierarchies should work equally well for groups of rows as for groups of columns. Transposing the rows and columns of a table should be possible.
Style attributes for formatting tables can be provided in an analogous manner to graphical style, but the additional nonhierarchical structure in tables poses some difficulties. Style attributes may be applied to the entire table, to table entries in a single row, single column, or spanned row or column, or to the individual table entry. A method is needed to determine the appropriate style attributes for each table entry when it is formatted.
The table layout mechanism needs to handle general table arrangements and the alignment of table entries within a table design. Spanning a column heading across several columns is a special case of the general problem of aligning a set of entries with another set of entries. The new framework for table formatting must be capable of expressing these arrangements and of determining the positions of the table boxes in the arrangement.
Another potential generalization of the table formatting problem is to provide for overlapping layers of information. Background tints, such as the coloured tints used in tables published in recent issues of the Communications of the ACM, are one class of layer information. The layering of background tints, scanned images, and illustrations within a table format may be possible through the table formatting framework.
¶5.1.2 How are we going to do it
The new table framework has three parts: document structure extensions, grid layout specifications, and constraint satisfaction.
The document structure and style machinery for tables is an extension of the model proposed in Chapter 3. Each table entry is a document object. Recall that objects contain anything which can format itself and that objects are represented by a set of dimensions and two procedures for laying out and rendering the object. The style attributes may be collected into formatting rules and dictionaries and attached as properties to the table entries. Since a table does not have the same simple hierarchy as the tree-structured document structure, a more elaborate binding algorithm is needed to associate style attributes with the table entry.
The table arrangement, or table topology, can be expressed on a grid. Each grid line either surrounds the table or is the boundary between table entries. Rules and decorations can be superimposed on the grid boundaries which may imply that the grid lines have a nonzero width equal to the widest rule on that grid line. The arrangement of table entries will be expressed with respect to the coordinates of the grid system for the table. Operations that change the table topology will have to make corresponding changes in the grid system.
The table layout, or table geometry, can be computed from the table arrangement and the sizes of all of the table entries. In this new framework, a linear inequality constraint solver is used to provide a general mechanism for computing the table geometry. Grid lines are given symbolic names and table entries have dimensions. These are combined to express the table arrangement as a set of linear inequalities. The constraint solver then can solve the resulting system of constraints to produce the coordinates of the grid line positions. Interactive use of the table formatter demands that changes to the table be handled efficiently and in an incremental fashion, so the constraint solver used in this scheme is an incremental one.
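The way entry sizes turn into inequalities on grid lines can be sketched for one direction. The sketch assumes each entry spanning grid lines i to j with extent w contributes the inequality g[j] - g[i] >= w, fixes the first grid line at zero, and finds the smallest positions by a single left-to-right pass. It is not the incremental solver described above, and it does not handle equality constraints such as the ColConstraint lines in the figures; the function name and data layout are hypothetical.

```python
def solve_grid_lines(line_count, spans):
    """Smallest grid line positions satisfying g[last] - g[first] >= extent.

    `spans` is a list of (first_line, last_line, extent) with first < last.
    Because every constraint points from a lower line to a higher one, one
    pass in increasing line order is enough.
    """
    positions = [0] * line_count
    for line in range(1, line_count):
        # Grid lines never cross, so start from the previous line.
        positions[line] = positions[line - 1]
        for first, last, extent in spans:
            if last == line:
                positions[line] = max(positions[line], positions[first] + extent)
    return positions


if __name__ == "__main__":
    # Three columns; the last span is a heading across columns 2 and 3.
    spans = [(0, 1, 30), (1, 2, 20), (2, 3, 20), (1, 3, 60)]
    print(solve_grid_lines(4, spans))  # [0, 30, 50, 90]
```

In this example the spanned heading forces the last grid line out to 90, so all of the extra space lands in the final column; spreading that whitespace evenly across the spanned columns would need additional constraints.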
--
5.2 Table Document Structure
¶5.1.3 Table document structure
note that a table is a two-dimensional arrangement of information
must make it fit into the Tioga tree-structured document (with parent child relationships)
how to do this?
rows and columns are obvious place to start
choose one as dominant
e.g. Star, tbl both use rows as dominant, therefore columns on the same input line
what problems do we get into here?
selections across the non dominant direction are a pain
asymmetric treatment of the table structure
operations specified differently for spanning rows than columns
and we should avoid this
perhaps 2 independent hierarchies for rows versus columns
e.g. CS740 project
results in complex implementation issues, data structures
crucial point is that some table designs are not hierarchies
need a more general underlying structure
grid/lattice of boundaries
handle arbitrary complexity of lattice arrangements on one plane
layer table to provide multiple planes
can still use same boundaries, important observation
style binding for nonhierarchical tables
define search rules: table, row/column, entry
disambiguate row/column
templates for tables
style determines how table is formatted
templates provide a sample or predetermined table design
================== Section really starts here =================
recall the document structure extensions of Tioga for the TiogaArtwork experiment outlined in Chapter 3
it was a tree structured hierarchy
there were classes of node content: text, added illustration nodes
content was recursive: text document with illustration with text labels
now add a new class of document object: table
still recursive content: tables within illustrations within tables
permits general and extensible treatment of content within doc and table
Table Entry Representation
table object has dimensions and two procedures: layout and render
layout determines the table geometry from the table topology
solves the layout constraints
render can quickly display the table
device independent rendering produces displayed and printed tables
suitable for interactive WYSIWYG display when window moves/scrolls
no changes, no need to recompute layout parameters (win!)
define table entry to be a box with 4 dimensions: left, right, up, down
--------------------
///Beach/Thesis/ThesisFigure5-1.Press
leftMargin: 2.1 in, topMargin: 1.7 in, width: 1.5 in, height: 66 pt
Figure 5-1. A TABLE BOX is represented by four dimensions that are the left, right, up, and down distances from an alignment point.
--------------------
specification of the alignment point will involve the contents inside the table entry box
textual contents may be single or multiple lines
nontextual contents may be easier to determine alignment
centre: divide dimensions by two
flush left, right, top, bottom: set one dimension to zero
align on character within box
decimal points, multiplication signs
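The four-dimension box and the alignment rules above can be written down directly. This is a sketch under the assumption that each direction is handled independently: an overall extent is split into the two distances on either side of the alignment point. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TableBox:
    """Distances from the alignment point to each edge of the box."""
    left: float
    right: float
    up: float
    down: float

    @property
    def width(self):
        return self.left + self.right

    @property
    def height(self):
        return self.up + self.down


def split_extent(extent, alignment, offset=None):
    """Split an extent into (before, after) distances around the alignment point.

    "flush_start" puts the point on the left or top edge, "flush_end" on the
    right or bottom edge, "center" halves the extent, and "char" uses the
    measured offset of the alignment character (for example a decimal point).
    """
    if alignment == "flush_start":
        return 0.0, float(extent)
    if alignment == "flush_end":
        return float(extent), 0.0
    if alignment == "center":
        return extent / 2, extent / 2
    if alignment == "char":
        if offset is None:
            raise ValueError("char alignment needs the character offset")
        return float(offset), float(extent - offset)
    raise ValueError(f"unknown alignment: {alignment}")


if __name__ == "__main__":
    left, right = split_extent(40, "char", offset=10)
    up, down = split_extent(12, "center")
    print(TableBox(left, right, up, down))
```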
--------------------
Grid 2 Rows 5 Columns ByRowThenColumn
Figure 5-2. OPTICAL CENTRING of lines within a table may not produce aesthetic results when table entries have several lines of text. All of the entries in the top row are optically centred; all of the entries in the bottom row have been centred on a selected baseline.
--------------------
multiple lines of text are okay as objects
line length might be supplied as a formatting attribute as a hint
sometimes need to get inside the box for certain alignments
optical centering is not always pleasing when unequal number of lines
--------------------
Grid 3 Rows 2 Columns ByRowThenColumn
Rule (0,0) (0,2) 2 bp
Rule (1,0) (1,2) 2 bp
Rule (2,0) (2,2) 2 bp
Rule (3,0) (3,2) 2 bp
Rule (0,0) (3,0) 2 bp
Rule (0,1) (3,1) 2 bp
Rule (0,2) (3,2) 2 bp
Box (0,0) (1,1) Center CharAlign '. 3.0 bp 6.0 bp 3.0 bp 3.0 bp
0
Box (1,0) (2,1) Center CharAlign '. 3.0 bp 6.0 bp 3.0 bp 3.0 bp
.625
Box (2,0) (3,1) Center CharAlign '. 3.0 bp 6.0 bp 3.0 bp 3.0 bp
1023.5
Box (0,1) (1,2) Center CharAlign 'X 3.0 bp 6.0 bp 3.0 bp 3.0 bp
speed time
Box (1,1) (2,2) Center CharAlign 'X 3.0 bp 6.0 bp 3.0 bp 3.0 bp
acceleration time
Box (2,1) (3,2) Center CharAlign 'X 3.0 bp 6.0 bp 3.0 bp 3.0 bp
force distance
Figure 5-3. ALIGNMENT within a table entry may be based on specific characters within lines of text.
--------------------
table must represent the arrangement of table entries
arrangement is two dimensional
like math in small and page in large
but unlike paragraph
must make table fit into tree-structured document model
represent objects by boxes
might wish to access baseline within box
topBaseline, bottomBaseline, centerTopBaseline, centerBottomBaseline
decimal or character align within boxes
perhaps ask the box for its origin given these alignment wishes
perhaps assume structure of the box (???)
topology of table is grid with coordinates
provide mapping from boxes to layout coordinates
interaction requires mapping from layout coordinates to boxes
coordinates ok since direct manipulation hides any use of them
no explicit reference necessary to coordinates to create or manipulate table structure
style attributes for table comes from several sources
support for document style extended to tables
table, general table attributes
row/column, attributes defined due to membership in table structure
cell, local attributes
how to disambiguate row/column preference?
table attribute: ByRowThenColumn BOOLEAN
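One possible reading of the attribute sources listed above is a lookup that tries the cell first, then the row and column, then the table. The sketch below assumes that `ByRowThenColumn = TRUE` means the row is consulted before the column; that interpretation, the function name `resolve_attribute`, and the dictionary layout are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

Attrs = Dict[str, object]


def resolve_attribute(
    name: str,
    row: int,
    col: int,
    table_attrs: Attrs,
    row_attrs: Dict[int, Attrs],
    col_attrs: Dict[int, Attrs],
    cell_attrs: Dict[Tuple[int, int], Attrs],
) -> Optional[object]:
    """Return the most specific value for a style attribute, or None."""
    # Cell attributes are local and always win.
    cell = cell_attrs.get((row, col), {})
    if name in cell:
        return cell[name]

    # Row and column attributes come from membership in the table structure.
    first = row_attrs.get(row, {})
    second = col_attrs.get(col, {})
    if not table_attrs.get("ByRowThenColumn", True):
        first, second = second, first  # column takes preference instead

    # Fall back to the general table attributes last.
    for source in (first, second, table_attrs):
        if name in source:
            return source[name]
    return None
```

Because the preference is a single table attribute, the same lookup serves both orientations, which keeps the treatment of rows and columns symmetric.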
edits to document preserve the arrangements
inserting a row or deleting a column
deal with references to the topology for the user
references can be symbolic to reduce the changes necessary in the data structure
symmetric with respect to rows and columns
change the structure and create new representation through a tree-walk
provide selection on rows and column easily
something TABLE struggled with and avoided eventually
TiogaArtwork Graphical Style reference
recursion potential from boxes model for layout
--
5.3 Grid Systems
¶5.3.1 Graphic Arts References to Grid Systems
grids have been around a long time
provide a disciplined approach to design and layout
limit the number of choices among a vast set
provide regularity in chaotic situations
not universally accepted
--------------------
Figure 5-8. GRID DESIGN for the pages of this thesis illustrates the traditional use of boundary lines to determine margins, column measures, gutter widths, alignment points, etc. Draw a Griffin picture with margins and shaded text area, complete two-column lines for index.
--------------------
Hurlburt's book, The Grid
modular design, sense of proportions
orthodox grid: uniformly spaced horizontal and vertical lines to produce a square module
casual typographic grid: identify the margins, column measures, etc.
letter design: construction of letter forms
proportions
golden section: φ, a:b = b:(a+b)
square: 1:1
√2 rectangle: 1:√2, basis for A-size standard paper sizes
modular: Le Corbusier
grids provide options and opportunities
``grid is an aid, not a guarantee''
Müller-Brockmann, Grid Systems in Graphic Design
seems to be the grand old man of grid systems
Williamson, Methods of Book Design
Section 9-10: Vertical position
typical text page offers different vertical levels for alignment:
1) headline, 2) first line of text, 3) chapter number, 4) first line of chapter title, 5) first line of text in a chapter, 6) last text line, 7) footline
illustration often aligned to these levels
Tilbrook's A Newspaper Pagination System
an example of early interactive computer system that used grids
grids called mask lines
interactive manipulation of newspaper grids for page dummying
no constraint satisfaction, and only limited text composition capabilities
¶5.3.2 Tables Described by Grids
grids promote concept of areas within tables
grid lines are the boundaries between table boxes
boxes "stapled" to the grid lines, Figure 5-29
--------------------
Grid 4 Rows 5 Columns ByRowThenColumn GridOverlay 2 bp 0 0 0.5
Rule (0,1) (4,1) 2 bp
Rule (3,0) (3,5) 2 bp
Box (0,1) (1,5) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
XxxXxxxxXxx
Box (1,1) (2,3) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
Xxxx
Box (1,3) (2,5) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
Xxxx
Box (2,1) (3,2) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
xxx
Box (2,2) (3,3) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
xxx
Box (2,3) (3,4) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
xxx
Box (2,4) (3,5) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
xxx
Box (3,0) (4,1) TopBaseline FlushLeft 3.0 bp 3.0 bp 6.0 bp 6.0 bp
XxXxxXxx
Box (3,1) (4,2) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
xxx
Box (3,2) (4,3) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
xxx
Box (3,3) (4,4) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
xxx
Box (3,4) (4,5) TopBaseline Center 3.0 bp 3.0 bp 6.0 bp 6.0 bp
xxx
Figure 5-29. TABLE DESCRIBED BY A GRID where the gray lines are the grid lines. Note that some column heads cross through one or more grid lines.
--------------------
areas capture table structure in a symmetric way for both rows and columns
a box is bounded by 4 grid lines; a row or column by 2 grid lines
grids promote treatment of rules and decorations explicitly and symmetrically
data structure must represent areas, boundaries, and be fast
table topology versus table geometry
can abstract out the actual size of table objects
deal only with the arrangements between table entries
conversion to geometry will be second pass algorithm
map table topology into an area data structure
use corner stitching [Ousterhout, VLSI data structure]
tiles for both table entries and rules
Figure 5-31 shows corner stitched data structure
--------------------
///Beach/Thesis/Figure5-31-CornerStitching.Press
leftMargin: 1.5 in, topMargin: 1.75 in, width: 4.5 in, height: 2.25 in
Figure 5-31. CORNER STITCHED DATA STRUCTURE uses tiles that are joined together by four pointers, two at the NorthEast corner going North and East, and two at the SouthWest corner going South and West.
--------------------
make boundaries explicit areas
intersections become obvious when you do this
linear or sublinear algorithms for most operations
make coordinates integer grid intersections
Figure 5-30 shows areas in grid structure
--------------------
Figure 5-30. TABLE MAPPED ONTO GRID DATA STRUCTURE where the tiles represent table boxes and rule elements for Figure 5-29.
--------------------
possible to enumerate areas by traversing the corner stitches
used by layout and painting algorithms
useful for selecting rows and columns
since these are areas, there is symmetry between rows and columns (yea!!)
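A minimal sketch of corner-stitched tiles and the directed point search, following the description in Figure 5-31. It simplifies the published structure by storing both corners of each tile explicitly, and the names `Tile`, `find_tile`, and `example_tiling` are hypothetical. Coordinates are integer grid intersections with y increasing northward, which is an assumption.

```python
from typing import Optional, Tuple


class Tile:
    """A rectangular area covering [x0, x1) by [y0, y1) in grid coordinates."""

    def __init__(self, name: str, x0: int, y0: int, x1: int, y1: int) -> None:
        self.name = name
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1
        # Two stitches at the NorthEast corner, going North and East.
        self.north: Optional["Tile"] = None
        self.east: Optional["Tile"] = None
        # Two stitches at the SouthWest corner, going South and West.
        self.south: Optional["Tile"] = None
        self.west: Optional["Tile"] = None


def find_tile(start: Tile, x: int, y: int) -> Tile:
    """Walk the stitches from start to the tile that contains (x, y)."""
    tile: Optional[Tile] = start
    while True:
        # Move vertically until the tile's vertical range contains y.
        while tile is not None and y < tile.y0:
            tile = tile.south
        while tile is not None and y >= tile.y1:
            tile = tile.north
        # Move horizontally until the tile's horizontal range contains x.
        while tile is not None and x < tile.x0:
            tile = tile.west
        while tile is not None and x >= tile.x1:
            tile = tile.east
        if tile is None:
            raise ValueError("point lies outside the tiled area")
        # A horizontal move may have left the vertical range; if so, repeat.
        if tile.y0 <= y < tile.y1:
            return tile


def example_tiling() -> Tuple[Tile, Tile, Tile]:
    """One wide tile with two tiles stacked side by side above it."""
    a = Tile("A", 0, 0, 4, 1)
    b = Tile("B", 0, 1, 2, 3)
    c = Tile("C", 2, 1, 4, 3)
    a.north = c               # rightmost tile along A's top edge
    b.east, b.south = c, a    # C is to the right of B, A is below B
    c.west, c.south = b, a    # B is to the left of C, A is below C
    return a, b, c
```

Starting from tile A, a search for the point (1, 2) moves north to C and then west to B, touching only tiles along the way rather than scanning the whole table.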
¶5.3.3 Overlapping Planes
what can be overlapped?
table entries, rules, backgrounds
backgrounds are coloured areas determined by the grid boundaries too
establish overlap priority and render back to front
backgrounds, then rules, then table entries
table entries that overlap can be represented in the data structure using the same grid boundaries; no big deal!
rule intersections provide corner capabilities (rounded, butt, mitred, multiple rule, border patterns and ornaments)
intersections are the result of area overlap; keep a list of intersecting parts and resolve them once it is known which rules intersect
future work to exhaustively implement the capabilities shown in Figure 5-9
lines of different thicknesses
lines of different colours
lines with repeating patterns such as dashes or border patterns
lines with ornaments at the end
rounded corners where lines intersect with sufficient clearance
--------------------
Figure 5-9. INTERSECTING RULES in a table may require several special treatments. Lines of different thicknesses; lines of different colour; lines with repeating patterns (dashes, borders); lines with ornaments at the end; rounded corners where lines intersect with sufficient clearance;
--------------------
¶5.3.4 Grid Algorithms
enumerate area
corner stitching data structure provides enumerate within rectangle
act on each entry enumerated by call-back procedure
resolve algorithm
given layout x,y coordinates, find table box
traverse the grid data structure in a directed search
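As a sketch of the resolve step, the layout coordinates can first be converted to grid coordinates by a binary search over the solved grid line positions. The version below then uses a plain scan over the boxes in place of the directed search through the grid data structure, purely to stay short. The function name `resolve`, the tuple layout (which follows the `Box (row0,col0) (row1,col1)` form of the table descriptions), and the assumption that positions increase with the grid index are all illustrative.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (name, row0, col0, row1, col1) in grid coordinates
GridBox = Tuple[str, int, int, int, int]


def resolve(
    x: float,
    y: float,
    col_pos: List[float],
    row_pos: List[float],
    boxes: List[GridBox],
) -> Optional[str]:
    """Return the name of the box under layout point (x, y), or None."""
    # Index of the last grid line at or before the point, on each axis.
    col = bisect_right(col_pos, x) - 1
    row = bisect_right(row_pos, y) - 1
    if not (0 <= col < len(col_pos) - 1 and 0 <= row < len(row_pos) - 1):
        return None  # the point is outside the table
    # Stand-in for the directed search: scan for the box covering the cell.
    for name, row0, col0, row1, col1 in boxes:
        if row0 <= row < row1 and col0 <= col < col1:
            return name
    return None
```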
layout algorithm
given grid data structure, find the layout x,y coordinates of each box
use linear inequality constraints to express relationships among boxes
solve constraints for grid boundary and box positions
paint algorithm
given layout positions and overlap order, render table boxes
layers of information supported by ordering the rendering
1) background
2) rules, decorations
provides general treatment of rules, corners, braces, decorations
3) boxes of information
device independent rendering gives WYSIWYG display and printing
minimal repainting extension possible
by keeping state about the changes
traversing the data structure to find boxes that changed position
moving boxes possible by blitting them around
change in content requires repainting box
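The layer ordering of the paint algorithm can be sketched as a stable sort on a layer rank followed by a call-back for each item. The names `LAYER_ORDER`, `paint_order`, and `paint`, and the `(kind, payload)` item shape, are hypothetical.

```python
from typing import Callable, List, Tuple

Item = Tuple[str, str]  # (kind, payload), kind is a key of LAYER_ORDER

# Back-to-front priority: backgrounds, then rules, then table entries.
LAYER_ORDER = {"background": 0, "rule": 1, "box": 2}


def paint_order(items: List[Item]) -> List[Item]:
    """Sort items back to front.

    sorted() is stable, so items in the same layer keep their input order.
    """
    return sorted(items, key=lambda item: LAYER_ORDER[item[0]])


def paint(items: List[Item], draw: Callable[[str, str], None]) -> None:
    """Render every item through the device-independent draw call-back."""
    for kind, payload in paint_order(items):
        draw(kind, payload)
```

The same `paint` routine serves display and printing if only the `draw` call-back differs between devices.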
delete algorithm
given grid data structure and two grid lines, delete all the table entries between the grid lines
enumerate all table entries below top grid line of the two given lines
if box within two grid lines, then delete box
else if box is attached to the bottom grid line, then bottom grid becomes top
else modify the box grid coordinates of both top and bottom
fix up the specific constraints
fix up the row/column specific styles
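A sketch of the delete step along one axis. It folds the three cases above into a single remapping of grid line indices, which is a simplification chosen for this example rather than the prototype's method. Grid lines are assumed to be numbered with increasing integers, each entry is a hypothetical `(name, lo, hi)` span, and the constraint and style fix-ups are left out.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (entry name, first grid line, last grid line)


def delete_between(spans: List[Span], g0: int, g1: int) -> List[Span]:
    """Delete everything between grid lines g0 and g1 (g0 < g1)."""
    if not g0 < g1:
        raise ValueError("g0 must come before g1")
    removed = g1 - g0

    def remap(line: int) -> int:
        if line <= g0:
            return line            # before the deleted band: unchanged
        if line < g1:
            return g0              # inside the band: collapses onto g0
        return line - removed      # at or after g1: shifts back

    result = []
    for name, lo, hi in spans:
        new_lo, new_hi = remap(lo), remap(hi)
        if new_lo < new_hi:        # entries wholly inside the band vanish
            result.append((name, new_lo, new_hi))
    return result
```

An entry that spans the deleted band keeps its outer grid lines and simply becomes shorter, so spanned headings survive the deletion of one of the rows or columns beneath them.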
insert algorithm
given grid data structure and a grid line, create a new grid line before/after the given grid line
enumerate all table entries below the given grid line
if box top grid is the given grid line, then make box top the inserted grid
else if box grid is beyond the given grid line, then modify the box grid coordinate
fix up the specific constraints
fix up the row/column specific styles
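The insert step can be sketched the same way along one axis. The example assumes the new grid line is created after the given line `g` and takes the index `g + 1`, so that an empty band opens between `g` and the new line; the `(name, lo, hi)` span layout and the function name `insert_after` are hypothetical, and the constraint and style fix-ups are omitted.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (entry name, first grid line, last grid line)


def insert_after(spans: List[Span], g: int) -> List[Span]:
    """Open an empty band after grid line g by inserting a new grid line."""
    result = []
    for name, lo, hi in spans:
        # An entry starting on g now starts on the inserted line;
        # entries starting beyond g shift along with their grid lines.
        new_lo = lo + 1 if lo >= g else lo
        # Any far edge beyond g shifts, so entries spanning g grow by one band.
        new_hi = hi + 1 if hi > g else hi
        result.append((name, new_lo, new_hi))
    return result
```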
how to enumerate data structure to guarantee the delete/insert algorithm works in the correct order?
use CornerStitching snowplow operation
what is the time complexity of these algorithms?
--
5.4 Constraints
Constraint Systems
Borning's Thinglab examples
document layout, p 29
document content
illustration shape
Greg Nelson's constraint solver from his thesis [Nelson, Program Verification] provides the engine for this prototype
JUNO illustrator, uses nonlinear constraints
what are we going to use constraints for?
specify layout of table rows, columns, and entries
capture the arrangement of table structure
express the relationships that must be preserved
provide a means to determine the layout
provides extensible means to customize designs
for example equal widths or making something symmetric
we will use linear inequality constraints
all the distance relationships are linear
inequalities permit centering and leaving sufficient room for large elements
slack variables useful for interaction by providing bound on changes without expensive recomputations
that is, we don't have to solve again until slack is all used up
interactive user interface can make use of the incremental property
mark current state of constraint tableau
make some provisional changes
if infeasible, then back out these changes to resume previously saved state
leave this for user interface exploration in the future
the general problem
given the table arrangement (of what is next to what)
given the formatting attributes, especially the alignment choices
find the positions of each box and grid boundaries
boxes within a row or column align on a `center' line
assume independence of horizontal and vertical constraints
implies that horizontal and vertical constraints can be solved separately, thereby reducing the size of the tableau
not true when balancing white space (perhaps described later in the thesis)
linear inequalities needed
slack for various sizes of boxes and spanned headings
illustration showing slack possibilities, set of decimal aligned boxes, set of spanned column headings over short and long column entries
constraint equations
diagram the various possibilities and identify the equations for each
alignment within grid lines: horizontal and vertical
position within grid lines: center
equal size columns or rows as additional constraints
threads between boxes
currently kept outside the boxes in a separate list of constraints to add
ordering columns or rows to force a solution
(I think this is necessary but I haven't thought through when yet)
fine points on box alignment
boxes within grid lines
center, flush right, flush left
but also need to calculate the origin within box dimensions
center, flush left, flush right
decimal align, character align (e.g. multiply sign)
extensible constraints
permitting new labels to be created for adding those additional constraints
for example, make this column width equal to twice that column
or align this box with this column boundary under a spanned heading
the algorithm
step 1: determine the dimensions of each box and cache them
they are used for both rows and columns
step 2: determine the row positions
step 2a: establish the system of vertical constraints
step 2b: solve the constraint system
step 3: determine the column positions
step 3a: establish the system of horizontal constraints
step 3b: solve the constraint system
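The three steps can be sketched with a much simpler solver than the one the outline relies on. The example below replaces the linear inequality solver with longest-path propagation over the grid lines, which handles only "at least this far apart" constraints and yields the most compact layout; it is a stand-in for illustration. `Entry`, `solve_axis`, and `layout` are hypothetical names, and horizontal and vertical constraints are assumed to be independent.

```python
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    """A table entry with grid coordinates and cached dimensions (step 1)."""
    row0: int
    col0: int
    row1: int
    col1: int
    width: float
    height: float


def solve_axis(n_lines: int, spans: List[Tuple[int, int, float]]) -> List[float]:
    """Place grid lines so that pos[hi] - pos[lo] >= size for every span.

    Every span has lo < hi, so visiting the lines in increasing order
    is enough to propagate the longest path from line 0.
    """
    pos = [0.0] * n_lines
    for line in range(1, n_lines):
        pos[line] = pos[line - 1]  # grid lines never move backwards
        for lo, hi, size in spans:
            if hi == line:
                pos[line] = max(pos[line], pos[lo] + size)
    return pos


def layout(entries: List[Entry], n_row_lines: int, n_col_lines: int):
    """Steps 2 and 3: solve the row and column positions separately."""
    rows = solve_axis(n_row_lines, [(e.row0, e.row1, e.height) for e in entries])
    cols = solve_axis(n_col_lines, [(e.col0, e.col1, e.width) for e in entries])
    return rows, cols
```

With a spanned heading 50 units wide over two columns whose entries are 10 and 20 units wide, the column lines land at 0, 10, and 50: the heading, not the entries, fixes the position of the last line, and the second column carries the slack.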
handling large tables
constraint tableau is (rows)x(columns)
observe structure of the constraint equations:
grid1 >= grid0 + box.width; for each box between grid0 and grid1
search for maximum constant value and replace the set of equations with:
grid1 >= grid0 + MAX[box.width for each box]
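The reduction above amounts to keeping one constraint per pair of grid lines. In this sketch each constraint is a hypothetical triple `(grid0, grid1, width)`, read as "the box between grid0 and grid1 needs this much room"; the function name `collapse` is likewise an assumption.

```python
from typing import Dict, List, Tuple

Constraint = Tuple[int, int, float]  # (grid0, grid1, box width)


def collapse(constraints: List[Constraint]) -> List[Constraint]:
    """Keep only the largest width for each pair of grid lines."""
    widest: Dict[Tuple[int, int], float] = {}
    for grid0, grid1, width in constraints:
        key = (grid0, grid1)
        if key not in widest or width > widest[key]:
            widest[key] = width
    return [(g0, g1, w) for (g0, g1), w in sorted(widest.items())]
```

For a table with many rows and few distinct column spans, this shrinks the number of horizontal constraints from one per entry to one per distinct pair of grid lines.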
constraint tableau is banded, since constraints generally affected locally
maybe win by solving equations with sparse matrix techniques
maybe win by solving for some variables first
cleavages are places to break tables
where a grid line does not cross a table entry
for example a row or a column grid line
notion of kerfs from Plass or Knuth
kerf is how to get repeated boxheads or stubs
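Finding the cleavages along one axis is a direct test of the definition: an interior grid line qualifies when no table entry crosses it. The sketch below assumes entries are given as hypothetical `(lo, hi)` pairs of grid line indices; the function name `cleavages` is also an assumption.

```python
from typing import List, Tuple


def cleavages(n_lines: int, spans: List[Tuple[int, int]]) -> List[int]:
    """Return the interior grid lines that no entry crosses."""
    return [
        line
        for line in range(1, n_lines - 1)
        if not any(lo < line < hi for lo, hi in spans)
    ]
```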
--
¶5.5 Complexity of Table Layout Algorithms
if topology of table not given, then finding the optimal layout is NP-complete
similarity to bin packing algorithms [Garey & Johnson]
Random Pack
2k pictures, rectangular, width less than 1/2 page width, heights random
arrange in two balanced columns, each of length k
Random Pack is in NP because the arrangement can be checked in linear time by summing the heights in the two columns and comparing
reduce the Partition problem, known to be NP-complete, to the Random Pack problem
Partition set of 2k integers (heights of the pictures) into two sets S1 and S2 such that Σ height(s1) = Σ height(s2), summing over s1 in S1 and s2 in S2
weak NP-complete result
if bound on size of pictures, then pseudo-polynomial time algorithm
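The pseudo-polynomial algorithm mentioned above can be illustrated with the standard subset-sum dynamic programme for plain Partition. This sketch assumes positive integer heights and ignores any requirement on how many pictures go in each column; the function name `can_partition` is hypothetical.

```python
from typing import List


def can_partition(heights: List[int]) -> bool:
    """Decide whether the heights split into two sets of equal total.

    Runs in time proportional to len(heights) * sum(heights), which is
    polynomial only when the heights themselves are bounded.
    """
    total = sum(heights)
    if total % 2:
        return False
    target = total // 2
    reachable = {0}  # column heights achievable with the pictures seen so far
    for h in heights:
        reachable |= {s + h for s in reachable if s + h <= target}
    return target in reachable
```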
Stub Pack
n pictures, rectangular
place a horizontal rule underneath a picture
once a rule is placed, it extends all the way to the right
greedy algorithm equivalent to sorting suffices
Lattice Pack
given topology as lattice of rectangular pictures
arrangement in rows and columns
linear algorithm verifies the arrangement
Constraint Solver
linear programming problem, uses the Simplex method
known to be exponential worst case, but n^3 average case
linear time algorithms known but no implementations available
simplified problem when number of variables restricted [GNelson's thesis comments in his section 12]
grids and constraints like my stuff
¶5.5.1 Storage costs of these algorithms
How to count table entries? r × c versus n = √n × √n
Formatting attributes indirected by style name
--
5.6 Some Subproblems
folding large width of columns to fit page measure
breaking long tables
breaking wide tables
conjecture:
2-dimensional conservation property of text
TeX experiment
conjecture:
statistical balancing of white space
column should be within 1 standard deviation of width of boxes
entries outside 1 standard deviation should be folded
apply this test when determining the grid constraints
when maxing the box dimensions that apply to a grid boundary
becomes max of all boxes within a column
perhaps a few more constraints generated for entries that are in different columns but share grid boundaries
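The conjectured test can be sketched as follows. It reads "outside 1 standard deviation" as "wider than the mean plus one standard deviation", since narrower entries never need folding; that reading, the use of the population standard deviation, and the names `fold_limit`, `entries_to_fold`, and `column_width` are all assumptions.

```python
from statistics import mean, pstdev
from typing import List


def fold_limit(widths: List[float]) -> float:
    """Widest entry the column accommodates without folding."""
    return mean(widths) + pstdev(widths)


def entries_to_fold(widths: List[float]) -> List[int]:
    """Indices of the entries that are too wide and should be folded."""
    limit = fold_limit(widths)
    return [i for i, w in enumerate(widths) if w > limit]


def column_width(widths: List[float]) -> float:
    """Maximum over the entries that are kept unfolded.

    The narrowest entry is never above the limit, so the list is not empty.
    """
    limit = fold_limit(widths)
    return max(w for w in widths if w <= limit)
```

For widths of 10, 12, 11, 13, and 40, only the last entry exceeds the limit, so the column is set from the remaining four and the long entry is folded.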
--
¶6 Future Directions
6 FUTURE DIRECTIONS
¶6.1 Extensions of Table Formatting Algorithms
Extend tables to math notation (small) and pages (large)
Several algorithm problems
¶6.2 User Interface Issues
Interaction and user interface issues
Newswhole paradigms [Tilbrook]
Summary of points in Phillip's article:
— specifying nil entry contents (editing operation)
— interactive skip between column entries (table property: enumerate entries by row or column major order)
¶6.3 Document Structure Issues
Objects (pictures) that size themselves
replicating scan lines, programmed variations of synthetic images
control over scaling, aspect ratio, dot size, maximums of these values
--
¶7 Glossary
7 GLOSSARY
see Glossary.Tioga
--
¶8 References
8 REFERENCES
annotated bibliography, if I can get my references out of Squirrel in a formatted fashion
--
¶9 Index
9 INDEX
don't forget this . . . perhaps I shall have to revive my IndexTool to do the indexing job properly
--
¶10 Deliverables
10 DELIVERABLES
Typeset
formatter, macro package, math package, table macros
provided the basis for an operating company to typeset scholarly books and journal articles
I could list a bibliography of such typeset materials to demonstrate the range of complexity:
1977 ICCH-3 proceedings: 7 week wonder going from manuscripts to case bound book in advance of the conference; foreign languages (French and Greek), mathematics, statistics tables
1978 WATFIV-S Introduction to Computing: computer programs included from machine readable files
1979 Sparse Matrix book: matrix algebra notation
1981 Techn
TiogaArtwork
prototype demonstrated in the 1982 videotape "Graphical Styles"
prototype functionality:
convert Griffin illustrations to TiogaArtwork text-based format
define graphical extensions to the Tioga style machinery
required changes to the Tioga formatter as well as extensions to the style attributes supported by Tioga
extended the typesetting software to accommodate client-supplied classes of document content
including TiogaArtwork, scanned images, preformatted printer image files (Press)
this class concept permits recursive inclusion of illustrations with text that might contain other illustrations, ad nauseam, in the Tioga document structure
provide rendering software that produced images from the Griffin geometry according to the graphical style parameters
lines, areas, colors, text, shadows
borders, defined by parametric mapping prototype software, were not integrated with the TiogaArtwork prototype
but they explored the concept of defining more complex line styles
used to prepare the illustrations for the 1983 SIGGRAPH paper "Graphical Style — Towards High Quality Book Illustrations"
concept to be integrated with current revisions to the Tioga formatter used in Cedar
style machinery will be reworked to provide better user interface to define new styles and families of styles
TableTool
prototype used to prepare all the tables in this thesis
prototype functionality:
accepts artwork class for formatting tables
defines a text-based table description
provides an explicit grid topology for the table entries
accepts constraint equations on the placement of grid lines
accepts any class of document content supported by Tioga (text, TiogaArtwork potentially, scanned images, other tables)
provides for rules and other "along the grid line" decorations
provides for tinted backgrounds
a prototype user interface has been designed (but it is currently broken)
concept will be integrated with current revisions to the Tioga formatter used in Cedar for passive display of tables within Tioga documents
interactive user interface will be developed to permit editing of table structures, styles, and layout
Videotape Demonstrations
TiogaArtwork in the existing Graphical Styles 15 minute videotape
TableTool in a yet-to-be-created tape that demonstrates the
--