Scripted Documents
(*DRAFT*)
Polle T. Zellweger
Computer Science Laboratory
Xerox Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, CA 94304
415/494-4426
Abstract
We have developed a novel mechanism that we call a script, which provides a way to layer additional structure on top of an electronic document or set of documents. A script is a directed path through a document or set of documents that need not follow their linear order. The scripting mechanism is all the more powerful when combined with other capabilities of online multi-media documents, for they can incorporate sound, images, and actions.
Because scripts are built on top of a conventional document (file) system, they give much of the feel of a hypermedia system [vanDam85]. Branching scripts are like cross-referenced documents with multiple links, except that the choice among links can be made by a program rather than by the user. In automatic mode, scripts certainly offer more than hypermedia; even in single-step mode, they seem to differ in style. One difference is that a script can visit the same location multiple times with different actions.
Introduction
Suppose that you are writing a technical manual or some other document with a complex internal structure. Using traditional methods, you would be forced to linearize your ideas to fit into a single front-to-back, top-to-bottom sequence of words and pages. Additional structure could be expressed only as a textual list of cross-references from one section to other related sections. In addition, this single linear sequence must serve all of the document's readers. To combat this, books and manuals sometimes contain reading instructions in the preface, explaining the order in which different types of readers should peruse the material. For example, they might instruct readers who are unfamiliar with the subject matter to read the appendix first and to skip the second chapter and skim the third and fourth chapters on the first reading. More knowledgeable readers might be instructed to read the document straight through.
We have developed a mechanism that we call a script, which provides a way to layer additional structure on top of an electronic document or set of documents. A script is a directed path through a document or set of documents that need not follow their linear order. Each entry in a script consists of a document location, which is a contiguous sequence of characters, together with an associated action and an associated timing. A sample action might consist of playing back a previously-recorded voice annotation, sending some text to a text-to-speech synthesizer, opening a new window, or running a program, which might animate a picture or retrieve items from a database. A single document can have multiple scripts traversing it for different purposes, and a single script can traverse multiple documents to organize collections of information.
A script can be played back as a whole, in which case the cursor moves to the first location (l1) in a document and performs its associated action (a1). The document scrolls to display that location if the location does not currently appear on the screen, and the location is highlighted to call attention to it. After the associated time (t1), the cursor moves to the location specified in the next script entry (l2), performs its action, and so on. The same location in a document can appear at multiple points in the script, with the same or different associated actions and timing.
Another way to play a script is more user-directed. In this case, the timing information is ignored, and the script reader proceeds from one entry to the next at his or her own pace.
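The two playback modes can be sketched in Python as follows. This is only an illustration, not the Cedar implementation: the names ScriptEntry, play_script, timed_pause, and stepped_pause are hypothetical, and the `show` callback stands in for whatever scrolls to and highlights a location.

```python
import time
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ScriptEntry:
    filename: str                 # document that contains the location
    start: int                    # index of the first character of the location
    end: int                      # index of the last character of the location
    action: Callable[[], None]    # action performed at the location
    delay: float = 0.0            # associated timing, in seconds (an assumed unit)

def play_script(entries: List[ScriptEntry],
                show: Callable[[ScriptEntry], None],
                pause: Callable[[ScriptEntry], None]) -> None:
    """Visit each entry in order: show its location, perform its action, then pause."""
    for entry in entries:
        show(entry)       # scroll to the location and highlight it
        entry.action()    # perform the associated action
        pause(entry)      # decide when to move on to the next entry

def timed_pause(entry: ScriptEntry) -> None:
    """Automatic playback: wait for the entry's associated time."""
    time.sleep(entry.delay)

def stepped_pause(entry: ScriptEntry) -> None:
    """User-directed playback: ignore the timing and wait for the reader."""
    input("Press Enter for the next entry: ")
```

Passing timed_pause gives playback of the script as a whole, while passing stepped_pause lets the script reader set the pace; the entries themselves are the same in both cases.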
Arbitrary actions at a scripted location allow scripted documents to perform a wide variety of tasks: demonstrations, tutorials, etc. Parameterized actions allow a script to be personalized ("Hi <username>") or to more accurately reflect the current state of affairs ("There are <curnum> entries in this category"). For speech, this capability requires a text-to-speech synthesizer.
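As a rough illustration of parameterized actions, the following Python sketch fills in <name> placeholders at playback time. The function expand and the table of value providers are hypothetical; in the real system the values would come from the environment (for example, the user's credentials), and the resulting text would be sent to the text-to-speech synthesizer.

```python
import re

def expand(template, values):
    """Replace each <name> in the template with the current value of values[name]()."""
    def lookup(match):
        name = match.group(1)
        return str(values[name]())   # evaluated at playback time, not at writing time
    return re.sub(r"<(\w+)>", lookup, template)

# Hypothetical value providers, for illustration only.
category = ["entry one", "entry two", "entry three"]
values = {
    "username": lambda: "Pat",
    "curnum": lambda: len(category),
}

greeting = expand("Hi <username>", values)
status = expand("There are <curnum> entries in this category", values)
```

Because each provider is called when the entry is played, the same script entry reports a different count if the category changes between playbacks.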
A script is written by a script writer, who may or may not be the same person who wrote the underlying document(s). A person who plays the script back is called a script reader.
Scripts can be very formal items that are carefully crafted for pedagogical reasons, like a videotape, a demonstration, or a presentation. Or they can be informal, used to communicate from a single script writer to a single script reader (who might well be the same person).
Example 1: formal script
....imagine a formal script for a presentation here...
Example 2: informal communication from a reviewer to an author
An author writes a document and submits it in electronic form to a reviewer. The reviewer makes vocal annotations about the document and collects these annotations into a script. The script might traverse the document in decreasing order of importance of the needed changes. Within the script, examples of a single problem could be collected together. For example, suppose that there are three places in the document where the author refers to linked lists, but he uses slightly different terminology at each place. The script might have an initial entry describing the overall problem of confusing terminology, immediately followed by entries at each of the confusing spots, with remarks about particular difficulties at each. In this way, the script approximates the reviewer sitting down with the author and a copy of the document, making comments and flipping back and forth through the document to substantiate these comments. The reviewer could make multiple scripts on different topics, such as positive comments and negative comments, or overall comments and specific typographic and spelling corrections. Comments from multiple reviewers could be represented as multiple scripts.
Example 3: informal organizing technique for writers
Scripts could also be used as an organizing technique during authoring, similar to 3x5 note cards. When analyzing a source document, the user could create multiple scripts on different topics. These scripts could be reordered as needed. The use of voice to collect the information could make the whole process very quick and painless.
Related Work
Different scripts through a single document can focus on different aspects of the subject matter. In this way, scripted documents are a form of hypertext [DelSch86], [DonWid86], [HalMorTri86]. The script approach has the positive aspect that the script reader need not bring as much knowledge and direction to viewing the document, but the negative aspect that the script reader can only traverse connections that have been prepared by the script writer.
Scripts are also similar to Knuth's WEB programming language [Knuth84], in that WEB programs can traverse their documentation in arbitrary ways to organize the material as desired. However, WEB programs have a single ordering in their human-readable form.
Scripted documents also bear a similarity to Feiner, Nagy and van Dam's interactive documents [FeiNagVan82]. Their system was primarily intended for the creation of high-quality interactive books, with high-quality images and animations. It was intended that the books be created by professional writers and illustrators. While formal scripted documents would certainly require care in their preparation, scripted documents also have a variety of more informal uses, as described earlier.
Creating and editing scripts
Abstractly, a script is a separate entity that refers to underlying document(s). However, in our prototype implementation, script entries are part of each scripted document, although the script must be activated separately. The document can be viewed without activating the script.
Although a script is part of its associated document(s), a separate Script Tool is used to create and edit scripts. The prototype Script Tool is a forms-based tool with a simple set of operations. To issue a command, the script reader or script writer fills in the fields appropriate to that command and then clicks a mouse button over the command name.
Script editing operations
These operations are treated as edits to the document in which the script entry appears.
Create script [scriptname]. This operation creates a new script, allowing the script writer to name the script with an arbitrary string.
Destroy script [scriptname]. This operation removes the named script from all associated documents.
Add entry [selection, scriptname, seqNum, action]. This operation adds the selected location to the named script after entry seqNum (that is, as entry number seqNum+1). The script writer fills in the action field as a textual sequence of commands. These commands can include calls to specially-written user procedures. The script tool records the filename with each script entry.
Delete entry [scriptname, seqNum]. This operation removes entry number seqNum from the script.
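The sequence-number conventions of Add entry and Delete entry can be sketched as follows. This is a minimal Python model with a hypothetical Script class; it keeps entries in a list so that sequence numbers are implicit in list position, which is an assumption of the sketch rather than a description of the prototype.

```python
class Script:
    """A named script whose entry number k is stored at self.entries[k - 1]."""

    def __init__(self, name):
        self.name = name
        self.entries = []

    def add_entry(self, seq_num, entry):
        """Add an entry after entry seq_num, so that it becomes entry seq_num + 1."""
        if not 0 <= seq_num <= len(self.entries):
            raise IndexError(f"no entry {seq_num} in script {self.name!r}")
        self.entries.insert(seq_num, entry)

    def delete_entry(self, seq_num):
        """Remove entry number seq_num; later entries move down by one."""
        if not 1 <= seq_num <= len(self.entries):
            raise IndexError(f"no entry {seq_num} in script {self.name!r}")
        del self.entries[seq_num - 1]

script = Script("Simple Introduction")
script.add_entry(0, "greeting")     # becomes entry 1
script.add_entry(1, "closing")      # becomes entry 2
script.add_entry(1, "annotation")   # inserted after entry 1, so it becomes entry 2
```

After these calls the entries are, in order, greeting, annotation, closing; calling script.delete_entry(2) would remove the annotation entry again.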
Script examination operations
Find entry [scriptname, seqNum]. This operation scrolls the document to entry number seqNum and highlights its location. It also displays the associated action in the tool's action field.
Next entry/Prev entry [scriptname, seqNum]. These operations allow the user to step through the named script easily in either direction. Next (Prev) entry is equivalent to incrementing (decrementing) seqNum and clicking Find entry.
List scripts [selection]. All script names that appear in the selected document are listed in the script tool's message area.
List entries [selection]. All <script name, sequence number> pairs that appear in the selected location are listed in the script tool's message area.
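The two listing operations amount to simple queries over the script entries. The Python sketch below assumes that each script is a list of (filename, start, end) locations and that an entry "appears in" a selection when the two character ranges overlap; both the representation and the overlap rule are assumptions made for illustration.

```python
def list_scripts(scripts, filename):
    """Return the names of all scripts that have an entry in the given document."""
    return [name for name, entries in scripts.items()
            if any(fname == filename for fname, _, _ in entries)]

def list_entries(scripts, filename, sel_start, sel_end):
    """Return (script name, sequence number) pairs whose location overlaps the selection."""
    pairs = []
    for name, entries in scripts.items():
        for seq_num, (fname, start, end) in enumerate(entries, start=1):
            if fname == filename and start <= sel_end and sel_start <= end:
                pairs.append((name, seq_num))
    return pairs

# Hypothetical scripts over two documents.
scripts = {
    "Intro": [("a.tioga", 10, 21), ("a.tioga", 235, 240)],
    "Review": [("a.tioga", 15, 30), ("b.tioga", 10, 21)],
}
```

With this data, a selection of characters 18 through 20 in a.tioga lies inside entry 1 of both scripts, which is the situation in which a single marking rectangle hides membership in more than one script.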
Script playback operations
Play script [scriptname]. This operation plays the named script, starting at the first entry and proceeding automatically in sequence through each entry to the end of the script.
Play entry [scriptname, seqNum]. This operation performs the action of entry number seqNum in the named script.
Stop. Cancels all script playback operations.
Visualizing scripts
Script visualization is a problem for both the script reader and the script writer, because script sequencing and script actions are not directly visible by examining the document. The script reader must be able to tell that a document has associated scripts and which scripts are appropriate for him or her to play back. The script writer has the more difficult problem, in that he or she must also have tools for debugging faulty scripts.
The script reader is alerted to the presence of scripts available to be played in a document in two ways. First, when a document is displayed, a Script button appears in the document header if and only if the document has scripted locations. All script examination and playback operations are available from a popup menu generated by clicking the Script button. Second, each scripted location is distinctively marked with a surrounding rectangle. This rectangle indicates membership in some script, but does not show either the number and identity of the scripts or the sequence number within those scripts. To discover this information, the script reader can use the List entries function in the Script Tool. Similarly, to discover what scripts have entries in a given document, the script reader issues the List scripts command.
Note: Suresh suggests that only the beginning of a script should be marked. His papers also mention/imply that the script reader will probably wish to know how long a script is (how long it will take to play back, analogous to scroll bar indicator) -- this also applies to annotations, currently can edit to get a visual representation of length. Also attaching annotations to specific portions of diagrams...
The script writer can use the preceding commands together with the Find entry, Next entry, and Prev entry commands to traverse the scripts and examine each action field. These functions are sufficient for creating and editing scripts, but they can be tedious, and they are certainly not evocative.
The Extract script command prints out a named script in textual form. It lists the script name. Beginning with the first entry in the script, it then lists the sequence number, the filename, the character indices of the start and end of the scripted location, and the action field for each entry in the script. This command records the create date of each underlying document in addition to its filename; it can only be applied if all underlying documents are in a saved state (that is, they have no pending unsaved edits).
The Apply script command reads a textual script and applies it to a document (or set of documents), replacing any scripts with the given name. Using this command, a script can be created in textual form and applied to a document, or it can be used in conjunction with the Extract script command to examine and edit action fields more easily, with global substitutions, etc. It also allows the script writer to create additional scripts that are slight modifications of old scripts. Scripted locations can be moved by using the Tioga editor's Position command to calculate the indices of the new positions. The script writer must confirm application of the Apply script command if any of the underlying documents has been edited since the script was written, because the document position indices may have changed.
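The round trip between a script and its textual form might look roughly like the following Python sketch. The field names follow the sample script in this paper, but the exact file format, the dictionary representation of an entry, and the function names are assumptions; the sketch also omits the create-date check and the confirmation step.

```python
def extract_script(name, entries):
    """Write a script as text: the script name, then the fields of each entry in order."""
    lines = [f"script name: {name}"]
    for seq_num, entry in enumerate(entries, start=1):
        lines += [
            f"sequence number: {seq_num}",
            f"filename: {entry['filename']}",
            f"location: {entry['start']}..{entry['end']}",
            f"action: {entry['action']}",
            f"time: {entry['time']}",
        ]
    return "\n".join(lines)

def apply_script(text):
    """Read the textual form back into a script name and a list of entries."""
    name, entries = None, []
    for line in text.splitlines():
        key, _, value = line.partition(": ")
        if key == "script name":
            name = value
        elif key == "sequence number":
            entries.append({})                      # start a new entry
        elif key == "location":
            start, end = value.split("..")
            entries[-1]["start"], entries[-1]["end"] = int(start), int(end)
        elif key in ("filename", "action", "time"):
            entries[-1][key] = value
    return name, entries

original = [
    {"filename": "Manual.tioga", "start": 10, "end": 21,
     "action": "speak greeting", "time": "*"},
    {"filename": "Manual.tioga", "start": 235, "end": 240,
     "action": "play annotations", "time": "*"},
]
text = extract_script("Simple Introduction", original)
```

Because the textual form is ordinary text, a global substitution such as text.replace("speak", "play") edits every action field at once before the script is applied again.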
Sample script
script name: Simple Introduction
sequence number: 1
filename: /ivy/pollez/voice/ScriptUserManual.tioga!1 15-SEP-86 14:00:27
location: 10..21
action: ← FinchSmarts.TextToSpeech[Rope.Concat["Hello, ", UserCredentials.Get[].name],TRUE,NIL]
action: speak "Hello, ", UserCredentials.Get[].name
time: *
sequence number: 2
filename:
location: 235..240
action: ← TiogaVoice.PlaySelectedAnnotation[]
action: play annotations
time: *
sequence number: 3
filename:
location:
action: ← FinchSmarts.PlaybackTune[RowRowRowYourBoat]
action: play RowRowRowYourBoat
time: *
sequence number: 4
filename:
location:
action: PreView Picture.interpress
specifying scrolling behavior for this one would be hard!!
time: *
sequence number: 5
location:
characters:
action:
time: *
Implementation
Scripted documents are implemented in the Cedar programming environment [SwiZelBeaHag86] in the Computer Science Laboratory of the Xerox Palo Alto Research Center. The two systems that they rely upon most heavily are the Tioga editor and the Etherphone system. We describe each of these systems briefly, and then we describe the implementation of scripted documents.
The Tioga editor
The Tioga editor is a what-you-see-is-what-you-get galley editor, used for both program text and high-quality documents. Tioga documents are tree-structured, where the writer constructs the tree to correspond to the structure of the document; a node may be a section head or a paragraph. Tioga documents can be displayed at any level of detail, from the single root node to the full tree.
In addition, Tioga documents can have rich formatting and typography. Each node has an associated format, which specifies such parameters as its leading, its margins, its default typeface, and so on. A document as a whole has an associated style, which defines the meanings of all formats used in that document. Nodes can also have arbitrary named properties, added by the user or by programs. These properties are not directly visible when the document is viewed, but a separate tool called the Edit Tool can be used to examine their values. One use of node properties is to specify bitmaps and other necessary parameters for pictures embedded in Tioga documents.
Individual characters in a Tioga document can have looks, which specify special typeface parameters such as bold face, italics, subscripts, and the like. Characters can also have arbitrary named properties. Tioga has search commands that allow documents to be searched rapidly for given node or character properties.
The Etherphone system
The experimental Etherphone system [SwiSteOrn83] uses Ethernet communications to transmit digitized voice. The system consists of microprocessor-based electronic telephones, a centralized switching server, a voice file server, and workstation programs to support voice communications and voice recording services. From a workstation, a user can place and receive telephone calls, maintain private telephone directories, and manage a database of voice messages. A voice annotation package allows voice to be added to Tioga documents and provides simple voice editing functions [AdeSwi86]. In addition, a commercial text-to-speech synthesizer exists as a server in the Etherphone network. The synthesizer allows the system to "speak" text, initiated either by the user (perhaps by selecting the text in a viewer) or by a program (such as speaking an error message or proofreading a document).
This work on scripted documents began as a project to exploit the capabilities of the Etherphone system by creating narrated documents. Narrated documents are scripted documents whose actions play previously-recorded voice annotations.
Scripted Documents
In our prototype implementation, script entries are part of each scripted document. A split representation of script entries allows scripted documents to be edited without constantly updating the character indices for a scripted location.
A saved document contains a list of its scripts in Apply/Extract script form as a node property ($Scripts) on its root node, with the addition that each script entry is given an internally-generated unique identifier. As an optimization, a saved document need only contain references to script entries within it, together with a filename for each Next or Prev entry outside that file. Those entries can be discovered and played back when needed by accessing their containing file. The beginning of a script is indicated by the script entry with sequence number 1.
A character that is a part of a scripted location has a character property ($ScriptEntries) that contains the script name and the script entry's unique identifier. If a character has more than one associated script entry, either because it appears in multiple scripts or because it appears more than once in a single script, then that character's $ScriptEntries property contains a list of these tuples. As mentioned above in the section on visualizing scripts, these characters also have properties that give them a unique appearance, with a small rectangular box surrounding them.
While a script is being edited, the Script Tool maintains an internal linked representation of the script, allowing for easy insertion and removal of script entries. Whenever the Tioga Save command is issued, character indices are computed for the script entries, to speed playback. For unsaved documents, the character indices are ignored, and character property search commands are used to find the character with the desired unique identifier. In fact, the system treats character indices at all times as only a hint: if the indicated indices do not have the matching unique identifier, the search commands will be used instead.
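The treatment of character indices as a hint can be sketched as follows. In this Python model a document is a list of (character, set of entry identifiers) pairs, standing in for Tioga characters and their $ScriptEntries property; the representation, the function name, and the exact validity test applied to the hint are assumptions of the sketch.

```python
def find_location(chars, entry_id, hint=None):
    """Return (start, end) of the characters tagged with entry_id, or None.

    The saved indices in `hint` are tried first; if they no longer match the
    tagged characters, a search over the character properties is used instead.
    """
    def tagged(i):
        return 0 <= i < len(chars) and entry_id in chars[i][1]

    if hint is not None:
        start, end = hint
        still_valid = (start <= end
                       and all(tagged(i) for i in range(start, end + 1))
                       and not tagged(start - 1)
                       and not tagged(end + 1))
        if still_valid:
            return hint                      # fast path: the hint is accurate

    # Slow path: search for the identifier, as for an unsaved document.
    positions = [i for i in range(len(chars)) if tagged(i)]
    if not positions:
        return None
    return positions[0], positions[-1]

# "world" (indices 6..10) belongs to a hypothetical entry "e1".
document = [(c, {"e1"} if i >= 6 else set()) for i, c in enumerate("Hello world")]
# Inserting unscripted text at the front makes the saved indices stale.
edited = [("X", set()), ("X", set())] + document
```

Here find_location(document, "e1", (6, 10)) takes the fast path, while the same hint applied to the edited document fails the check and the search locates the entry at its new position.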
Using character properties means that the scripting system need not monitor all insertions and deletions of characters in unscripted locations. In fact, using character indices as only a hint means that the scripting system need not be running to edit or view a scripted document without playing back its scripts. However, changing or copying a scripted character still causes problems. (Recall that scripted characters have a unique appearance, so the user is alerted.) Changing a scripted character removes its character properties, leaving the script referring to a possibly non-existent unique identifier, and copying a scripted character creates two possibly non-contiguous copies of the same unique identifier. Flaws of this kind in a script are detected the next time that the document is saved (if the script system is active) or the script is played back, and the user is given a chance to repair the damage by repositioning the script entry in question.
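A check for the two kinds of flaw described above might be sketched as follows, again modeling a document as a list of (character, set of entry identifiers) pairs. The function name and the three result strings are hypothetical.

```python
def check_entry(chars, entry_id):
    """Classify a script entry as 'ok', 'missing', or 'non-contiguous'."""
    positions = [i for i, (_, ids) in enumerate(chars) if entry_id in ids]
    if not positions:
        return "missing"            # every scripted character was changed
    if positions[-1] - positions[0] + 1 != len(positions):
        return "non-contiguous"     # e.g. a scripted character was copied elsewhere
    return "ok"

text = "see linked list here"
# "linked list" (indices 4..14) belongs to a hypothetical entry "e7".
intact = [(c, {"e7"} if 4 <= i <= 14 else set()) for i, c in enumerate(text)]
# Copying the scripted characters to the end of the document duplicates the identifier.
copied = intact + intact[4:15]
# Retyping the scripted characters removes their properties.
retyped = [(c, set()) for c, _ in intact]
```

A save or playback routine could run such a check for each entry and ask the user to reposition any entry that is not reported as ok.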
The current implementation has the drawback that changing the number of script entries in a script that crosses document boundaries forces the system to edit other documents in that script to update their sequence numbers. We plan to relax this constraint by using the script entry naming scheme described below in the section on conditional scripts. Of course, scripts that cross document boundaries do present problems in ensuring that the documents are kept in a consistent state and are not deleted piecemeal. As an interim measure, we use the Cedar DF Tool system to maintain consistency [SwiZelBeaHag86], but a system-wide database for recording such interrelationships is under development [Terry86].
Future Work
We are working on extending our prototype to allow better visualization of scripts in situ and to allow conditional and interactive scripts. Another direction for future work lies in the more detailed control of screen and document layout at each script entry.
Improved visualization
The Extract script and Apply script commands are useful for examining and editing a script's action fields, but they do not adequately depict either the script's sequencing information or the document contents at the given indices. A promising idea for displaying a script in situ is to create a fisheye view [Furnas86] of its scripted locations, with sequence numbers added to the script rectangles, as shown below.
Header|#3 1A
...
Header 2A
... This is a sentence|#1.
...
Header 1B
...
Header 2B|#2.
... This is another sentence.
Even if this view does not fit entirely in one screen, it can be scrolled to present a much better overview of the sequencing and location information than viewing the entire document.
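A much-simplified version of such a view can be generated as in the Python sketch below. It is not Furnas's degree-of-interest computation: it simply keeps every header and every node that contains a scripted location, collapses everything else to an ellipsis, and attaches sequence numbers at the level of whole nodes. The node representation is an assumption.

```python
def fisheye(nodes):
    """nodes: list of (is_header, text, seq_nums). Return the lines of the condensed view."""
    lines = []
    for is_header, text, seq_nums in nodes:
        if is_header or seq_nums:
            marks = "".join(f"|#{n}" for n in seq_nums)
            lines.append(text + marks)        # keep the node, with its sequence numbers
        elif not lines or lines[-1] != "...":
            lines.append("...")               # collapse a run of unscripted nodes
    return lines

# A hypothetical document outline.
nodes = [
    (True, "Header 1A", [3]),
    (False, "An unscripted paragraph.", []),
    (False, "Another unscripted paragraph.", []),
    (True, "Header 2A", []),
    (False, "This is a sentence.", [1]),
    (False, "More unscripted text.", []),
]
view = fisheye(nodes)
```

A node that appears in a script more than once, or in several positions, simply receives several markers.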
A more graphical approach would be to create thumbnail sketches with superimposed arrows depicting the script's path.
Conditional and interactive scripts
Conditional scripts would allow a single script to vary to suit the script reader's needs. For example, a script could take one path if the script reader's experience level is novice, another if intermediate, a third if expert. Interactive scripts would allow user interaction to gather parameters for conditional scripts, so the script reader can participate in the creation of a particular viewing path.
One way to extend the current implementation of scripts to permit conditional scripts would be to allow the script writer to specify the next field directly, as a sequence of <condition, script entry name> pairs. A script entry would be expanded to contain an optional corresponding name. Script entries that follow in unconditional sequence need not be named. For conditional scripts, the sequence number of an entry would no longer be well-defined, and we could use the well-known assembly language naming scheme of namedLocation, namedLocation+1, namedLocation+2, anotherNamedLocation, anotherNamedLocation+1, etc. The Extract script command must list the entries in a reasonable order (e.g., some form of topological sort). In addition, the Prev entry command would be available only during script playback, at which time a history mechanism would be used to implement it.
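One possible reading of this scheme is sketched below in Python. Each entry is a dictionary with an optional name and an optional list of (condition, entry name) pairs under the hypothetical key "branches"; a reference such as basics+1 is resolved relative to the named entry in listing order. All of these representation choices, and the reader profile passed to each condition, are assumptions made for illustration.

```python
def resolve(ref, entries):
    """Turn 'label' or 'label+k' into an index in the listing order."""
    label, _, offset = ref.partition("+")
    base = next(i for i, e in enumerate(entries) if e.get("name") == label)
    return base + (int(offset) if offset else 0)

def walk(entries, reader, max_steps=100):
    """Return the indices of the entries visited for one script reader."""
    history = []                                # Prev entry would pop from this list
    i = 0
    while i < len(entries) and len(history) < max_steps:
        history.append(i)
        following = i + 1                       # unconditional sequence by default
        for condition, target in entries[i].get("branches", []):
            if condition(reader):               # first condition that holds wins
                following = resolve(target, entries)
                break
        i = following
    return history

entries = [
    {"name": "start", "branches": [
        (lambda r: r["level"] == "novice", "basics"),
        (lambda r: r["level"] == "expert", "advanced"),
    ]},
    {"name": "basics"},
    {"branches": [(lambda r: True, "advanced+1")]},   # unnamed; reachable as basics+1
    {"name": "advanced"},
    {},                                               # unnamed; reachable as advanced+1
]

novice_path = walk(entries, {"level": "novice"})
expert_path = walk(entries, {"level": "expert"})
```

The max_steps bound is a safeguard added for the sketch, since conditional scripts may contain cycles. The recorded history is what would make a Prev entry command meaningful during playback, because the preceding entry depends on the path actually taken.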
We are also considering ways to specify conditional scripts by combining structured programming ideas and direct manipulation techniques.
For interactive scripts, we would like to provide mechanisms to help the script writer gather input from the script reader, such as popup choice buttons. These mechanisms are available as packages in the underlying Cedar system (and hence could be accessed as part of an entry's action), but it may prove useful to tailor them to ease the script writer's task as much as possible.
Improved layout control
At each script entry, the script writer may wish to control the appearance of the document, such as how many levels of the document's tree structure should be displayed and the manner in which the scripted location is highlighted (such as reverse video, underlining, or blinking). In addition, when a script crosses a document boundary, the script writer may wish to specify whether the new document is loaded into the existing window or a new window is created. Scripts can become arbitrarily complex as they try to specify the appearance of the result more fully, just as typesetting tools allow more or less control over the fine points of document layout (such as Scribe vs. TeX).
Conclusions
Scripted multi-media documents can contain any combination of text, pictures, audio, and action. Scripts need not follow the normal linear order of their associated document(s). In addition, the script writer can construct multiple viewing paths through the document(s) for different readers and purposes. This novel mechanism allows writers to communicate additional information to readers. Scripts can be used in a wide variety of ways, including: to construct formal demonstrations and presentations, to construct informal communications, and to organize collections of information.
Scripts can be a special kind of link in a hypermedia system.
References
[AdeSwi86] S. Ades and D. Swinehart. Voice annotation and editing in a workstation environment. Proceedings of AVIOS'86 Conference, Arlington VA, Sept 1986.
[DelSch86] N. Delisle and M. Schwartz. Neptune: a hypertext system for CAD applications. Proceedings of ACM SIGMOD'86 Conference, Washington DC, May 1986.
[DonWid86] J. Donahue and J. Widom. Whiteboards: A graphical database tool. ACM Trans. Office Information Systems, 4(1), Jan 1986.
[FeiNagVan82] S. Feiner, S. Nagy and A. van Dam. An experimental system for creating and presenting interactive graphical documents. ACM Trans. Graphics, 1(1), Jan 1982.
[Furnas86] G. Furnas. Generalized fisheye views. Proceedings of CHI'86 Conference, Boston MA, Apr 1986.
[HalMorTri86] F. Halasz, T. Moran, and R. Trigg. NoteCards in a nutshell. Submitted to the SIGCHI+GI '87 Conference, Toronto, Ontario, Apr 1987.
[Knuth84] D. Knuth. Literate programming. Computer Journal, 27(2), May 1984.
[Meyr86] N. Meyrowitz. Intermedia: The architecture and construction of an object-oriented hypermedia system and applications framework. Proceedings of OOPSLA '86, Portland, OR, September 1986; SIGPLAN Notices 21(11), Nov 1986.
[SwiSteOrn83] D. Swinehart, L. Stewart, and S. Ornstein. Adding voice to an office computer network. Proceedings of IEEE GlobeCom'83, Nov 1983.
[SwiZelBeaHag86] D. Swinehart, P. Zellweger, R. Beach, and R. Hagmann. A structural view of the Cedar programming environment. ACM Trans. Prog. Lang. and Systems, 8(4), Oct 1986. (for now, see D. Swinehart, P. Zellweger, and R. Hagmann. The structure of Cedar. Proceedings of ACM SIGPLAN'85 Symposium on Language Issues in Programming Environments, Seattle WA, July 1985; SIGPLAN Notices 20(7), Aug 1985.)
[Terry86] D. Terry. Voice ropes and their management. In preparation. Computer Science Laboratory, Xerox Palo Alto Research Center.
[YanMeyVan85] N. Yankelovich, N. Meyrowitz, and A. van Dam. Reading and writing the electronic book. Computer, Oct 1985.