CAMINOREAL

EP88 DRAFT -- August 31, 1987 11:36:18 am PDT (BEACH) -- EP88 DRAFT

CaminoReal: An Interactive Mathematical Notebook

Dennis Arnon, Carl Waldspurger, Kevin McIsaac, and Richard Beach

© Copyright 1987 Xerox Corporation. All rights reserved.

Some style issues and conventions
  avoid capitalization unless proper name
  avoid quote marks; prefer to use look-o for examples
  computer algebra vs symbolic algebra: name for the field

Abstract: Four broad categories of Mathematical Software are Computer Algebra (Symbolic Mathematics), Numerical Computation, Mathematical Typesetting, and "Technical Electronic Mail" (i.e. e-mail that contains mathematical expressions). Each of these categories has powerful and sophisticated systems. What one would really like, however, is simultaneous, integrated access to all four types of functionality. CaminoReal is a system for integrated, interactive technical documents and computations. It lives in Cedar, the programming environment of Xerox PARC's Computer Science Laboratory, and is used in conjunction with Tioga, Cedar's multimedia document editor. Printing and the management of other document components, such as text and graphics, are provided by Tioga. For computation, CaminoReal offers a small built-in algebra package based on the notions of Domains and Objects, plus access to "Algebra Servers" on a network. Mathematical expressions are passed between CaminoReal, Tioga, and Algebra Servers in pure functional notation. Our current Algebra Servers are the Reduce, SMP, and SAC-2 Computer Algebra systems. CaminoReal is structured in accordance with a recently proposed standard architecture for Mathematical Systems, which we summarize in this paper.
This paper combines an exposition of CaminoReal as it currently is with suggestions for desirable changes or extensions. The current version of CaminoReal has at least touched on most issues we consider to be within our scope, so we will typically describe the existing features, and then how they might be altered.

Keywords: Computational Mathematics, Document Processing, Mathematical Typesetting, Technical Documents, Mathematics Editing, WYSIWYG, User Interfaces, Direct Manipulation, Computer Algebra, Symbolic Mathematical Computation, Object-Oriented Programming

Xerox Corporation
Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, California 94304

DRAFT -- Preliminary Version -- DRAFT

There is no 'royal road' to geometry.
Euclid, said to Ptolemy I

Preface for Cedar Users

To see the mathematical expressions in this document (when reading it online or printing it), be sure you have a running CaminoReal.

1. Introduction

vision of integrated system for mathematics
examples of notebooks and technical papers
four separate developments:
  symbolic algebra or computer algebra (Reduce, Macsyma, SMP, etc.)
  numerical computation (IMSL libraries)
  multimedia electronic mail (Diamond? who has math?)
  mathematical typesetting (eqn, TeX)

Since the 1950's, many researchers have worked to realize the vision of natural and powerful computer systems for interactive mathematical work. Nowadays this vision can be expressed as the goal of an integrated system for symbolic, numerical, graphical, and documentational mathematical work. One aspect of this vision of particular interest to us is the idea of "living" notebooks and technical papers. We want a system that supports both the exploration of technical ideas through computations, and the production of an evolving technical document that describes them.
Recently the development of personal computers (with high resolution screens, window systems, and mice), high-speed networks, electronic mail, and electronic publishing has created a technological base that is more than adequate for the realization of these integrated systems. However, the growth of separate Mathematical Typesetting, Multimedia Electronic Mail, Numerical Computation, and Computer Algebra communities, each with its own standards, tends to impede their realization.

CaminoReal is prototype instance of an architecture: standard mathematical systems
interchange protocol

CaminoReal is a prototype system for the integration of documents, editing, and computation involving mathematics. Its design is an instance of an architecture for Standard Mathematical Systems that we recommend be adopted by others. The thrust of this architecture is to provide a standard means for combining diverse system components, namely communication by means of mathematical expressions expressed in a pure functional notation.

CaminoReal in Cedar . . .

CaminoReal lives in Cedar, the programming environment of Xerox PARC's Computer Science Laboratory, and is used in conjunction with Tioga, Cedar's multimedia document editor. Actual document production (e.g. printing, management of other document constituents such as text and graphics) is provided by Tioga. The screen, mouse actions, and keyboard input are managed by the Cedar viewers package. A viewer is a window that can be scrolled and resized. A viewer can have buttons and pop-up menus that invoke commands. The mouse is used to point at and select text or expressions.

CaminoReal supports interactive, syntax-directed, two-dimensional, WYSIWYG editing of mathematical expressions, placing such expressions in Tioga documents and fetching them back, and algebraic manipulation of expressions.
Algebraic computation can be performed using either a small built-in package, or well-known algebra systems such as Reduce, SMP, and SAC-2 over a network. The internal algebra package is based on an object-oriented paradigm that supports polymorphic procedures. For example, one can easily create and do simple arithmetic on matrices of polynomials with complex number coefficients, or matrices of such matrices, etc.

notion of interactive technical documents, or perhaps integrated documents, or even active electronic documents
examples of extensions to spreadsheets and mathematical form letters or tools for proofreading mathematics notation

CaminoReal supports the creation of "interactive" technical documents. For example, the user can browse a (nicely typeset) draft of a technical document on the workstation screen, select, edit and compute with mathematical expressions in it (besides editing text and graphics, of course), and insert the resulting expressions back into the document. One can extend this to the notion of a "computed document", i.e. a document with embedded computations. Two particularly useful flavors of computed documents are spreadsheets and mathematical form letters.

5. Computation

5.3. Issues for future work on computation

extension of CaminoReal to multiple algebra requests outstanding
a potential solution outlined using strips or panes and multiplexing requests to the algebra server (have we discussed an algebra server?)

Re: implementation of algebra multiprocessing: Assume that CR windows have been improved so that there is one CR tool, and multiple "panes" or "strips", each containing an expr and having a biscroller (such strips could be implemented as separate viewers). The simplest reasonable rule is that when you request some algebra in a strip, that strip blocks until the algebra finishes, but you can go on processing (and perhaps requesting algebra) in the other strips. Assume that a dedicated stream is open throughout a CR session.
There is a manager on the Unix end of this stream, which receives algebra requests from Cedar over it, and sends back algebra replies over it. A manager on the Cedar side receives these replies and forwards them to the appropriate strips.

issue of multiple algebra processes
  keep them to a minimum (Unix process overhead, I guess)
  each process starts with clean state or environment
  results returned via stream or file depending on reliability
  constant: 30 secs

We will permit multiple simultaneous algebra processes in Unix, but we want to keep their number to a minimum. Also, for simplicity, we'll follow the rule that each process starts with a "clean" algebra system (e.g. Reduce) state, although one can imagine supporting state maintenance in a future, more sophisticated world. Operation is as follows. The Unix manager always has a free algebra process available (it either starts a new one if necessary, or wipes clean the state of an existing one). When the manager gets an algebra request, it gives the request a lock on the algebra process for 30 seconds. Other requests that come in during this time are queued. If the current request finishes in less than 30 seconds, then the result is sent back over the stream, and we go to the request at the head of the queue. If a request doesn't finish in 30 seconds, then a file is set up to receive its output (so partial results won't be lost in the event of a crash), its priority is lowered with nice so that it runs in the background, and the manager goes on to process the next request in the queue. When a background algebra request finishes, the Unix manager marshals its output and sends it back over the stream to Cedar.

issue of naming algebra system functions
  give job to user interface!

It's irritating to have to remember the different names for the same command in different algebra systems. For example, the command to factor is Fac in SMP and Factorize in Reduce. It seems that the user interface should hide such details from the user.
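One way the user interface might hide system-specific command names is a lookup table from a canonical name to each algebra system's name. The following is a minimal sketch, not CaminoReal's implementation: the table layout, the canonical name "factor", and the functions translate_command and build_request are all hypothetical. Only the pair Fac (SMP) and Factorize (Reduce) comes from the text, and the bracketed output imitates the functional notation used in this paper rather than either system's native input syntax.

```python
# Hypothetical table: canonical command name -> name used by each algebra system.
# Only the "factor" entries (Fac in SMP, Factorize in Reduce) come from the text.
COMMAND_NAMES = {
    "factor": {"SMP": "Fac", "Reduce": "Factorize"},
}


def translate_command(canonical, system):
    """Return the name that `system` uses for the canonical command."""
    try:
        return COMMAND_NAMES[canonical][system]
    except KeyError:
        raise KeyError(f"no known name for {canonical!r} in {system!r}") from None


def build_request(canonical, argument, system):
    """Format a request in functional notation, e.g. Fac[x^2-1]."""
    return f"{translate_command(canonical, system)}[{argument}]"


print(build_request("factor", "x^2-1", "SMP"))     # Fac[x^2-1]
print(build_request("factor", "x^2-1", "Reduce"))  # Factorize[x^2-1]
```

With such a table the user always asks for "factor", and only the component that talks to a particular algebra system needs to know that system's spelling.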
issue of exception reporting

It would be nice if all algebra systems followed a common paradigm for exception reporting. E.g. there could be separate data and message streams when CaminoReal communicates with a remote algebra system.

issue of user query to status of running algebra process

Actually the message stream should be two-way, so that e.g. the user can query the state of the algebra system while a remote command is running, or even change it (e.g. abort the command, which there is currently no way to do).

issue of caching communication streams for conversation

Speaking of streams, one way to do communications between CaminoReal and remote algebra systems would be to open data and message streams the first time a user talks to a particular algebra system, and then keep those streams, as well as the remote algebra process, open as long as CaminoReal is alive. Presumably this would cut the overhead of a remote algebra command from 10 seconds to nil. The Cedar Bridge package can apparently be tweaked to do this; I'm unclear about the tradeoffs of this paradigm vs. RPC.

6. Computed Documents

6.1 Introduction

with assignment we get access to environment
with evaluation we get access to computation
sequence of math objects in Tioga documents gives us computational documents
mathematics can be assigned to environment and computed on-the-fly
two uses immediately obvious:
  spreadsheets or mathematical form letters
  mathematical proofreading

CaminoReal's expression language supports an assignment statement. Also, expressions assigned to variables are maintained in a symbol table (the Environment). If desired, the math expressions embedded in a Tioga document can be evaluated prior to being painted, whenever the document is displayed, and so can be defined as functions of other expressions. This makes possible Tioga documents that are spreadsheets or mathematical form letters, or simply computed documents.
Having the math in a technical paper computed on the fly minimizes the introduction of typographical errors, and so reduces the burden of proofreading.

6.2. MathEval on/off

global flag to enable computation before painting Tioga math object

When the EvalBeforePaint flag is off (the default), CaminoReal Expressions in a Tioga document are painted just as they are stored. When the EvalBeforePaint flag is on, they are evaluated before being painted. This enables Computed Documents. The Cedar CommandTool commands "MathEval on" and "MathEval off" set and clear the flag.

6.3 SpreadSheets

What is a SpreadSheet

The classical spreadsheet, as typified by Lotus 1-2-3 or Excel, has the following major notions.

Cells

A cell knows four things.
i) Its current value. We will refer to this as its value.
ii) How to calculate its value. We will refer to this as its formula. This may be a constant such as $100 or an expression involving the values of other cells.
iii) How to format the current value. We will refer to this as its format. For example, right justified.
iv) Its name.

Rectangular layout

Cells are arranged in a rectangular grid. The columns and rows are indexed by letters or integers. Conventionally the columns are a-z and the rows are the integers starting at 1. A cell is referred to by its row and column.

Manipulation

One is able to alter the formula and format of a cell, but not its name, and not its value except through the formula. A cell is referred to either by entering its name in a command window or by pointing to the cell and selecting it with the mouse. The mouse selection enters the name of the cell into the command window. Suppose we wish to give cell B2 the formula A2 + B1. We could type the command "B2 = A2+B1" into the command window. Alternatively we can mouse select the cell B2, type "=" from the keyboard, mouse select the cell A2, type "+" from the keyboard, and mouse select the cell B1.
The use of mouse selection removes the burden of naming the cell and makes the process more intuitive. This will become more apparent when we progress to algebra, where one becomes confused between equation numbers and symbols. It is important to note that we do not specify the order in which the cells are to be calculated. In fact this is of central importance.

Often one can take a formula from one cell and copy it to another cell. The references to other cells may or may not be changed. For example, if we copy the formula for B2 into B3 we might want the geometric relationships in the formula to be preserved. That is, the formula reads "B3 = A3+B2". This is an interesting notion available in most spreadsheets. Another nice feature is to sum, or perform some other operation, over a row, column, or block without listing all the elements.

Evaluation

From the formulae for the individual cells we can produce a directed graph for the evaluation of the spreadsheet. Let us restrict the spreadsheet to directed acyclic graphs (DAGs); we evaluate the cells in the obvious way by starting at the known cells and working back. If we alter a cell, the DAG indicates exactly which cells must be updated.

The spreadsheet evaluation can be nicely described in an object-oriented way. We ask a cell to evaluate itself. If the cell depends on other cells, they are passed the message to evaluate themselves. To display a cell we ask the cells to format their data.

We can relax the acyclic condition by applying a termination condition to the cycle. This could be a number of cycles, in which case we have a natural method for iteration, or some condition to be met by a cell, in which case we have a natural method for conditional loops. A nice application of the cyclic graph might be Newton iteration. The cycle is performed only once, giving one iteration toward the correct solution. If the user selects one of the cells and asks it to evaluate itself, then the cycle is repeated once, giving the next iteration.
This continues as long as the user desires.

One condition we have swept under the carpet is that the graph is assumed to be static. That is, a formula can't evaluate to a cell name. This seems reasonable, but further consideration may show useful interpretations for this, such as branching. This is an area that needs more thought.

Extensions

Free Form Layout

The first extension is to remove the table-like format of the spreadsheet. A cell can be placed anywhere on the screen. There is still a dependency graph for the spreadsheet, but the naming convention must change. A solution is to give the cell a name that is created when the cell is created. Any technique used to label nodes in a graph would suffice. Note that the use of pointing becomes much more important. Clearly the notion of copying a formula while preserving its geometric relationships is no longer valid! However, other notions may take its place.

Algebra

Instead of just numerical evaluation, allow general algebra.

6.4 Audit Trails

The basic audit trail scheme in relation to documents is that you maintain an audit trail as a DAG among mathematical expressions, as a separate data structure from a (Tioga) document in which the expressions may actually reside. There are various unresolved issues about the interaction of these structures, e.g. what happens to the screen display of an expression if you delete the node corresponding to it in the audit trail?

If the space of algebra expressions is managed as an audit trail DAG, then the DAG provides a basic discipline for the Cedar user interface for algebra multiprocessing: think of it as a database and use standard database transaction management. Read and write locks on DAG nodes are available; a read lock locks out writers but not other readers, a write lock locks out everyone. When you request algebra on a DAG node, the appropriate kind of lock is put on it and all other affected nodes in the DAG (e.g. a request to recompute a leaf from scratch will write lock all its ancestors). That request is then forked; you can now move to another node in the DAG and do anything consistent with the lock it has at that time (e.g. factor it, which requires read access, which is available unless it has a write lock on it).

It will usually be meaningful and useful to have system-generated "line numbers", i.e. names for expressions, as distinct from mathematical variables. The names are your hooks to previous results for construction of future expressions.

A typical loop is as follows. You have a current node, containing an expr. You pop up a temp window with a copy of that expr, either by value or by reference. You do some "edits" on it, and someone listens and records them. When you're done, the temp window goes away, the result goes back into the "document" as a new node with a transaction number, and the arc from the previous node to it is labeled with the actions that bring about that transition. The resulting structure is a (directed) tree. Obviously many different outcomes are possible if you alter some expression and replay portions of the audit trail, depending e.g. on whether your copies were by value or reference.

6.5. MultiDocuments

1. There is a database; different scripts can print different "readouts" of it. Thus e.g. I can store my working notes along with the paper for publication, but print out either or both.

7. Acknowledgements

Thanks to Rick Beach and Michael Plass for enlightenment on Tioga and other aspects of Cedar. Thanks to Ken Pier for the "point and stuff" part of the CaminoReal-Tioga interface. Thanks to Christian LeCocq for the BridgeSubmit package that enables communication with the UNIX world. Thanks to those in the Computer Science Lab at PARC who have used CaminoReal for their documents and computations, and suggested improvements (what was the collected experience so far? perhaps an additional section).
Thanks to Rick Beach, Alan Perlis, and Alan Demers for inspiration.

EP88 Outline

1. Introduction

vision of an integrated system for mathematics and text
extend mathematical content within documents to incorporate the math semantics, not just typography; opportunity to compute with the math, not just look at it
examples of notebooks and technical papers with live mathematics
integrate four separate developments:
  symbolic algebra or computer algebra (Reduce, Macsyma, SMP, etc.)
  numerical computation (Linpack library, IMSL libraries)
  multimedia electronic mail (Diamond and Andrew)
  mathematical typesetting (eqn, TeX)
CaminoReal is prototype instance of an architecture: standard mathematical systems
  incorporates math notation editor, displayer, computation, and document formatting system
  in what ways does CaminoReal violate these concepts?
interchange protocol among components in math systems
CaminoReal in Cedar . . .
  briefly explain how Cedar provides foundation for CaminoReal
  reference integration paper (Donahue)
notion of interactive technical documents based on computation with mathematical semantics
  or perhaps integrated documents, or even active electronic documents
examples of extensions to spreadsheets and mathematical form letters or tools for proofreading mathematics notation

2. Standard Mathematical Systems - a Proposed Architecture

2.1 Introduction

postulate abstract syntax for any mathematical expression
operator and arguments

2.2 Standard Mathematical Components

A standard mathematical component (abbreviated SMC) is a collection of software and hardware modules with a single function which, if it reads mathematical expressions, reads them as abstract syntax, and if it writes mathematical expressions, writes them as abstract syntax. A standard mathematical system (abbreviated SMS) is a collection of SMC's which are used together, and which communicate with each other in abstract syntax.

We may identify at least four possible types of components in an SMS. These are editor, display, computation, and document components. Any particular SMS may have zero, one, or several instances of each component type. The connection between two particular components of an SMS, of whatever type, is via abstract syntax passed over a `wire' joining them.
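To make the idea of abstract syntax on a `wire' concrete, the sketch below represents an expression as an operator applied to arguments, writes it out in a bracketed functional notation, and reads it back on the other side. It is an illustration under stated assumptions, not CaminoReal's actual data structure or wire format: the class Expr, the functions to_functional and from_functional, and the operator names Plus and Times are all hypothetical, and atoms are limited to identifiers and non-negative integers.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Expr:
    """Abstract syntax node: an operator applied to zero or more arguments."""
    op: str
    args: Tuple["Node", ...] = ()


Node = Union[Expr, str, int]  # atoms are variable names or integers


def to_functional(node):
    """Write a node in functional notation, e.g. Plus[x, Times[2, y]]."""
    if isinstance(node, Expr):
        return f"{node.op}[{', '.join(to_functional(a) for a in node.args)}]"
    return str(node)


def from_functional(text):
    """Read functional notation back into abstract syntax (recursive descent)."""
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def read_token():
        nonlocal pos
        start = pos
        while pos < len(text) and (text[pos].isalnum() or text[pos] == "_"):
            pos += 1
        if start == pos:
            raise ValueError(f"expected a name or number at position {pos}")
        return text[start:pos]

    def read_node():
        nonlocal pos
        skip_spaces()
        token = read_token()
        skip_spaces()
        if pos < len(text) and text[pos] == "[":
            pos += 1
            args = []
            skip_spaces()
            if pos < len(text) and text[pos] == "]":
                pos += 1
                return Expr(token, ())
            while True:
                args.append(read_node())
                skip_spaces()
                if pos < len(text) and text[pos] == ",":
                    pos += 1
                elif pos < len(text) and text[pos] == "]":
                    pos += 1
                    return Expr(token, tuple(args))
                else:
                    raise ValueError(f"expected ',' or ']' at position {pos}")
        return int(token) if token.isdigit() else token

    result = read_node()
    skip_spaces()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    return result


expr = Expr("Plus", ("x", Expr("Times", (2, "y"))))
wire = to_functional(expr)
print(wire)                           # Plus[x, Times[2, y]]
print(from_functional(wire) == expr)  # True
```

Because every component reads and writes the same form, any two of them can be joined without a special-purpose translator between that particular pair.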
2.2.1 ED - Math Editors

tool to edit the abstract syntax representation of mathematics
typical expression editors may edit visually or in the abstract syntax directly
goal is to retain the mathematical semantics
many existing mathematical notation editors do not qualify (e.g. Viewpoint math!)

The math editor component edits abstract syntax into abstract syntax. Many math and expression editors already exist that work with other representations, for example, bitmaps or various formatting languages. These editors do not qualify as a math editor component of an SMS because they do not integrate with any other components through an abstract syntax representation. An ED component might permit edits to the abstract syntax directly or might provide a WYSIWYG visual interface to the mathematical notation. In either case, the goal of an ED is to retain the mathematical semantics.

2.2.2 DISP - Math Displayers

displayers render an abstract syntax to make it human comprehensible in traditional appearance of mathematical notation
combination of hardware and software
many provide translation to existing math formatters, e.g. TeX or eqn
value of interactive positioning
  permits selection of subexpressions for use by other parts of mathematical system
  incremental feedback
  show user what the system interprets the keystrokes as

A DISP component renders an abstract syntax to make it human comprehensible in the style of traditional mathematical notation. These components are suites of software and hardware that take in an expression in abstract syntax and render it. An example might be an abstract-syntax-to-TeX translator, the TeX formatter itself, and a printer with associated drivers. While one might imagine many advantages of the translation of abstract syntax into the command sequences for an existing math formatter like TeX or eqn, an interactive DISP might be implemented independently.
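The first stage of such a DISP, the abstract-syntax-to-TeX translator, might look roughly like the sketch below. It is a minimal illustration and not a description of any existing translator: expressions are assumed to arrive as nested tuples of the form (operator, argument, ...), and the operator names Plus, Times, Power, and Fraction, as well as the functions to_tex and paren_if, are hypothetical.

```python
# Assumed input form: an atom (str or int) or a tuple (operator, arg1, arg2, ...).
COMPOUND = {"Plus", "Times", "Power", "Fraction"}


def paren_if(node, ops):
    """Translate node, wrapping it in parentheses if its operator is in ops."""
    text = to_tex(node)
    if isinstance(node, tuple) and node[0] in ops:
        return "\\left(" + text + "\\right)"
    return text


def to_tex(node):
    """Translate an abstract syntax tree into a TeX math string."""
    if not isinstance(node, tuple):
        return str(node)
    op, *args = node
    if op == "Plus":
        return " + ".join(to_tex(a) for a in args)
    if op == "Times":
        # Juxtaposition; sums are parenthesized so the product stays unambiguous.
        return " ".join(paren_if(a, {"Plus"}) for a in args)
    if op == "Power":
        base, exponent = args
        return paren_if(base, COMPOUND) + "^{" + to_tex(exponent) + "}"
    if op == "Fraction":
        numerator, denominator = args
        return "\\frac{" + to_tex(numerator) + "}{" + to_tex(denominator) + "}"
    # Unknown operators fall back to ordinary function-call notation.
    return "\\mathrm{" + op + "}(" + ", ".join(to_tex(a) for a in args) + ")"


expr = ("Plus", ("Power", "x", 2), ("Times", 2, "x", "y"),
        ("Fraction", 1, ("Plus", "x", 1)))
print(to_tex(expr))  # x^{2} + 2 x y + \frac{1}{x + 1}
```

A translator of this kind is one-way: the TeX string no longer records which operator produced which piece of notation, which is one reason an interactive DISP that supports selection may prefer to keep its own link back to the abstract syntax.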
A DISP component may or may not support the notion of pointing at an expression to highlight or select a portion of a mathematical expression. If selection is supported, then the DISP component must be capable of extracting the selected subexpression in abstract syntax. An example of indicating subexpression selection within the expression expr might be Child[Child[expr,2],3].

2.2.3 COMP - Computation systems

computation done on both symbolic and numeric representation of mathematics
encompasses traditional symbolic algebra or computer algebra systems
as a component of an SMS, it must accept the abstract syntax representation and produce results in abstract syntax
as part of an interactive system, there is the problem of maintaining state over a session or a document; two issues include:
  binding of values or expressions to names
  setting of global flags that control computation or selection of algorithms

Computation components of an SMS perform algebraic or numerical computation on a mathematical expression. Examples of numerical libraries include the IMSL math and statistical library, and the LINPACK and EISPACK subroutine libraries. Examples of symbolic algebra systems include MACSYMA, REDUCE, and SMP. As a component of an SMS, a COMP must be prepared to accept an abstract syntax representation and to produce a result also expressed in abstract syntax. There are a number of challenging issues facing a system designer, such as how to standardize the naming of operators in abstract syntax, how to maintain state over a time-varying session (the binding of values to names in a computational environment), and how to establish global parameters and flags that control the computation or selection of algorithms.

2.2.4 DOC - Document systems

document preparation systems that incorporate mathematical notation
electronic mail should be considered in this category, mail a document, eh?
various names for the system component user interface a nontrivial problem

The DOC component is what one would typically call a document editor, document formatting system, or electronic mail system, provided that it can handle the presentation of mathematical notation. However, we are interested in extending these capabilities to handle the semantics of mathematical expressions. A DOC component in a standard mathematical system must retain the abstract syntax representation while it formats the notation in a traditional style for printing or display, in order to preserve the ability to edit or compute with the mathematical expressions contained within the document.

2.3 Standard Mathematical Systems

Example of different configurations CaminoReal, close coupled ED and DISP and COMP MathScribe, close coupled ED and DISP MACSYMA, SMP, close coupled ED and DISP and COMP Andrew, close coupled ED DISP and DOC configuring a standard mathematical system typical SMS: an ED and a DISP more integration permitted to support incremental update of expressions from edits desirable to have program interface (abstract interface) so ED would accept edit functions in abstract syntax similarly DISP displays abstract syntax and possibly selects subexpression boundaries between components are not hard and fast closely coupled ED and a DISP boundaries between the component types are not hard and fast

Given this division of standard mathematical systems into four major components, there are many choices for configuring systems for integrating mathematics into documents. Systems might integrate selected components using abstract syntax representations, while other components are either ignored or utilized through alternative means. The level of integration among components may be quite varied and the boundaries between components might not be hard and fast.
For example, an ED and a DISP component may be more closely coupled than the exchange of abstract syntax representations would suggest. The ED internal representation of the abstract syntax and the DISP internal tree-of-boxes representation may have pointers back-and-forth to facilitate incremental updates or may share a common data structure. Similarly, an ED might support simple computations, such as simplification or rearrangement of subexpressions, or a DOC might contain a facility to display mathematical expressions. Nonetheless, a desirable architectural goal would be the capability for each component to receive a standard abstract syntax representation of an expression along with operations also expressed in the abstract syntax and to produce its result in an abstract syntax representation.

Computer algebra developments, such as MACSYMA and SMP, have concentrated on integrating the ED, DISP and COMP components, while the document formatting functionality is provided through unidirectional translators. Integrated multimedia document formatting systems, such as Andrew, have concentrated on combining the ED, DISP and DOC components, completely separate from a mathematical computation capability (is this true?). Our research prototype, CaminoReal, attempts to combine all four components, ED, DISP, COMP and DOC, at the same time.

2.4 Miscellaneous notes: (some issues unresolved)

issue of standard names for functions across various COMP e.g. translation or modification of names acceptable to system do we need to introduce the concept of an algebra server (GLUE) which buffers the ED/DISP from the COMP.
providing name translation and routing to COMP issue of mathematical objects within documents retaining sufficient semantics, by retaining abstract syntax; additional representations permitted; printing information may be kept too issue of formatting mathematical notation supplied templates may not be sufficient for new notation or unanticipated appearance problems DISP may provide these extensions; unclear how abstract syntax is decorated with private formatting hints must be willing to render abstract syntax that it does not understand! print in a standard form, e.g. Sum[Pow[i,n],i,0,Inf] issue of formatting style for mathematics notation example of in-line equations with solidus or fraction bar, summation limits above/below or as superscript/subscript to sum operator opportunity for automatic logging Done through the computation model (spreadsheet)

At this time, it is appropriate to raise a number of issues that remain unresolved about the architecture and design of standard mathematical systems.

The naming of operations must be standardized. Presently, computer algebra systems use widely varying names. Therefore, a standard set of names must be adopted and translators provided for existing systems.

Mathematical objects within document systems must retain sufficient semantics. At present, mathematical formatters represent only the appearance, while a DOC component must be capable of accepting and retaining the abstract syntax representation of a mathematical expression. Additional representations would be permitted, especially to incorporate formatting information and style choices.

A DISP must be prepared to cope with abstract syntax that may not be described by the local formatting rules and to display such expressions in some standard recognizable fashion. The formatting rules should be extensible so that new notation templates may be provided or existing templates overridden.
These extensions might be expressed in abstract syntax and might accompany the expression to be rendered.

3. CaminoReal as a prototype SMS

goal was to build a prototype of a mathematics system integrated into a document processing system built in the Cedar environment which encourages such architectures major components of CaminoReal Meddle -- an interactive mathematics editor (ED and DISP) AlgebraStructures -- an exploratory domain-based algebra system (COMP) algebra server -- access to existing algebra systems via distributed network protocols (COMP) Tioga -- a document formatting system with content extensions (DOC) major goals . . . interactive editing of math notation, exploring user interface issues exploring domain-based algebra systems . . . integrating formatted mathematics with semantics active documents as logs of sequence of thoughts, living documents, computed documents

CaminoReal is a prototype of a standard mathematical system, in which one can edit mathematical notation interactively, display mathematics in a traditional two-dimensional appearance, compute symbolically or numerically with the mathematical expressions, and format and organize documents containing such mathematical objects. There are three major components of CaminoReal: Meddle, an interactive mathematics expression editor (combining the functions of an ED and DISP), AlgebraStructures, an exploratory domain-based algebra system (COMP) with remote access to other algebra services via distributed network protocols, and Tioga, the integrated multimedia document production system (DOC).

We believe that such a prototype would reveal interesting architectural issues, many of which were described in the previous section, and would provide opportunities for several new directions for active documents described later in this paper. CaminoReal also provides a testbed for exploring user interface issues concerning manipulating mathematics within documents.

3.1. The Expression Editor

expression editor combines ED and DISP for interactive manipulation of 2-dimensional math notation interactive expression editor modeled after Tioga document editor multiple selections, input actions from keyboard or menus input processing through incremental pseudo-parser incremental updates visible on display

The expression editor in CaminoReal combines the math editor and display components of a standard mathematical system to manipulate two-dimensional math notation interactively. The editor was modeled after the Tioga document editor in Cedar, utilizing multiple selections, accepting input actions from either the keyboard or menu buttons, and processing input actions through an incremental pseudo-parser. The editor updates the display incrementally after each edit is performed.

interesting issue of character codes

CaminoReal finesses a challenge that faces other mathematical expression editors, the presentation of the rich collection of multilingual and technical symbols, by using the Xerox character code standard. This standard assigns a unique code, possibly a 1-, 2- or 3-byte code, to each symbol and subsumes several international standards, such as ISO and JIS, as well as de facto standards, such as the AMS TeX math symbols. Fonts that conform to the Xerox character code standard will have mathematical symbols in standard code positions. Should a font lack particular symbols, a backstop ``kitchen sink'' font with a glyph for every code may be automatically substituted. Otherwise, the scalable font algorithms in the Cedar Imager ensure that symbols will appear on the display or in the document with appropriate size and shape information.
expression editing paradigm expressions are trees, with operators as interior nodes and atoms at the leaves select subexpressions as subtrees; cannot select operators; selections in any CaminoReal tool or object

The editing paradigm is based on expression trees, with operators as interior nodes and atoms as leaves. Each operator is represented by a template that defines the notation for that operator by determining the symbol representing the operator and the structure and appearance of the subexpressions. For example, the summation template specifies a Greek sigma for the operator and three subexpressions: a lower bound, an upper bound and a summand. Such templates are written in the Cedar language and registered with CaminoReal. When templates are first presented, any unfilled expressions are represented by placeholders (X). Thus, the addition template first appears as X.

One may select subexpressions corresponding to subtrees by pointing at the relevant object. This strategy helps ensure that structurally correct expressions are created, at the expense of freedom to manipulate individual operators. The intention is that additional operator templates be added to accommodate new notation rather than permitting the editor to arbitrarily rearrange subexpressions. To facilitate keyboard entry, a complete set of navigational commands is provided, such as select parent, sibling or child of the current selection.

CaminoReal depends on four types of selections: primary, copy, move, and the keyboard caret. The primary selection, made by pointing and clicking with the mouse, establishes the focus of most operations. The copy selection, made by chording the keyboard SHIFT key and clicking the mouse, supplies the argument for copying a subexpression to replace the primary selection. A move selection, made by chording the keyboard CTRL key and clicking the mouse, supplies the argument for moving a subexpression to replace the primary selection.
The keyboard caret identifies the expression most recently entered by a keyboard input action and determines where the next keyboard input will be directed.

All editing operations are available from menu buttons and several common ones are mapped onto keyboard keys. CaminoReal uses a pseudo-parser in the Cedar user interface management system to map key presses into editing operations. This provides a feel similar to an operator precedence grammar, but is unfortunately not very natural. For example, the keyboard sequence x^2+1=0 requires the embedded `select parent' operations to get the proper associations for the expression X.

Modifications to the primary selection may occur either by a replace operation or a wrap operation. In a replace operation, the subexpression in the primary selection is deleted and a new subexpression, determined by either a keyboard input action or a menu choice, is inserted in the expression tree. In a wrap operation, the primary selection appears as one of the placeholders in the new subexpression. For example, consider the first placeholder in the addition expression X as the primary selection. Typing the letter `r' replaces the first placeholder with X resulting in the new expression X, and subsequently typing the key `^' wraps a superscript template around the X to produce X (replacing the X with a subscript template would produce X).

The math expression editor provides several `creature comforts' suited to interactive editing of complex notation, especially the undo command and various scaling operations to improve the readability on low resolution display screens.
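The replace and wrap operations can be made concrete with a small sketch of an expression tree. This is not Meddle's implementation (which is written in Cedar); the Node class, the "?" placeholder glyph, the 0-based child paths, and the function names are assumptions chosen for illustration. The sketch follows the example in the text: start from an addition template, replace its first placeholder with r, then wrap that selection in a superscript template.

```python
from dataclasses import dataclass, field

PLACEHOLDER = "?"  # stands in for the placeholder glyph of an unfilled slot


@dataclass
class Node:
    op: str                                   # operator name, or atom text for a leaf
    children: list = field(default_factory=list)

    def show(self):
        """Print the subtree in functional (abstract syntax) notation."""
        if not self.children:
            return self.op
        return self.op + "[" + ", ".join(c.show() for c in self.children) + "]"


def template(op, arity):
    """A fresh operator template whose slots are all placeholders."""
    return Node(op, [Node(PLACEHOLDER) for _ in range(arity)])


def select(root, path):
    """Select a subtree by following 0-based child indices from the root."""
    node = root
    for index in path:
        node = node.children[index]
    return node


def replace(selection, new):
    """Replace: the selected subtree is discarded and `new` takes its place."""
    selection.op = new.op
    selection.children = new.children


def wrap(selection, op, arity):
    """Wrap: the selected subtree becomes the first slot of a new template."""
    old = Node(selection.op, selection.children)
    wrapper = template(op, arity)
    wrapper.children[0] = old
    replace(selection, wrapper)


expr = template("Plus", 2)          # Plus[?, ?]
primary = select(expr, [0])         # first placeholder is the primary selection
replace(primary, Node("r"))         # Plus[r, ?]
wrap(primary, "Pow", 2)             # Plus[Pow[r, ?], ?]
print(expr.show())
```

Because only whole subtrees can be selected, every intermediate state is a well-formed tree, which is the structural guarantee the editing paradigm trades against the freedom to manipulate individual operators.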
selections selections primary (first, made when pointing), secondary (second, made after primary by chording keys), move (pending delete), caret (keyboard type-in) selection operations through mouse positioning or navigation operations (select parent, select child); pointing finds smallest bounding box operations on selections copy (s->p), move (s->p, delete s), swap (exchange p<->s) invoke operator through input actions (keyboard or mouse click on menu) templates (what are they) define notation (extensible representation through registering notation and composing procs) placeholders (X) identify null expressions yet to be supplied (enter + character and get X) enter expressions through replace and wrap replace primary selection with new object entered from keyboard or chosen from menu example: select first placeholder in X and type letter r to get X wrap primary selection by deleting selection and replacing by template example: select X and invoke replace with a^b template to get X, which wraps a superscript template around X, (replace would produce X) [[input parser for operator precedence, but not very natural]] keyboard entry of expressions requires navigating among expression tree: enter x^2+1=0 to get X, where is needed to select the parent of the most recent subexpression to wrap the + operation enter ($x for prefix operators like there exists to get X editing expressions supplying type of object keyboard input creature comforts of an interactive tool undo! scaling of viewer with scroll bars

ISSUES: input parsing, pseudo-infix, how far can you go with autowrap?
ISSUES: conversions between types
ISSUES: selecting operators or only arguments

3.2. The Algebra Package

CaminoReal provides two methods for algebraic computation, both integrated through the same tool interface: 1) an experimental domain-oriented package, AlgebraStructures, locally implemented in Cedar, and 2) several traditional algebra packages accessed as algebra servers over the network.
One may evaluate expressions at several levels of detail, from selected subexpressions to complete expressions to entire documents. The result of evaluating an expression may either replace the expression in place or be presented in a new CaminoReal tool. The integration of this computation capability and the interactive editing environment, especially utilizing the undo command, provides a very handy tool for exploring complicated expressions.

The Cedar AlgebraStructures package supports a limited set of algebraic operations defined within a mathematical domain. Domains include expressions, variables, integers, booleans, rationals, reals and complex numbers. Domain structures include sets, families of sets, sequences, vectors, matrices and polynomials. For example, one could create a matrix of polynomials on a sequence of variables of rational numbers. Evaluation of expressions produces a result that is consistent with the domains within the expression.

CaminoReal provides two methods for algebraic computation, both integrated through the same CaminoReal tool interface: -- an experimental algebra package within Cedar -- remote access to algebra servers on the network evaluation choices available in the tool to integrate computation also global flag for "evaluate before render" evaluation result replaces selection or presented in a new tool selection might be either in tool or document (does implicit copy from, eval, copy to operations) undo operations, both in tool and document editor, are very handy for exploring complex expressions! AlgebraStructures package explores a domain-oriented architecture (?)
supports domains for expressions, variables, integers, booleans, rationals, reals, complex numbers domain structures include set, family of sets, sequences, vectors, matrices, polynomials therefore, operate upon matrix of matrix of polynomials of sequence of variables of rationals <> operation choices are presented appropriate for the domain in selection or working domain (?) evaluation of mathematical expressions to some domain result environment is global to CaminoReal (across all documents and tool instances) introduce name binding operator, X necessary metafunctions to kill values in environment and to quote or unevaluate expressions, X

Remote access to algebra servers on network similar paradigm to local computation: select expression and evaluate, with result replacing selection requires converting selection into abstract syntax appropriate for algebra system example of integral: X evaluating with AlgebraStructures only returns a canonical form: X

Converting the original integral into abstract syntax will give:

Int[Div[Minus[1, Mult[2, Pow[x, 3]]], Pow[( Plus[1, Pow[x, 3]] ), 2]], x]

while converting it to Reduce form will give:

int(quotient(difference(1, times(2, expt(x, 3))), expt(( plus(1, expt(x, 3)) ), 2)), x)

If we send this off to Reduce, after about ten seconds we get back X in our CaminoReal viewer.

3.3. The Integrated Document Formatter

desire is to integrate mathematical content into document as a document object CaminoReal implementation limited by Tioga to creating passive artwork objects artwork objects require procedures to return a bounding box for layout, to paint in an imaging context (device independent, might be display or an Interpress master) objects may be in-line expressions or displayed notation DOC component relies on DISP component to format notation (true?)
strategy is to have several CaminoReal tools for editing and manipulating mathematical objects tool operations to manipulate math objects in Tioga documents send math expression to document extract math expression from document evaluate math expression in document

3.1. The Basics of Editing Expressions

basic requirements: interactive expression editor should be integrated into document editor, presently separate tool single incremental parser for editing and computing expressions single across editor and algebra system incremental to update display after each keystroke complication by insertion of templates along with ordinary parsing editor is tree insertion and replacement based, rather than an operator precedence parser <>

CaminoReal provides an interactive expression editor through an editing tool interface. The user interface paradigm follows the Tioga document editor quite closely, with multiple selections made via a mouse pointing device, and input actions from both the keyboard and mouse activation of menu commands that operate upon the selections. These input actions are parsed by a user interface management system that uses state transitions rather than a formal grammar specification. Incremental modifications to the expression are reflected on the display.

CaminoReal has a pseudo-parser for input actions; discussions indicate that formal parsing is possible; we intend to pursue this . . .

3.2. Character sets

interesting issue, finessed by Xerox character set complete binding of character codes to positions in large (2- or 3-byte) code space each font potentially provides all glyphs in Xerox character set question: how to input symbols beyond keyboard keytops? modifying key, such as control or meta visible keyboard selection or menus of selectable characters name lookup translations

An interesting challenge for other mathematical expression editors is the treatment of special symbols.
This issue is finessed in CaminoReal through the use of the Xerox character code standard, in which a unique code (a 1-, 2-, or 3-byte code) is assigned to each multilingual and technical symbol in the standard. Therefore, mathematical notation symbols and foreign language symbols are found in each type font at specific character code locations. The Xerox character code standard is comprehensive and incorporates many international standards, such as ISO and JIS, and several de facto standards, such as the AMS TeX math symbols.

Nonetheless, there remains the challenge of identifying and inputting symbols beyond the standard keyboard keys. CaminoReal employs a tentative solution by using menus for special symbols and Greek letters. Other Xerox systems use virtual keyboards to map the keys into arbitrary character codes and abbreviation name lookup translations to map token identifiers into symbols.

what else might one say . . .

3.3. Selecting Expressions

CaminoReal allows you to select arguments (complete sub-trees) but not symbols function names (nodes or incomplete subtrees) (expressions?) some esoterica about which is which placeholder, template, replace, wrap

ISSUE: cannot select operators: example of summation symbol: select any of subexpressions or entire expression cannot select only the sigma symbol (empirical experiment proves you cannot enter summation with no upper limit in CaminoReal!) current system forces you to have different templates for each example

ISSUE: no editor conversions between notations: changing operator without changing all of the operands issue of semantics vs appearance of mathematics expressions are trees operators and atoms selection operations to navigate through tree four selection types: primary, copy (secondary), move (pending delete), keyboard type-in (caret) select with mouse or keyboard pointing finds smallest selection using bounding box of subexpressions

3.4. Entering Expressions

expressions created using two schemes: replace or wrap using two devices: mouse or keyboard some effort made to emulate operator precedence. not very natural replace replaces primary selection with new object, selected from menu wraps wraps primary selection by deleting selection and replacing with a template and then fill first placeholder with deleted selection example: select a, wrap with + using template x+x, replace first place holder with a, becomes a+x use pop-up menu for type of object for some types, value comes from Tioga text selection for variables, name comes from Tioga text selection for parseRope option, object comes from parsing Tioga text selection (useful for command files?) for objects with dimensions, e.g. matrix, dimension comes from another menu using keyboard keyboard input replaces primary selection some types typed directly: integer, real, variable names some operators are mapped onto the keyboard sum, difference, negation, product, subscript/superscript, parens, braces, factorial, exists, for all, logical, relations pseudo infix implies automatic wrapping of templates around expressions selection operations, especially select parent, very useful to avoid requiring mouse to make suitable selections example of x^2+1=0 unary operators have suitable wrapping behavior, either prefix or postfix example of ($x

4. Expressions and Documents (becomes 3.3)

desire is to integrate mathematical content into document as a document object CaminoReal implementation limited by Tioga to creating passive artwork objects strategy is to have several CaminoReal tools for editing and manipulating mathematical objects convenience operators to deal with math objects in Tioga documents stuff math object into Tioga document (ToTioga) extract math object from Tioga document (FromTioga) [ignore issue of scaling for now]

5. Computations in the small.
Manipulation of expressions

CaminoReal provides algebra service in two ways: some resident algebra facilities (both a research project and a convenience) some network-accessible algebra services

5.1 Cedar AlgebraStructures package

domain oriented expressions, variables, integers, booleans, rationals, reals, complex numbers domain structures structure of specified domain set, family of sets, sequences, vectors, matrices, polynomials therefore, matrix of matrix of polynomials of sequence of variables of rationals evaluation of mathematical expressions to some domain result [recording succession of evaluations is the basis of math proofreader and audit trails] evaluation choices evaluate selection in place evaluate Tioga math object in place (does from, eval, to operations) operator choices operation choices appropriate for domain displayed in pop-up menu operation choices for working domain environment one global environment for CaminoReal across all documents, sigh! introduce assignment (name binding operator) metafunction to kill assignments

5.2 Computation from algebra servers

paradigm is: select expression, or sub-expression, invoke algebra system, result replaces selection requires converting selection to abstract syntax (I presume) appropriate for algebra system example of integral, abstract syntax for SMP and Reduce, result from Reduce

This allows the user to use different systems without knowing the different command names and styles used in each. However, this is rather primitive at the moment.

5.3. Issues for future work on computation

extension of CaminoReal to multiple algebra requests outstanding a potential solution outlined using strips or panes and multiplexing requests to the algebra server (have we discussed an algebra server?)
issue of multiple algebra processes keep them to a minimum (Unix process overhead, I guess) each process starts with clean state or environment results returned via stream or file depending on reliability constant: 30 secs issue of naming algebra system functions give job to user interface! issue of exception reporting issue of user query to status of running algebra process issue of caching communication streams for conversation

6. Computed Documents. Managing relationships between expressions

6.1 Introduction

with assignment we get access to environment with evaluation we get access to computation sequence of math objects in Tioga documents gives us computational documents mathematics can be assigned to environment and computed on-the-fly two uses immediately obvious: spreadsheets or mathematical form letters mathematical proofreading

6.2. MathEval

on/off global flag to enable computation before painting Tioga math object

6.3 SpreadSheets

What is a SpreadSheet Cells. Value, formulae and display Rectangular layout. A grid with simple naming Manipulation. Point and select to construct a formula Evaluation. Dependency graph to evaluate cells. Read write locks Extensions Free Form Layout. Get rid of Rectangular layout Information hiding through overlapping cells Information hiding through hypertext. Dependency graph spans note card. Algebra. Allow non-numerical manipulations

6.4 Audit Trails

Audit trails are a view of the dependency graph

6.5 MultiDocuments

6.6 Hypertext