Keith A. Lantz
Department of Computer Science
Stanford University
Stanford, CA 94305
Many contemporary user interface management systems suffer from the lack of adequate operating system support for multi-process structuring. They either adopt a single-process server approach, resulting in monolithic code, or are implemented as run-time libraries, resulting in a high degree of redundancy and complex synchronization problems. This paper, on the other hand, describes a methodology that takes advantage of lightweight processes and fast interprocess communication to structure user interface software as multiple cooperating processes. It demonstrates that such an approach can lead to exceptional flexibility — with respect to the addition of new functionality, in particular — without sacrificing performance.
2. Target environment
Imagine an environment consisting of Sun- and VaxStation-class workstations with a mix of monochrome, gray-scale, and color displays, some with special peripherals (such as speech-processing units, digital scanners, or digital pen-and-tablet sets), all employing shared file servers, backend computation engines, and other specialized facilities. All workstations are interconnected by a high-speed institutional (inter)network. Institutional networks are in turn interconnected by long-haul networks using T1 (or equivalent) interconnections.
From the point of view of a user sitting in front of one of these workstations, the following functionality is available:
· The user may engage in multiple activities simultaneously, each of which may run local to the workstation, on a remote host, or distributed across several hosts.
· The user may interact with each application independent of the others, possibly using multiple communications media simultaneously.
· The user may ``transfer'' one application's output to another application's input, independent of data types involved. For example, data is automatically converted from text to voice to graphics, whenever possible.
· The user may engage in a real-time conference with any number of other users, and may invoke any available applications from within the framework of the conference. The output from the application is presented at every workstation, and each participant can provide input.
· Typically, the user interacts with applications via techniques of her own choosing. She may request multiple independent presentations (1) of the same underlying data, each associated with different interaction techniques. Similarly, in the context of a real-time conference, each user may request different views of the shared data. The mapping between application abstractions and interaction techniques is performed by the system. (2)

(1) The term ``presentation'' is meant to allow for media other than display devices.
(2) Some applications rely on being able to use a particular interaction technique or communications medium (or, worse, a particular device), which may be different from the techniques and media the user prefers to use. For example, it does not make much sense to render an image as a menu or to generate a voice description (of the bits).
· The system maintains sufficient context information to provide effective help and error recovery. This might include intelligent modeling of the user, such that the system is capable of taking action on behalf of the user.
· Complete backward compatibility is provided, such that regardless of the evolution of the user interface, the user may always access those applications to which she previously had access.
From the point of view of a user interface developer, on the other hand, the key attributes of the user interface software are as follows:
· It is straightforward to port the software to another computer or to another operating system, not just to another device.
· It is straightforward to configure (or extend) the software to accommodate different users, or classes of users, or application environments. Both static and dynamic configuration are supported.
These two characteristics imply in turn that the interface software is modular — following basic principles of good system design.
Naturally, these characteristics represent the ideal, and achieving all of them at once may be impractical, if not impossible. Nevertheless, they represent a noble goal. Further elaboration on each of them, including (additional) motivations, may be found in [13].
3. Implementation methodology
There are indeed some (if remarkably few) examples of human-computer interfaces that meet many of the ``user goals'' presented above. Unfortunately, the associated software rarely meets the ``developer's goals'', even in those situations where the software conforms to contemporary user interface management system (UIMS) or user interface toolkit models [21,22,26]. For example, most (server-based) user interface management systems are too monolithic, do not interface well with standard graphics packages, and provide inadequate multi-tasking support. Most (library-based) user interface toolkits, on the other hand, tend to result in redundant code and to introduce complex synchronization problems. Moreover, most approaches of either ilk deal only with traditional media, such as keyboards, displays, and mice.
We are addressing these deficiencies via a hybrid of the UIMS and toolkit approaches, one that can accommodate ``new'' media. The implementation methodology is based in large part on experience with the user interfaces for two major distributed operating systems, Rochester's Intelligent Gateway [14] and Stanford's V-System [2] (articles that focus on the related user interfaces include [10,11,15,18,20]). Yet another implementation, which more closely follows the methodology presented here, is underway.
3.1 Decomposition of function
The adopted methodology is based on a derivative of the linguistic analysis model of human-computer interaction [6]. The derivative model introduces the notion of ``medium'' (in addition to ``device''), accounts for multi-tasking, and is stated in terms intended to provide better guidance for the developer. As discussed in a companion report in this issue [19], there are four levels of I/O:
· device-dependent I/O
· media-dependent I/O
· media-independent I/O
· workstation management requests
and five basic hardware/software components:
· hardware devices
· workstation agent
· workstation manager
· dialogue manager(s)
· applications
Figure 1. The basic user interface architecture.
The manner in which these components are interconnected by the various types of I/O is depicted in Figure 1. This figure differs from the corresponding one in [19] in the decomposition of the application into a frontend and a backend. There are two principal ways to view this split:
· The frontend, if it exists, is responsible for implementing application-specific interaction techniques that augment the facilities provided by the dialogue managers and workstation agent, whereas the backend implements the semantics of an application — its algorithms. Splitting the two components can be especially advantageous in the context of distributed applications, where the frontend remains on the user's workstation while the backend is distributed on one or more remote hosts.
· The frontend is actually a protocol translation agent [9,16] for an application that would otherwise be incapable of communicating with the user interface software. Such agents, however they are implemented, are a fundamental mechanism for providing backward compatibility.
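The second view can be illustrated with a small sketch. The legacy operation names and the abstract request vocabulary below are hypothetical, invented for illustration; the paper specifies no such protocol.

```python
class ProtocolTranslationAgent:
    """Sketch of a frontend acting as a protocol translation agent: it
    re-expresses a legacy application's terminal-style requests as the
    media-independent requests the dialogue manager expects."""

    LEGACY_TO_ABSTRACT = {
        "WRITE_LINE": "put_text",   # legacy output -> abstract output request
        "READ_LINE": "get_text",    # legacy input -> abstract input request
        "BELL": "alert",
    }

    def __init__(self, send_to_dialogue_manager):
        self.send = send_to_dialogue_manager

    def handle(self, legacy_op, *args):
        abstract_op = self.LEGACY_TO_ABSTRACT[legacy_op]
        self.send((abstract_op, args))

# The "dialogue manager" here is just a list collecting translated requests.
requests = []
agent = ProtocolTranslationAgent(requests.append)
agent.handle("WRITE_LINE", "hello, world")
agent.handle("BELL")
```

The legacy application never learns that the user interface has changed underneath it, which is exactly the backward-compatibility property the split is meant to provide.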
3.2 Flow of control
The basic flow of control is as follows: When the system is initialized, the workstation manager creates a default dialogue manager. At a later time, the user may interact with the workstation manager to create other dialogue managers or to create new windows for existing applications. Typically, however, one dialogue manager will suffice; it interacts with the user (via the workstation agent) to determine what applications (or methods) should be invoked, to specify the arguments to those commands (or methods), and to invoke frontend and backend services as necessary. Naturally, any number of applications may be running at any time; the user controls the input focus via the workstation manager.
This flow of control is typical of what has sometimes been referred to as external control, where the user guides the interaction [26]. Unfortunately, the vast bulk of existing applications exert internal control; they drive the interaction. Such applications are accommodated in two ways. First, all requests for arguments can be directed to the dialogue manager rather than to a specific workstation agent (or device driver). The dialogue manager can then determine the best interaction technique to use to specify each argument — just as it determines the interaction technique(s) for specifying a complete command. Second, a pre-existing ``shell'' or ``executive'' may be invoked as an application, from which the ``real'' applications may then be invoked.
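The external-control flow can be sketched as follows, with Python threads standing in for lightweight processes and a queue standing in for the input stream from the workstation agent. All names and the toy ``date'' application are hypothetical.

```python
import queue
import threading

class DialogueManager(threading.Thread):
    """One dialogue manager thread: it reads user commands forwarded by
    the workstation agent and invokes the corresponding application."""

    def __init__(self, agent_events, applications, log):
        super().__init__()
        self.agent_events = agent_events    # input stream from the workstation agent
        self.applications = applications    # command name -> callable backend
        self.log = log

    def run(self):
        while True:
            command = self.agent_events.get()
            if command is None:             # shutdown sentinel from the workstation manager
                return
            backend = self.applications.get(command)
            result = backend() if backend else "unknown command"
            self.log.append((command, result))

# At initialization the workstation manager creates a default dialogue manager.
agent_events = queue.Queue()
log = []
default_dm = DialogueManager(agent_events, {"date": lambda: "27 April 1987"}, log)
default_dm.start()

agent_events.put("date")    # user input, serialized by the workstation agent
agent_events.put(None)
default_dm.join()
```

Creating a second dialogue manager is just another thread construction, which is the run-time instantiation discussed in Section 3.4.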
3.3 A multiplicity of dialogue managers
From the discussion thus far the reader may have inferred another difference between the model assumed here and that presented in [19]: support for a hierarchy of dialogue managers, rooted at the workstation manager. The manner in which the user interacts with the workstation manager — the meta-dialogue with the system — may be fundamentally different than the manner in which the user interacts with a typical application. Therefore, while the workstation manager remains a client of the workstation agent, it is regarded more as a (special) dialogue manager than as an application. Similarly, pre-existing shells invoked by a dialogue manager can themselves be regarded as dialogue managers, however antiquated. In the remainder of this paper, we will continue to use different terms for these three basic types of dialogue manager, but the reader should remain aware of their underlying similarities.
3.4 Concurrency and multi-process structuring
Another implication of the previous discussion is the availability of facilities for run-time ``instantiation'' of code, as when the workstation manager creates dialogue managers. Although this can be achieved in a shared-memory environment via dynamic linking facilities, the more general approach is to use multiple processes. Aside from support for run-time instantiation and dynamic binding, there are many other motivations for the use of concurrent programming techniques in the construction of interactive software. The reader is referred to the companion report in this issue [19] for a general discussion of motivations and techniques. It suffices to say here that we have adopted the ``team of processes'' approach discussed in that report.
In particular, our implementation methodology relies on the availability of lightweight processes and fast interprocess communication — through message-passing, shared memory, or both. Each of the four major software components discussed above — applications, dialogue manager(s), workstation manager, and workstation agent — consists of one or more processes sharing a single address space. Processes sharing the same address space may synchronize their access to shared data with spin locks or messages. Processes in different address spaces communicate solely via messages.
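The two synchronization regimes can be sketched side by side, with threads standing in for lightweight processes sharing an address space and a queue standing in for the interprocess message channel. The example is illustrative only; the message names are invented.

```python
import queue
import threading

# Within one address space: cooperating processes synchronize their
# access to shared data. The paper mentions spin locks; a mutex serves here.
shared_event_count = 0
lock = threading.Lock()

def team_member():
    global shared_event_count
    for _ in range(1000):
        with lock:
            shared_event_count += 1

team = [threading.Thread(target=team_member) for _ in range(4)]
for t in team:
    t.start()
for t in team:
    t.join()
# shared_event_count is now exactly 4000: the lock made the updates atomic.

# Across address spaces: communication is solely by messages. A queue
# stands in for the interprocess communication channel.
channel = queue.Queue()
channel.put(("redraw", {"window": 1}))    # e.g. dialogue manager -> workstation agent
message = channel.get()
```

Without the lock, the four team members would interleave their read-modify-write sequences and lose updates; the message channel avoids the problem entirely by never sharing the data.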
4. Fine structure
4.1 The workstation agent
First, for purposes of this discussion, a device that implements both input and output is treated as consisting of two separate devices.
In order to be responsive to all input devices, each is associated with an independent process that accepts input only from its associated device. However, accumulated experience suggests that it is easier to write applications that accept a single stream of input events than multiple streams, one per device. In our model, of course, most applications are not (or need not be) aware of individual devices or even media, but the dialogue manager must be. Therefore, it is desirable that the workstation agent optionally serialize the input from multiple devices that is logically part of the same input stream (dialogue). This suggests adding another process, the input demultiplexor.
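A minimal sketch of this arrangement, with one thread per device feeding a shared queue that the demultiplexor drains into a single serialized stream (device names and events are illustrative):

```python
import queue
import threading

def device_reader(device_name, events, to_demux):
    """One lightweight process per input device: it blocks on its device
    (simulated by a list here) and forwards each event, tagged with its
    source, to the input demultiplexor."""
    for event in events:
        to_demux.put((device_name, event))

to_demux = queue.Queue()
readers = [
    threading.Thread(target=device_reader, args=("keyboard", ["a", "b"], to_demux)),
    threading.Thread(target=device_reader, args=("mouse", [(10, 20)], to_demux)),
]
for r in readers:
    r.start()
for r in readers:
    r.join()

# The input demultiplexor drains the per-device traffic into one
# serialized stream for the dialogue in progress.
serialized = []
while not to_demux.empty():
    serialized.append(to_demux.get())
```

Events from different devices interleave nondeterministically, but the order of events from any single device is preserved, which is the property the serialized stream must guarantee.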
On the output side the situation is somewhat different; the system is driving the device. This means that it is not necessary to have a separate process waiting at all times to serve that device. On the other hand, it is desirable to have separate processes for each communications medium — an audio manager, a video manager, and a display manager, say. The display manager, in particular, would provide all the ``standard'' graphical output facilities of the system — the output facilities of GKS or PHIGS, for example. In fact, it should be capable of providing multiple sets of facilities at the same time.
The workstation agent also must provide the basic support for multitasking. Specifically, the input demultiplexor must be cognizant of multiple input streams and each output handler must be cognizant of multiple output streams. In the case of the display manager, for example, it should support multiple display files. Input and output streams should be kept separate, thus allowing an application to have one input stream but multiple output streams. The one exception to this rule concerns feedback.
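The display manager's support for multiple output streams can be sketched as one display file per stream; the class and method names are hypothetical, not taken from TheWA.

```python
class DisplayManager:
    """Sketch of a display manager keeping one display file per output
    stream, so that a single application can hold several output streams
    (while still having only one input stream)."""

    def __init__(self):
        self.display_files = {}

    def open_stream(self, stream_id):
        self.display_files[stream_id] = []

    def draw(self, stream_id, primitive):
        self.display_files[stream_id].append(primitive)

    def replay(self, stream_id):
        # Replaying a display file regenerates that stream's image,
        # e.g. after the corresponding window is moved or exposed.
        return list(self.display_files[stream_id])

manager = DisplayManager()
manager.open_stream("main")     # one application...
manager.open_stream("status")   # ...with a second output stream
manager.draw("main", ("line", (0, 0), (100, 100)))
manager.draw("status", ("text", "ready"))
```

Because each stream's primitives are retained separately, the two streams can be redrawn, moved, or closed independently of one another.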
Recent experience has shown that the feedback facilities of standard graphics packages conflict with contemporary interest in object-oriented programming and direct manipulation [5]. Specifically, very little of what is referred to as ``feedback'' relies solely on lexical information, so most of it cannot be implemented at the workstation agent level. Nevertheless, to the extent that lexical feedback is desirable, it must be possible to indicate to the input handlers when and how they should communicate with their corresponding output handlers.
The resulting structure looks like that in Figure 2, which depicts the workstation agent (TheWA [17]) currently implemented for the V-System.
Figure 2. The Workstation Agent.
4.2 The dialogue manager
Analogous to the use of multiple input and output handlers, the dialogue manager employs multiple dialogue handlers, one for each ongoing dialogue. A ``master'' dialogue manager may be employed to synchronize access to the various shared databases, or traditional shared memory techniques could be employed. (Remember that all dialogue manager processes share the same address space in this architecture.) However, each dialogue handler is single-threaded, simplifying its implementation.
The information being shared includes:
user profiles: Includes the preferences, authorizations, etc. of the user.
dialogue specifications: The specification of a dialogue in an abstract, media-independent syntax.
media-independent to media-dependent bindings: The different media-dependent representations of each abstract object.
media-dependent to device-dependent bindings: The different devices (and associated interaction sequences) for each communications medium.
help information: Keyed by...
history, context, and names: This information typically will be distributed throughout the various components of the system, not just isolated in a single database.
The common algorithms being shared are those that make decisions, based on the various constraints, as to how best to interact with the user or the application at any given time. For example, when operating under external control, each dialogue handler is ``driven'' by a dialogue specification. For every abstract object, the appropriate decisions must be made as to which input events are acceptable or which output events should be generated. Typically, this entails translation from abstract (media-independent) to media-dependent representations, bearing user preferences and the like in mind. Sometimes, the dialogue specification will mandate the use of a specific medium or even a specific device.
When running an application that is exerting internal control, the dialogue handler merely accepts requests from the application rather than being driven by a dialogue specification. If the application requests input or output in a media-independent fashion, it is still possible for the dialogue handler to decide which specific interaction techniques to employ.
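The translation from abstract to media-dependent representations can be sketched as a binding lookup. The binding table, abstract object, media names, and techniques below are all hypothetical, invented for illustration.

```python
# Hypothetical media-independent to media-dependent bindings.
BINDINGS = {
    "confirm": {                    # an abstract object in the dialogue specification
        "display": ("menu", ["yes", "no"]),
        "audio": ("spoken-prompt", "say yes or no"),
    },
}

def choose_technique(abstract_object, user_profile, mandated_medium=None):
    """Bind an abstract object to a media-dependent representation: a
    medium mandated by the dialogue specification wins; otherwise the
    user's preferred media are tried in order."""
    bindings = BINDINGS[abstract_object]
    candidates = [mandated_medium] if mandated_medium else user_profile["preferred_media"]
    for medium in candidates:
        if medium in bindings:
            return medium, bindings[medium]
    raise LookupError("no usable binding for " + abstract_object)

profile = {"preferred_media": ["audio", "display"]}
```

The same lookup serves both control regimes: under external control the dialogue specification supplies the abstract object (and possibly a mandated medium), while under internal control the application's media-independent request does.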
Finally, to the extent possible, the dialogue handlers provide syntactic feedback. Especially when under external control, the handler will do a fair amount of input processing, generating feedback directly to the various output handlers.
4.3 The workstation manager
As noted above, the workstation manager is a ``special'' dialogue manager, one that manages the meta-dialogue consisting of all ongoing dialogues. A single process implementation will suffice in this case. Since it too interacts with the user, it must have access to some of the databases discussed in the previous section. At first glance, then, it seems reasonable to place the workstation manager on the same team as the other dialogue managers; indeed, the workstation manager could then be the ``master'' dialogue manager alluded to above.
However, if the workstation manager is in the same address space as the ``normal'' dialogue manager, neither can be as easily replaced independent of the other. Moreover, as previously discussed, the meta-dialogue for which the workstation manager is responsible may be qualitatively different than the typical user-application dialogue and, therefore, the databases may not be the same after all. Therefore, we suspect that the preferred approach is to place the workstation manager in a separate address space, replicating whatever information is common between it and the normal dialogue manager team.
9. References
[1] Beach, R.J., Beatty, J.C., Booth, K.S., Plebon, D.A. and Fiume, E.L. The message is the medium: Multiprocess structuring of an interactive paint program. In Proceedings of SIGGRAPH '82 (Boston, Mass., July 26-30, 1982). In Computer Graphics 16, 3 (July 1982), 277-287.
[2] Berglund, E.J. An introduction to the V-System. IEEE Micro, (August 1986), 35-52.
[3] Booth, K.S., Cowan, W.B. and Forsey, D.R. Multitasking support in a graphics workstation. In Proc. 1st International Conference on Computer Workstations, (November 1985), IEEE, 82-89.
[4] Brown, M. and Heck, M. Understanding PHIGS: The Hierarchical Graphics Standard. Megatek Corporation, San Diego, CA, 1985.
[5] Dance, J.R. et al. Report on run-time structure for UIMS-supported applications. In this issue, Computer Graphics 21, 2 (April 1987).
[6] Foley, J.D. and van Dam, A. Fundamentals of Interactive Computer Graphics, Addison-Wesley, 1982.
[7] Gosling, J. SunDew: A distributed and extensible window system. In Methodology of Window Management, F.R.A. Hopgood, et al. (eds.), Springer-Verlag, 1986. 47-58.
[8] Green, M. The University of Alberta user interface management system. Proceedings of SIGGRAPH '85 (San Francisco, Calif., July 22-26, 1985). In Computer Graphics 19, 3 (July 1985), 205-213.
[9] Green, P.E. Jr. Protocol conversion. IEEE Transactions on Communications COM-34, 3, (March 1986), 257-268.
[10] Lantz, K.A. An architecture for configurable user interfaces. In Foundation for Human-Computer Communication, K. Hopper and I.A. Newman (eds.), North-Holland, 1986. 257-275.
[11] Lantz, K.A. Command interaction in distributed systems. In Proc. Fall COMPCON, (September 1980). 25-32.
[12] Lantz, K.A. An experiment in integrated multimedia conferencing. In Proc. CSCW '86: Conference on Computer-Supported Cooperative Work, (MCC Software Technology Program, December 1986). 267-275.
[13] Lantz, K.A. On user interface reference models. SIGCHI Bulletin 18, 2, (October 1986), 36-42.
[14] Lantz, K.A., Gradischnig, K.D., Feldman, J.A. and Rashid, R.F. Rochester's Intelligent Gateway. Computer 15, 10, (October 1982), 54-68.
[15] Lantz, K.A. and Nowicki, W.I. Structured graphics for distributed systems. ACM Transactions on Graphics 3, 1, (January 1984), 23-51.
[16] Lantz, K.A., Nowicki, W.I. and Theimer, M.M. An empirical study of distributed application performance. IEEE Transactions on Software Engineering SE-11, 10, (October 1985), 1162-1174.
[17] Lantz, K.A., Pallas, J. and Slocum, M. TheWA: Beyond traditional window systems. Internal Memo, Distributed Systems Group, Department of Computer Science, Stanford University.
[18] Lantz, K.A. and Rashid, R.F. Virtual terminal management in a multiple process environment. In Proc. 7th Symposium on Operating Systems Principles, (December 1979). ACM, 86-97.
[19] Lantz, K.A. et al. Reference models, window systems, and concurrency. In this issue, Computer Graphics 21, 2 (April 1987).
[20] Nowicki, W.I. Partitioning of Function in a Distributed Graphics System. PhD thesis, Stanford University, 1985.
[21] Olsen, D., Buxton, W., Ehrich, R., Kasik, D., Rhyne, J. and Sibert, J. A context for user interface management. IEEE Computer Graphics and Applications 4, (December 1984), 33-42.
[22] Pfaff, G.E., ed. User Interface Management Systems, Springer-Verlag, 1985.
[23] Rashid, R.F. Threads of a new system. UNIX Review 4, 8, (August 1986), 36-49.
[24] Scheifler, R.W. and Gettys, J. The X window system. To appear in ACM Transactions on Graphics.
[25] Tanner, P.P., MacKay, S.A., Stewart, D.A. and Wein, M. A multitasking switchboard approach to user interface management. Proceedings of SIGGRAPH '86 (Dallas, Texas, August 18-22, 1986). In Computer Graphics 20, 4 (August 1986), 241-248.
[26] Thomas, J.J. and Hamlin, G. Graphical input interaction technique (GIIT) workshop summary. Computer Graphics 17, 1 (January 1983), 5-30.