UIMS Support for Direct Manipulation Interfaces

Scott E. Hudson
Department of Computer Science
University of Arizona
Tucson, AZ 85721

Abstract

Recently direct manipulation interfaces have received a great deal of attention. The direct manipulation paradigm seems to offer significant advantages, particularly for novice users. In this paper we will consider what direct manipulation is, what it means for user interface management systems, and what we should do about it. We will conclude that to support direct manipulation, syntax should be de-emphasized and decentralized, the presentation component of the system should be made more flexible than in most existing systems, and feedback, particularly semantic feedback, is very important and should be supported in a much more automatic fashion.

Introduction

User interface management systems have traditionally been designed to support the conversational metaphor of user interfaces. Under this metaphor, the interface is seen as a conversation or dialogue between the user and the system. The user expresses actions to be performed in some language, and the system responds. The user then evaluates the result and issues new commands in order to iterate toward some goal. Because of this metaphor, much previous UIMS work has centered on dialogue management, and has implicitly or explicitly made use of the language model of user interfaces [4].

Recently, a new metaphor for user interfaces, that of direct manipulation [15,16], has been widely discussed. Under this metaphor, the user has the illusion of acting directly upon the objects of interest without the intermediary of the system. This contrasts with the conversational metaphor, where the objects of interest are usually referred to abstractly (i.e., by name), their current state or status is displayed only on request, and the user manipulates objects only by asking the system, in some language, to perform actions on them.

In this paper we will consider how user interface management systems may better support direct manipulation. In the next section we will consider how the important aspects of direct manipulation interfaces have been characterized. Section 3 will then examine how this characterization might be related to software oriented concepts, and what UIMS research directions this implies. Section 4 will consider the implications of the conclusions drawn in section 3 with respect to a model of the architecture of a UIMS. Section 5 will discuss some constraints imposed by issues of ease of programming, and finally, section 6 will provide conclusions.

What is direct manipulation?

Direct manipulation interfaces are probably best understood by example. One good example is the Apple Macintosh desktop interface for file manipulation. Figure 1 shows a sample screen from the Macintosh. In this interface, the objects of interest, in this case files and application programs, are displayed to the user in iconic form. Actions are expressed primarily by mouse movements and button presses. For example, to delete a file, the user points at the icon representing the file, presses the mouse button down, and while holding the button down, drags the icon into another icon representing a trash can. Similarly, if the user decides to change the volume of the system's bell, he or she uses the slider on the control panel shown in figure 1 instead of issuing a command. Overall, many common actions are performed by pointing at or dragging objects on the screen. Other operations are performed using menus.
A textual command language or other obviously syntactic mechanisms are never used.

Figure 1.

We have briefly characterized direct manipulation in terms of examples and in terms of a user's illusion of acting directly upon the objects of interest. However, this characterization is far too vague to be of much use in building systems. Shneiderman, who coined the term direct manipulation, gave a more concrete characterization by suggesting that a direct manipulation interface should have [15]:
· Continuous representation of the objects of interest.
· Physical actions or labeled button presses instead of complex syntax.
· Rapid incremental reversible operations whose impact on the object of interest is immediately visible.

In addition to Shneiderman's description, Hutchins, Hollan, and Norman [9] give a more thorough and abstract treatment of direct manipulation. They characterize direct manipulation in terms of a ``feeling of directness'' on the part of the user. They go on to discuss a number of distinct aspects of directness, in particular what they call ``engagement'' and ``distance.'' Engagement is the feeling of dealing ``... not with programs, not with the computer, but with the semantic objects of our goals and intentions.'' Distance can be thought of as a measure of ``... the distance between one's thoughts and the physical requirements of the system under use.''

Hutchins et al. further characterize distance in terms of ``Semantic'' and ``Articulatory'' distances. Articulatory distance concerns the actual form that communication takes; for example, the choice of type and mode of use of input devices, or the kinds of images drawn for the user. A small articulatory distance results when the input techniques and output representations used are well suited to conveying the required information. Articulatory distance is also decreased when the form of inputs and outputs relates to the semantic concepts of the underlying conceptual model. Semantic distance involves the ease with which the user can express desired actions within the concepts of the system. Is it possible to express the concepts of interest concisely? Semantic distance also involves a measure of how closely the user's conception of the task domain matches that of the system. Consequently, it is not an interface issue alone, but involves the application component of the system as well.

To summarize Hutchins et al., direct manipulation involves a feeling of directness on the part of the user. This feeling is characterized by three aspects:
· Engagement - the feeling of communicating with the objects of interest directly.
· Articulatory distance - the degree to which the form of communication with the system reflects the application objects and tasks involved.
· Semantic distance - the degree to which the semantic concepts used by the system are 1) compatible with those of the user, and 2) can be used to easily accomplish the user's goals.

UIMS implications

The characterizations of direct manipulation made by Shneiderman and by Hutchins et al. are useful in understanding what makes a direct manipulation interface. However, these characterizations are expressed in terms of end results (what the user sees) rather than in terms of the software concepts necessary to implement them. In this section we will begin to explore the implications of direct manipulation for the architecture and implementation of user interface management systems. We will consider each of the aspects of directness discussed by Hutchins et al. in turn.
The concept of engagement involves a feeling on the part of the user that he or she is dealing with the objects of interest directly instead of through a third party. One barrier between the user and the objects of interest can be syntax. Syntax of some form is needed to communicate the user's intentions to the system. However, a single overall syntax for the interface as a whole implies communication with the system rather than communication directly with the individual objects of interest. This reduces engagement. Following Shneiderman, we can say that syntax should be in terms of individual objects, should be as simple as possible, and should involve physical actions such as pointing or dragging instead of more linguistic concepts. This implies that previous systems based on transition networks [5,10,18] or grammars [12,13], which control the overall interface, are not good candidates for supporting direct manipulation. On the other hand, the techniques used in these systems might still be adapted to the task of handling the remaining syntax of communicating the user's intentions to individual objects. Adapting these techniques will require that the state of multiple dialogues (potentially one for each object) be maintained at once, and that the user be allowed to move freely between these multiple dialogues with simple actions such as moving a mouse.

In addition to minimizing syntax and turning to a multiple dialogue model of interface, another way of increasing engagement is to provide representations of objects on the screen which reflect the behavior of objects from the application domain. One of the best ways to do this is through feedback. Acting on objects in the application domain almost always has consequences. Allowing the screen representation of these objects to immediately reflect these consequences greatly adds to the illusion that the representation of the object is the object.

Traditionally, feedback has been characterized as lexical, syntactic, or semantic. However, providing the high level of feedback necessary to enhance engagement often requires merging these levels. In particular, it is often necessary to provide semantic feedback even at what would normally be considered a lexical level. A good example of this involves dragging. Dragging an object on the screen is normally a lexical operation. However, since moving an object has a semantic consequence, engagement can be greatly increased by providing feedback about potential consequences as dragging is performed. For example, in the Macintosh desktop interface one may, in some cases, drag objects into other objects, such as putting a file in a folder. In other cases dragging one object on top of another has no special semantic consequence. The semantic difference between these lexically identical operations is clarified by highlighting objects when something is to be placed in them, and doing nothing when something is to be placed on them. This kind of feedback, along with such techniques as semantically driven gravity fields or sophisticated snapping techniques [2], can greatly enhance the intuitive feeling of manipulating objects directly. However, providing this capability in a general way could be quite expensive, since we must reflect the arbitrary semantics of the application domain. Providing semantic feedback, particularly when it manifests itself at the lexical level, is one of the major challenges UIMS's must face to support direct manipulation.
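To make the preceding point concrete, the fragment below sketches one way a UIMS might consult object semantics during a drag. It is a minimal sketch in Python, and all of the names in it (ScreenObject, can_accept, drag_feedback) are invented for illustration; they are not drawn from the Macintosh or from any of the systems cited.

    # A minimal sketch, with hypothetical names throughout.  Each screen
    # object answers a semantic query, can_accept(); the UIMS consults it
    # on every mouse movement during a drag to decide whether the object
    # under the cursor should be highlighted.

    class ScreenObject:
        def __init__(self, name, kind, accepts=()):
            self.name = name
            self.kind = kind              # e.g. "file", "folder", "trash"
            self.accepts = set(accepts)   # kinds this object can contain
            self.highlighted = False

        def can_accept(self, other):
            # Semantic query: would dropping 'other' into this object
            # have a consequence, rather than merely covering it?
            return other.kind in self.accepts

    def drag_feedback(dragged, objects_under_cursor):
        # A lexical-level event handler that produces semantic feedback.
        for obj in objects_under_cursor:
            obj.highlighted = obj.can_accept(dragged)

    # Dragging a file over a folder highlights the folder; dragging it
    # over another file produces no highlight.
    report = ScreenObject("report", kind="file")
    papers = ScreenObject("papers", kind="folder", accepts={"file"})
    notes = ScreenObject("notes", kind="file")
    drag_feedback(report, [papers, notes])
    assert papers.highlighted and not notes.highlighted

The point of the sketch is that the semantic test lives with the individual object, while the drag itself remains a purely lexical operation handled by the UIMS.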
The next aspect we will consider is articulatory directness. Articulatory directness concerns the actual form of communication between the user and the system. Consequently, it impacts the presentation component of a UIMS most strongly. On the output side, many UIMS's provide a good deal of flexibility in how images are rendered on the screen by allowing access to a general set of graphics capabilities. However, on the input side, systems are rarely as flexible. Usually only a fixed set of interaction techniques is supported. To achieve a high level of articulatory directness for input, it will be necessary to tailor the input techniques used to the objects to which they apply. It is also important to realize that, just as application objects are normally structured objects composed from smaller objects, it is also often necessary to structure input techniques by building complex techniques from simpler ones. This implies the need for some mechanism which can combine the available physical input devices into composite input techniques or abstract devices, as was done by Anson [1].
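The following fragment sketches what such a composition mechanism might look like. It is only a sketch in the spirit of the device model; the names (Device, Drag, connect, fire) are invented for illustration and are not taken from [1].

    # A sketch of composing physical devices into an abstract device.
    # A Device delivers values to whoever has connected to it; a
    # composite device is itself a Device built from simpler ones.

    class Device:
        def __init__(self):
            self.listeners = []

        def connect(self, listener):
            self.listeners.append(listener)

        def fire(self, value):
            for listener in self.listeners:
                listener(value)

    class Drag(Device):
        # Composite device: combines a button and a locator into a
        # higher-level "drag" device that fires (start, current)
        # position pairs while the button is held down.
        def __init__(self, button, locator):
            super().__init__()
            self.down = False
            self.start = None
            button.connect(self.on_button)
            locator.connect(self.on_move)

        def on_button(self, pressed):
            self.down = pressed
            self.start = None

        def on_move(self, position):
            if self.down:
                if self.start is None:
                    self.start = position
                self.fire((self.start, position))

    # Wiring: the composite behaves like any other device.
    button, mouse = Device(), Device()
    drag = Drag(button, mouse)
    drag.connect(lambda span: print("dragging", span))

    button.fire(True)      # press
    mouse.fire((10, 10))   # dragging ((10, 10), (10, 10))
    mouse.fire((42, 17))   # dragging ((10, 10), (42, 17))
    button.fire(False)     # release

Because the composite presents the same interface as a physical device, further compositions (for example, a constrained drag built from a Drag) can be layered in the same way.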
Finally we turn to the issue of semantic directness. Achieving a high degree of semantic directness involves matching the semantic concepts of the system to those of the user. Discovering how the user or a potential group of users will structure the tasks and concepts of the application is outside the domain of the UIMS. What the UIMS must support is the ability to structure the application in a flexible manner. This flexibility is usually provided by treating the application as a set of semantic action procedures written in a general purpose programming language. This approach to the application interface allows the application to be structured as needed within the general purpose programming language. However, it has drawbacks in a direct manipulation context. In particular, to improve engagement, we would prefer to model the application as a set of objects whose properties and behaviors are represented on the screen to the user. When the objects change, feedback is generated to reflect those changes in the representations displayed to the user. While this object-oriented approach is only conceptual and can be implemented using the conventional approach of simply calling semantic action routines, it then becomes the application's responsibility to provide all semantic feedback. As we have seen, semantic feedback which occurs in a lexical context can add greatly to the degree of engagement exhibited by the interface. Under the semantic action routine approach to the application interface, either the application must be involved in the lexical details of the interface, or we must forgo this form of feedback.

Because of this limitation, we believe that a shared object model of the application interface is more appropriate for direct manipulation interfaces. Under this model, application objects are accessible to both the application and the interface. When changes to the objects are made by either party, the other is notified. This allows semantic feedback to be handled by the UIMS and hence allows more automatic support to be provided.

To summarize our conclusions about UIMS support for direct manipulation:
· Syntax should be minimized, using physical actions such as pointing and dragging in preference to more syntactic concepts.
· Syntax should be expressed in terms of individual objects rather than the system as a whole.
· Feedback, particularly semantic feedback, is very important and needs better support.
· Flexibility in the presentation component is important; particularly the ability to design specific interaction techniques and combine these techniques into abstract devices.
· An application interface based on shared objects is preferable to the conventional application interface based on semantic action routines (a sketch of this model follows the list).
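A minimal sketch of the shared object model follows. The names (ActiveObject, watch) are hypothetical; the point is only that a change made by either party is automatically reported to the other, so the UIMS can generate feedback without the application being involved in lexical details.

    # A minimal sketch of a shared, "active" application object.
    # Setting an attribute notifies every registered observer, so the
    # UIMS can redraw and the application can react, whichever party
    # made the change.

    class ActiveObject:
        def __init__(self, **attrs):
            object.__setattr__(self, "_observers", [])
            for name, value in attrs.items():
                object.__setattr__(self, name, value)

        def watch(self, callback):
            # Register either party (UIMS or application) as an observer.
            self._observers.append(callback)

        def __setattr__(self, name, value):
            object.__setattr__(self, name, value)
            for callback in self._observers:
                callback(self, name, value)

    # Both parties watch the same object.
    bell = ActiveObject(volume=3)
    bell.watch(lambda obj, attr, val: print("UIMS: redraw slider,", attr, "=", val))
    bell.watch(lambda obj, attr, val: print("application:", attr, "changed to", val))

    bell.volume = 7   # either party may make the change; the other is told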
Architectural implications

The Seeheim UIMS model [5,6] contains three components, as illustrated in figure 2. The points we have outlined above impact each of these components. The most obvious impact on the model regards the dialogue management component. Since syntax is being minimized, we feel a model without an explicit dialogue component, as shown in figure 3, is more appropriate. In a direct manipulation setting, the syntactic component of the system is not entirely missing. However, because of the emphasis taken, syntax exists primarily at a lexical or interaction technique level, as well as at a semantic level where the current legal operations are determined by the state of the objects being interacted with. It should be noted that minimizing syntax is a current trend among a number of UIMS researchers. Some have gone as far as to handle all syntax by means of defaults [14].

Figure 2.

In addition to the dialogue management component, the above conclusions also strongly impact the application interface component. We have proposed that the application interface component be structured as a set of shared objects. This is done in order to support semantic feedback within the UIMS. In addition to simple shared objects, it would also be useful to store information about the semantics of the objects. One way to do this is to store the objects as active entities, that is, entities which do not simply passively store data, but also react to changes in ways that reflect the semantics of the application domain. Active objects provide a convenient vehicle both for implementing semantic feedback and for automatically notifying the application when objects change. The author has implemented a system based on this approach [7,8], as has Sibert et al. [17].

Figure 3.

The final area of interest is the presentation component of the system. Many UIMS's handle output quite flexibly by allowing the application access to a fairly general graphics package. However, this has the drawback that the application must get involved with the details of presentation. Also, on the input side, much less flexibility is normally provided, and there is often little connection between outputs and inputs other than being able to pick a component of the output. In a direct manipulation framework, inputs and outputs must work together to produce the illusion of directly acting upon the objects of interest. For example, the elements of output need to be manipulated in more sophisticated ways than simple picking (e.g., dragging with constrained movements or semantically driven highlighting). Further, if the UIMS is to automate all or part of the task of semantic feedback, it must also accept more responsibility for automating screen update and the creation of outputs. One approach to this problem is to abstract inputs and outputs into presentation plans which describe how to create presentations based on a series of parameters. This was done for outputs in [3,13] and for a unified description of inputs and outputs in the author's system.

The presentation component of the system is responsible for managing the graphical images and input techniques described by the plans for all active presentations. Whenever the parameter values controlling a presentation are changed, either by the application directly or via the application interface, the presentation component can derive the screen updates which are necessary. This scheme forms a convenient framework for supporting semantic feedback, since it allows flexible presentations to be controlled by a small number of semantically significant values.
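The following fragment sketches the flavor of such a plan. It is not the actual mechanism of the author's system or of [3,13]; the names (Presentation, slider_plan) are invented for illustration. A plan maps a few semantically significant parameters to an image; when a parameter changes, the presentation component re-evaluates the plan and derives the needed update.

    # A sketch of a parameterized presentation plan.  The plan is a
    # function from semantic parameter values to an image description;
    # the presentation component re-evaluates it on any change.

    class Presentation:
        def __init__(self, plan, **params):
            self.plan = plan
            self.params = params
            self.image = plan(**params)      # currently displayed form

        def set(self, **changes):
            # Called when the application (or the application interface)
            # changes a controlling parameter value.
            self.params.update(changes)
            new_image = self.plan(**self.params)
            if new_image != self.image:      # derive the needed update
                print("update screen:", self.image, "->", new_image)
                self.image = new_image

    # A slider's image depends on just two semantic values.
    def slider_plan(value, maximum):
        filled = round(10 * value / maximum)
        return "[" + "#" * filled + "-" * (10 - filled) + "]"

    volume = Presentation(slider_plan, value=3, maximum=10)
    volume.set(value=7)   # update screen: [###-------] -> [#######---]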
Constraints imposed by other issues

Thus far we have concentrated only on supporting direct manipulation interfaces. However, direct manipulation has limitations and drawbacks, and there are other issues that impact the design of a UIMS. In particular, we are concerned with the complexity faced by the application programmer when using the system and the learning difficulty this can cause. Experience with existing UIMS's has shown that systems which are hard to learn are not used, despite the advantages they might offer [14]. Unfortunately, many of the directions proposed for creating UIMS's which better support direct manipulation also create more complexity. One solution to this problem is to apply user interface technology to UIMS's themselves and allow specification by demonstration, as was done in [11]. Currently we are faced with a choice between systems which are very easy to program but produce simple or stylized interfaces, and systems which are complex to program but can support better interfaces. What is needed is a synthesis of these choices. However, this will only come after exploration of the alternatives at both ends of the spectrum.

Conclusions

In this paper we have considered what direct manipulation interfaces are, how user interface management systems might better support them, and the research directions this indicates. Our conclusions include the fact that syntax should be minimized and decentralized, that the presentation component needs better support for describing new input techniques, and that semantic feedback is a central issue which affects several aspects of the system.

References

[1] Anson, E. The device model of interaction. Proceedings of SIGGRAPH'86 (Dallas, Texas, August 18-22, 1986). In Computer Graphics 20, 4 (August 1986), 107-114.
[2] Bier, E. and Stone, M. Snap-Dragging. Proceedings of SIGGRAPH'86 (Dallas, Texas, August 18-22, 1986). In Computer Graphics 20, 4 (August 1986), 233-240.
[3] Clemons, E.K. and Greenfield, A.J. The SAGE System Architecture: A System for the Rapid Development of Graphics Interfaces for Decision Support. IEEE Computer Graphics & Applications 5, (November 1985), 38-50.
[4] Foley, J.D. and van Dam, A. Fundamentals of Interactive Computer Graphics. Addison-Wesley, 1982.
[5] Green, M. The University of Alberta user interface management system. Proceedings of SIGGRAPH'85 (San Francisco, Calif., July 22-26, 1985). In Computer Graphics 19, 3 (July 1985), 205-213.
[6] Green, M. Report on Dialog Specification Tools. Computer Graphics Forum 3, (1984), 305-313.
[7] Hudson, S.E. and King, R. A Generator of Direct Manipulation Office Systems. ACM Transactions on Office Information Systems 4, 2 (1986), 132-163.
[8] Hudson, S.E. A User Interface Management System Which Supports Direct Manipulation. PhD thesis, University of Colorado, Boulder, Colorado, August 1986.
[9] Hutchins, E.L., Hollan, J.D. and Norman, D.A. Direct manipulation interfaces. In User Centered System Design, Norman, D.A. and Draper, S.W. (eds.), Lawrence Erlbaum Associates, Hillsdale, NJ, 1986, 87-124.
[10] Jacob, R.J.K. An Executable Specification Technique for Describing Human-Computer Interaction. In Advances in Human-Computer Interaction, Hartson, H.R. (ed.), Ablex, Norwood, NJ, 1985, 211-242.
[11] Myers, B.A. and Buxton, W. Creating Highly Interactive and Graphical User Interfaces by Demonstration. Proceedings of SIGGRAPH'86 (Dallas, Texas, August 18-22, 1986). In Computer Graphics 20, 4 (August 1986), 249-258.
[12] Olsen, D.R. Jr. and Dempsey, E. SYNGRAPH: A Graphical User Interface Generator. Proceedings of SIGGRAPH'83 (Detroit, Mich., July 25-29, 1983). In Computer Graphics 17, 3 (July 1983), 43-50.
[13] Olsen, D.R., Dempsey, E.P. and Rogge, R. Input-Output Linkage in a User Interface Management System. Proceedings of SIGGRAPH'85 (San Francisco, Calif., July 22-26, 1985). In Computer Graphics 19, 3 (July 1985), 191-197.
[14] Olsen, D.R. MIKE: The Menu Interaction Kontrol Environment. BYU Tech Report, Brigham Young University, Provo, Utah, 1986.
[15] Shneiderman, B. The Future of Interactive Systems and the Emergence of Direct Manipulation. Behaviour and Information Technology 1, (1982), 237-256.
[16] Shneiderman, B. Direct manipulation: a step beyond programming languages. IEEE Computer 16, 8 (1983), 57-69.
[17] Sibert, J.L., Hurley, W.D. and Bleser, T.W. An Object-Oriented User Interface Management System. Proceedings of SIGGRAPH'86 (Dallas, Texas, August 18-22, 1986). In Computer Graphics 20, 4 (August 1986), 259-268.
[18] Wasserman, A.I. Extending State Transition Diagrams for the Specification of Human-Computer Interaction. IEEE Transactions on Software Engineering SE-11, (August 1985), 699-713.