Distribution of Intelligence in the Etherphone System

Daniel C. Swinehart
Computer Science Laboratory
Xerox Palo Alto Research Center

Abstract: Users can control the Etherphone from any telephone or from any workstation. Knowledge of the users' identity and user preference specification form the basis for a number of capabilities not usually available in telephone systems. Several examples illustrate the kinds of facilities that exist or are planned for the Etherphone system. The system has been designed to support the ready implementation of this kind of activity. Most telephone options and other voice services can be implemented on a workstation in the Etherphone environment. Those that prove particularly useful may migrate to a server (not necessarily the server): (1) to make them available from standalone instruments; (2) to make them available at little effort to other environments; (3) to improve performance (off-load workstation, transfer to higher-performance server -- questionable). For server-based functions, permanent user-specific options can be stored in a database. A number of simple instances from the existing system are listed to indicate how this works. We conclude that these methods form a flexible basis for experimentation with voice in an integrated personal information environment.

Notes: The examples are going to have to have a lot of implicit stuff in them, and be woven together. With care, can get a lot of information into a few examples. Unisex names, plus "he or she": Jan, Lee, Brett, Leslie, Pat, Jean, Frances, Karmen, Laverne? First-person approach, instead? Use small-type to indicate that we would have used a pronoun here? Read the Rooms paper, and any other scenario-based ones. Want to get into section 1 or 2 stuff about how there's no need to be shackled by how telephones work, although it's important to implement the basics in a way that any user will be able to deal with.
Much of what's described here is almost suggested by the richer environment. Some of the abstract/introduction argumentation goes in the summary/conclusions section.

1. Introduction

The Etherphone system is . From the start, although we've had to build a lot of specific things, the emphasis has not been on specific choices of switching, transmission, terminal equipment, hardware or network architectures, but on functionality. More specifically, the aim is to provide a flexible environment for programming novel applications, in a fully-integrated way, across a wide range of voice applications: telephony, recorded voice and music, synthesized voice and music. Methods should extend to cover speech transcription and video recording/switching/transmission.

Others have investigated Ethernet telephony [Sinkoskie, others], packet voice transmission protocols [Sinkoskie, ...?, Ades, should update refs., others], high-level methods for specifying telephone and voice functions [Olympic system, Ruiz, Phoan, others], recorded voice applications [Sydis, Diamond, Maxemchuk, others], and some products with telephone management [Sydis, Meridian, Panasonic, others]. Media lab types have had remarkable results in exploring intelligent interfaces to telephone, answering machine, office system in general [Phone Slave, Intelligent desktop]. Only now are other experimental testbeds emerging to explore the range of user applications [Mice, Island].

We've written a lot about the goals of the system, hardware and software organization from a systems standpoint [GlobeCom '83, IEEE], some specific software methodologies intended to handle voice [Managing Voice], and user interfaces for less-interactive functions -- document editing and annotation, scripted documents [Ades/Swinehart, Zellweger]. Here the intent is to demonstrate how this system is used to improve the nature of the telephone in the office setting.
The emphasis is primarily on telephone control and management functions, only secondarily on recorded voice applications, for which see ibid.

2. User interface

User Model

Telephones as either stand-alone telephones or integral components of computer workstations connected to internetwork. Functions in network include experimental PBX (under our control) for managing voice connections, w/standard access to other internal phones, outside lines, plus a variety of services: list them, but there could be more. Hands-free phone, whose speaker is used in place of a ringer.

Scenarios

We'll spend a day with a peripatetic professional. These concepts have to be reduced in pomposity and woven into the fabric of the scenario. Parts of the day are spent in the office, parts visiting colleagues, attending meetings, and so on. Some segments of the day require intense concentration without interruptions, others are open to new conversations. There are situations that require collaboration by telephone in order to give each collaborator access to his office files and workstation environment. He has frequent dealings with people over the telephone, many of whom have schedules as frantic as his own.

Event: He's in a nearby office when his phone rings. Distinctive duet weaves together his own motif and that of the caller (if another Etherphone user). Knows then whether to ignore it, dart to his office, or ... This time, asks colleague to register him as a visitor to her office. The ring-duet is now performed on the colleague's phone, permitting the call to be answered there. Subsequent calls will also be repeated there until the visiting registration is cancelled.

Pedagogical device: user-interface vignettes in the margin?

Event: He's working very hard and may not want to be disturbed. Issues a command to use a very subdued audible notification (a single soft bleat) to indicate incoming calls. The caller's identity, an urgency indication, and the reason for the call show up on the screen.
If ignored, the notification will not continue to bother him. Can also reject all calls altogether, or all except calls from specified parties or about specified topics. Caller will get a visual explanation plus distinctive tone if using EP-equipped WS; otherwise a synthesized explanation urging them to (a) try harder; (b) call back whenever.

Event: He must spend an hour at a colleague's workstation that has a full-color display, to prepare a presentation. As soon as he has logged in, incoming calls are routed to that workstation's telephone, and the calls he makes are identified as from him. Calls to his colleague continue to arrive there, unless that person is "poaching" elsewhere.

Event: He requests that a conference call be established with several other colleagues (by buttoning their names in a hierarchical, filtered, on-line telephone directory that includes access to his friends, his organization, his company, and ultimately THE WORLD!), and that he not be bothered until all of them who are available are on the line. (He might indicate that he's willing to wait quite a while.) Each colleague's phone rings and displays an indication of what's going on, and each is given a reasonable amount of time to accept or reject the attempt. (If a long time was specified, each party can indicate the length of time he'll be available.) When all parties have accepted, rejected, or timed out, a distinctive tone sounds at each station and the original caller is given the "floor". After his opening greeting, he may enable the conversation for full conversational conferencing, or control the "floor" [Stodowsky ref and comment -- get one].

Event: He is too busy to attend a lecture, so he calls the lecture room (by name or button, of course) to create or join an ongoing conference call that brings the lecture to any office. If the lecture were on educational TV, its soundtrack, or at least that of any broadcast radio source, can be distributed in the same way.
Event: He and a colleague are planning the departmental budget. Each has a set of personal documents, electronic and paper, relating to it, so they'd prefer conferring by telephone to a face-to-face meeting. They may have a set of common documents on their workstation screens that they are jointly developing, as well. [Stefik ref. and comment; Colab does this for tete-a-tetes; that code plus this would make nice distributed experiment.] Can use the voice annotation facilities to record snippets of the conversation into the document [Ades/Swinehart], interactively produce and review a script [Zellweger], ad infinitum; isn't this exciting?

Event: George Geargrinder and Karmen Foozle, message-taking. If Lee is calling George from inside, Lee can record the message, indicate the subject, etc., directly. Can use all of the annotation and editing capabilities outlined in [Ades/Swinehart] to compose a good one, comment on earlier work, etc.

Most of these either exist today in the Etherphone system, or are readily achievable. Other capabilities, intended but requiring more work, include attendant facilities such as the Geargrinder episode, side-conversations, . There are also more features, planned or existing, than there was space to mention: . The idea was to indicate the range of activities we wanted to achieve.

3. Implementation

Hardware Environment (feels too big already)

PABX implemented as Etherphones, packet Ethernet transmission and switching, central telephone user interface (as a convenience for wrong-language Etherphone processors) and nucleus switching control, additional control distributed to other servers and to workstations. All communications using same internetwork config. as Etherphones, RPC communications as application of Pup internetwork protocol -- no direct connection of phones to WS.
Other services, either sharing telephone control processor or located on separate servers, play major roles in storage of database information, providing authentication, supporting WS functions, and so on.

Use of distributed control/intelligence to implement, aid in development of user interface examples

Will not describe all of the facilities required to do transmission [summary in GlobeCom, future publication], editing and annotation [Ades/Swinehart, Terry/Swinehart], or even the detailed control methods [future publication]. Will:
(a) sketch the object-oriented "conversation management" design that supports the incorp. of workstations, recording service, synthesizer, .... and provides program control;
(b) indicate the basic capabilities provided for extensibility of services (object style);
(c) ditto for control of important conversation-establishment/termination decisions;
(d) use of database to assist with migration of facilities to server, and why we do that anyhow;
(e) perhaps something about reliance on other distributed intelligences to make the system tractable to build by two or three folks.

Examples

All examples follow common theme: facilities that can be developed on WS, migrated if desired to server to increase availability -- to stand-alone phones and to WS's in other environments.

How visiting and poaching work?

Poaching is a direct result of the way workstations and phones are represented in the system. A workstation object is associated as manager with the current user of the workstation and with the telephone object representing the adjacent telephone. When evaluating incoming calls, the manager object and its associated phone are preferred to the user's home phone. For outgoing calls, a similar mechanism identifies the phone with the manager's owner rather than with the phone's own owner. Visiting is a looser association. When call arrives for visitor, both visitor's phone and visited party's phone are included in potential conference call.
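As a rough illustration of the poaching and visiting rules just described, the following Python sketch picks the phones to include when a call arrives. Every name in it (Directory, phones_to_ring, the three tables) is hypothetical and not taken from the Etherphone implementation. It assumes only what the notes state: the phone beside a workstation the user is logged in on is preferred to the user's home phone, and a visited party's phone is included alongside the visitor's.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Hypothetical stand-in for the server's view of users and phones."""
    home_phone: dict = field(default_factory=dict)   # user -> home phone
    poaching_at: dict = field(default_factory=dict)  # user -> phone beside the WS in use
    visiting_at: dict = field(default_factory=dict)  # user -> visited party's phone

    def phones_to_ring(self, callee: str) -> list:
        """Return the phones to include when a call arrives for callee."""
        # Poaching: the workstation manager's adjacent phone is preferred
        # to the user's home phone.
        primary = self.poaching_at.get(callee) or self.home_phone[callee]
        phones = [primary]
        # Visiting: a looser association, so the visited party's phone is
        # added to the call rather than replacing the visitor's phone.
        visited = self.visiting_at.get(callee)
        if visited is not None and visited not in phones:
            phones.append(visited)
        return phones
```

The sketch stops at choosing candidate phones; how unanswered parties leave the call is not modeled.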
Of the visitor's and the visited party's phones, the one that isn't answered drops out, either right away or after a while. Could be implemented by WS. But in order to apply to visitors without registered WS's, impl. is done on server. Theme.

How call-filtering works

When WS object is registered, gets informed of incoming calls along with object implementing phone. Can accept, reject call (with/without prejudice). But normal behavior of object class implementing phones is to ring right away if not busy. Needs to be inhibited if not to get bleep. Basic mechanism is ability for WS to record in server-supported database that it is in charge of whether to ring. Has obligation to make accept/reject/ring decision in short order. Phone object class is willing to wait a few seconds after receipt of incoming call notification before concluding that WS has had its chance and reverting to default behavior, clearing DB entry until WS recovers and again enters it. Allows both more restrictive and less restrictive behavior. Same approach can inhibit default choice to reject incoming calls if line is already busy. Allows very flexible call-waiting behavior.

How duets work -- already built in, but how they'd work if the WS wanted to do it.

Duets are also built into the server now, but were originally prototyped from the WS. WS would compute a new tune for the callee, register it temporarily in the system database, then enable ringing. Would not operate when called WS wasn't operating. Now a part of the server's code.

How voice ropes wandered into the Server.

How much we've come to rely on a simple shared reliable database to control the activities of the system. Began with very simple facility for recording and playing back utterances. Implemented reference schemes for editing on the workstation, using separately managed auxiliary file storage. Once that worked well, cut at different interface, placing most editing functions on server. Now WS programs can be simpler, so it's easier to write them in new environments.
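The "reference schemes for editing" mentioned above can be pictured with a small Python sketch. It is one plausible reading, not the Etherphone design: it assumes an edited voice value is an immutable list of references into recorded utterances, so that editing rearranges references and never copies audio. The names Interval and VoiceRope, the sample-offset units, and the three operations are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Interval:
    utterance_id: str  # hypothetical key of a stored recording
    start: int         # offset into that recording, in samples
    length: int        # number of samples referenced


@dataclass(frozen=True)
class VoiceRope:
    """Immutable edited voice value: a sequence of references, no audio."""
    intervals: Tuple[Interval, ...] = ()

    def length(self) -> int:
        return sum(iv.length for iv in self.intervals)

    def substring(self, start: int, length: int) -> "VoiceRope":
        """Reference samples [start, start + length) of this rope."""
        out, pos, end = [], 0, start + length
        for iv in self.intervals:
            lo = max(start, pos)
            hi = min(end, pos + iv.length)
            if lo < hi:  # this interval overlaps the requested range
                out.append(Interval(iv.utterance_id, iv.start + (lo - pos), hi - lo))
            pos += iv.length
        return VoiceRope(tuple(out))

    def concat(self, other: "VoiceRope") -> "VoiceRope":
        return VoiceRope(self.intervals + other.intervals)

    def replace(self, start: int, length: int, new: "VoiceRope") -> "VoiceRope":
        """Return a new rope with [start, start + length) replaced by new."""
        head = self.substring(0, start)
        tail = self.substring(start + length, self.length() - start - length)
        return head.concat(new).concat(tail)
```

Because every operation returns a new value and leaves its inputs untouched, a structure like this could live on either the workstation or a server, which is the kind of migration the notes describe.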
Placing editing on the server also gives better control of atomicity of editing operations, and allows resulting edited voice values to be played by telephones without WS being around.

How the arranged conference works.

Haven't done this one yet. Would be arranged by calling WS, originally (delayed version) only among participants that also had registered WS's. Call placement carries number of explanatory parameters (urgency, style, subject), and involves ability to obtain lots of status (either before or during contact attempt). Underlying system knows what to do about dealing with multiple people being invited to join conversation, either all at once or incrementally as the call progresses. For recipients w/workstations, there'd be sufficient info in call to inhibit ringing as above, and participate in extended negotiations. Participants could consult electronic calendars, either collaboratively or each on own, to assist in negotiations. Should be easy to see that the results described in scenario could be achieved with this sort of underlying facility. Probably caller's part would never migrate fully to server-based thing, but calendar and recipient functions could/should, at least for basic defaults, controlled by server database. Hard to say, since none of this exists yet. Simple version does. The broadcast meeting facility has been implemented and tested. Was a straightforward exercise and we use it all the time. [don't I wish.]

4. Conclusions

Paradigm of rich distributed environment full of services and clients, with logical connections expressed in high-level languages in full pt/pt network independent of physical architecture, extends well to voice, functionally. Basic design that allows participation of many object classes, both centrally-located and distributed, has provided a flexible environment. Many of the user facilities are unmatched in existing voice products.
A disappointment has been the complexities that still present themselves to the WS programmer who wants to have full access to the control facilities -- it's not a no-brainer. Another is the degree to which new areas need a lot of system-level work (like side-conversations). Extending these facilities to capture the full set of basic underlying voice facilities in a sensibly-presented voice architecture is a challenge for the future.

Acknowledgments

Polle 'n Larry 'n Doug 'n Susan, for the work. Whoever for reading it. Mostly later.

References

[1] S. Ades and D. C. Swinehart. Voice annotation and editing in a workstation environment. Proceedings AVIOS Voice Applications '86, September 1986, pages 13-28.
[2] R. Bayer and E. McCreight. Organization and maintenance of large ordered indexes. Acta Informatica 1(3):173-189, 1972.
[3] A. Birrell, R. Levin, R. M. Needham, and M. D. Schroeder. Grapevine: An exercise in distributed computing. Communications of the ACM 25(4):260-274, April 1982.
[4] A. D. Birrell and B. J. Nelson. Implementing remote procedure calls. ACM Transactions on Computer Systems 2(1):39-59, February 1984.
[5] A. D. Birrell. Secure communication using remote procedure calls. ACM Transactions on Computer Systems 3(1):1-14, February 1985.
[6] M. R. Brown, K. Kolling, and E. A. Taft. The Alpine File System. ACM Transactions on Computer Systems 3(4):261-293, November 1985.
[7] R. G. G. Cattell. Design and implementation of a relationship-entity-datum data model. Xerox Palo Alto Research Center, Technical Report CSL-83-4, May 1983.
[8] D. D. Clark. The structuring of systems using upcalls. Proceedings Tenth Symposium on Operating Systems Principles, Orcas Island, Washington, December 1985, pages 171-180.
[9] J. Donahue and W. Orr. Walnut: Storing electronic mail in a database. Xerox Palo Alto Research Center, Technical Report CSL-85-9, November 1985.
[10] J. D. Gould and S. J. Boies. Speech filing: An office system for principals.
IBM Systems Journal 23(1):65-81, January 1984.
[11] J. N. Gray. Notes on database operating systems. In Bayer et al., Operating Systems: An Advanced Course, Springer-Verlag, 1978, pages 393-481.
[12] B. W. Lampson and K. A. Pier. A processor for a high-performance personal computer. Proceedings 7th Symposium on Computer Architecture, La Baule, May 1980, pages 146-160.
[13] B. W. Lampson. Hints for computer system design. Proceedings Ninth Symposium on Operating Systems Principles, Bretton Woods, New Hampshire, October 1983, pages 33-48.
[14] B. Liskov and R. Ladin. Highly-available distributed services and fault-tolerant distributed garbage collection. Proceedings of Symposium on Principles of Distributed Computing, Calgary, Alberta, Canada, August 1986, pages 29-39.
[15] N. Maxemchuk. An experimental speech storage and editing facility. Bell System Technical Journal 59(8):1383-1395, October 1980.
[16] J. G. Mitchell and J. Dion. A comparison of two network-based file servers. Communications of the ACM 25(4):233-245, April 1982.
[17] National Bureau of Standards. Data Encryption Standard. Federal Information Processing Standard (FIPS) Publication 46, U. S. Department of Commerce, January 1977.
[18] R. Nicholson. Integrating voice in the office world. BYTE 8(12):177-184, December 1983.
[19] J. K. Reynolds, J. B. Postel, A. R. Katz, G. G. Finn, and A. L. DeSchon. The DARPA experimental multimedia mail system. Computer 18(10):82-89, October 1985.
[20] A. Ruiz. Voice and telephony applications for the office workstation. Proceedings 1st International Conference on Computer Workstations, San Jose, CA, November 1985, pages 158-163.
[21] L. Svobodova. A reliable object-oriented data repository for a distributed computer system. Proceedings Eighth Symposium on Operating Systems Principles, Pacific Grove, California, December 1981, pages 47-58.
[22] L. Svobodova. File servers for network-based distributed systems. ACM Computing Surveys 16(4):353-398, December 1984.
[23] D. C. Swinehart, L. C. Stewart, and S. M. Ornstein. Adding voice to an office computer network. Proceedings IEEE GlobeCom '83, November 1983. Also available as Xerox Palo Alto Research Center, Technical Report CSL-83-8, February 1984.
[24] D. C. Swinehart, P. T. Zellweger, R. J. Beach, and R. B. Hagmann. A structural view of the Cedar programming environment. ACM Transactions on Programming Languages and Systems 8(4):419-490, October 1986.
[25] D. C. Swinehart, D. B. Terry, and P. T. Zellweger. An experimental environment for voice system development. IEEE Office Knowledge Engineering Newsletter, February 1987.
[26] R. H. Thomas, H. C. Forsdick, T. R. Crowley, R. W. Schaaf, R. S. Tomlinson, V. M. Travers, and G. G. Robertson. Diamond: A multimedia message system built on a distributed architecture. Computer 18(12):65-78, December 1985.
[27] N. Yankelovich, N. Meyrowitz, and A. van Dam. Reading and writing the electronic book. Computer 18(10):15-30, October 1985.

A paper on Distributed control (user and programmer) for GlobeCom '87, due March 1987
Copyright 1987 by Xerox Corporation. All rights reserved.
Draft last edited by Dan Swinehart, March 9, 1987 12:09:23 pm PST