Distributed Systems Support for Voice in Cedar

October 1986

Douglas B. Terry
Daniel C. Swinehart
Polle T. Zellweger

Computer Science Laboratory
Xerox Palo Alto Research Center

Plot

· User interface and system overview (video)

· Project goals

original, functional goals

architectural goals

· Towards an architecture for voice integration

· Voice ropes: management of recorded voice

· Conclusions

(show videotape here.)

Original Voice Project Goals

· Taming the telephone

better placement and receipt of calls

better handling of special features

better human-assisted call handling

· Voice as data

recorded voice messages

synthesized text-to-speech messages

speech recognition (not yet)

voice-annotated documents

integration with other interactive applications

· Programmer control over all of the above

Overall Goal: A Voice Systems Architecture

· User-programmable

language-level interfaces

simple things are simple

complicated things make sense

acceptable impact on reliability

· Supports interoperability

multiple workstations and environments

multiple networks and protocols

multiple telephone transmission and switching choices

· Extensible

admits new applications

admits new services

admits new workstations, networks, . . .

Voice Architecture Layering

(figure: voice architecture layers, IntervoiceLayers.IP, goes here.)

Conversation Management

· one or more parties per conversation

sources and sinks of voice

non-voice participants, e.g. workstations

· conversation establishment

state machine (idle, notified, ringing, ringback, active, ...)

parties act autonomously

notifications of state changes

· authenticated or presumed identities

· active parties use Voice Transmission Protocol

· broadcast reports during active conversation

encryption key distribution

recording started, playback finished, etc.
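
The establishment model above (per-party state machine, autonomous parties, notifications of state changes) can be sketched as follows. This is a minimal illustration only: the state names come from the slide, but the transition table, the Party class, and the callback signature are assumptions made for this example, not the Etherphone implementation.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    NOTIFIED = "notified"
    RINGING = "ringing"
    RINGBACK = "ringback"
    ACTIVE = "active"


# Assumed transition table (not taken from the slides).
ALLOWED = {
    State.IDLE: {State.NOTIFIED, State.RINGBACK},
    State.NOTIFIED: {State.RINGING, State.IDLE},
    State.RINGING: {State.ACTIVE, State.IDLE},
    State.RINGBACK: {State.ACTIVE, State.IDLE},
    State.ACTIVE: {State.IDLE},
}


class Party:
    """One conversation participant; it changes state on its own initiative."""

    def __init__(self, name):
        self.name = name
        self.state = State.IDLE
        self.listeners = []  # callbacks told about every state change

    def subscribe(self, callback):
        self.listeners.append(callback)

    def advance(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        old, self.state = self.state, new_state
        for callback in self.listeners:
            callback(self.name, old, new_state)
```

A workstation program would subscribe to a party and react to each reported change, for example by logging it or by deciding whether to answer.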

Conversation Management -- Challenges

· Novel conversation models

background calls, lectures, "watercooler" simulators

asymmetric participation in (multiple) conversations

under workstation program control

· Workstation/telephone partnership

workstation decisions override telephone decisions

workstation tracks telephone-initiated activities

default behavior when workstation fails or drops out

· Reliability in the face of:

simultaneous actions by all parties

possibly-unreliable client code in workstation

distributed control, failures

real-time nature of the activity

Managing Recorded Voice

Goals: stored voice should be

· shareable

· editable

· available to diverse workstation clients

Realities: voice, unlike text,

· takes lots of space (64 Kbits/sec)

· cannot be printed or displayed

· requires special I/O devices

Recorded Voice Objects

· recording/playing voice sets up conversation between Etherphone and Voice File Server

· editing operations similar to those for character strings

Concatenation, Substring, Replace, etc.

exception: need to determine talkspurts and silence intervals

· editing operations produce new (immutable) voice objects

· operations performed on server

· voice shared and manipulated by reference

· voice data never copied or decrypted
(except when played)

Voice Storage

· two-level storage hierarchy: tunes and voice objects

many-to-many relationship between tunes and voice objects

· tunes

encrypted voice samples (block-mode)

stored on Voice File Server

· voice objects

sequence of tune intervals <tune#, start, length>

structure stored in database

· editing builds up complex voice object structures

premise: number of edits is small
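
A minimal sketch of the storage model above: a voice object is an immutable sequence of <tune#, start, length> intervals, and each edit builds a new sequence without touching the stored samples. The function names, and the choice of samples as the unit for start and length, are assumptions for illustration.

```python
from typing import NamedTuple, Tuple


class Interval(NamedTuple):
    tune: int    # tune number on the Voice File Server
    start: int   # offset into the tune (assumed unit: samples)
    length: int  # extent of the interval, in the same unit


VoiceObject = Tuple[Interval, ...]  # immutable: edits return new tuples


def length(voice: VoiceObject) -> int:
    return sum(iv.length for iv in voice)


def concat(a: VoiceObject, b: VoiceObject) -> VoiceObject:
    return a + b


def substring(voice: VoiceObject, start: int, count: int) -> VoiceObject:
    """Return the intervals covering [start, start + count) of the voice object."""
    out = []
    pos = 0  # position of the current interval within the voice object
    end = start + count
    for iv in voice:
        lo = max(start, pos)
        hi = min(end, pos + iv.length)
        if lo < hi:  # this interval overlaps the requested range
            out.append(Interval(iv.tune, iv.start + (lo - pos), hi - lo))
        pos += iv.length
    return tuple(out)


def replace(voice: VoiceObject, start: int, count: int, new: VoiceObject) -> VoiceObject:
    """Replace [start, start + count) with another voice object."""
    tail_start = start + count
    tail = substring(voice, tail_start, length(voice) - tail_start)
    return substring(voice, 0, start) + new + tail
```

Because only interval lists are rewritten, two voice objects can refer to the same tune, which is one way the many-to-many relationship arises. Each edit can add intervals, which is why the premise of a small number of edits matters.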

Voice Interests

· garbage collection:

A tune is garbage if it is not a component of any voice object.

A voice object is garbage if no client has an interest in that voice object.

· interests

similar to reference counts

grouped according to classes

stored in a database

· client operations

Retain — registers interest in a voice object

Forget — drops interest in a voice object

· class-specific garbage collection for interests
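
The two garbage rules and the Retain/Forget operations above can be sketched as follows. The VoiceStore class is a hypothetical in-memory stand-in for the database; the field layout and the client identifiers are assumptions for illustration.

```python
class VoiceStore:
    """Toy stand-in for the database of voice objects and interests."""

    def __init__(self):
        self.voice_objects = {}  # voice id -> list of (tune, start, length)
        self.tunes = set()       # tune numbers currently stored
        self.interests = set()   # (interest class, client, voice id)

    def retain(self, interest_class, client, voice_id):
        """Register an interest in a voice object."""
        self.interests.add((interest_class, client, voice_id))

    def forget(self, interest_class, client, voice_id):
        """Drop an interest in a voice object."""
        self.interests.discard((interest_class, client, voice_id))

    def collect_garbage(self):
        # Rule 2: a voice object with no remaining interest is garbage.
        wanted = {voice_id for (_, _, voice_id) in self.interests}
        self.voice_objects = {
            v: ivs for v, ivs in self.voice_objects.items() if v in wanted
        }
        # Rule 1: a tune used by no surviving voice object is garbage.
        used = {tune for ivs in self.voice_objects.values() for (tune, _, _) in ivs}
        self.tunes &= used
```

Collecting voice objects first and tunes second means a tune is released in the same pass as the last voice object that referred to it.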

Sample Interest Classes

· TiogaVoice

interest registered when file copied to file server

interest information includes file name

voice object collected if file no longer exists

· Timeout

interest information includes timeout period

voice object collected if time since creation is greater than timeout

· WalnutMsg

only explicit garbage collection

interest registered when message received

interest dropped when message deleted
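
Class-specific collection of interests, as described for the three sample classes, might look like the sketch below. The record layout (the "file", "created", and "timeout" fields) and the function names are assumptions; only the three rules themselves come from the slide.

```python
def tioga_voice_stale(info, now, existing_files):
    # Stale once the annotated file no longer exists.
    return info["file"] not in existing_files


def timeout_stale(info, now, existing_files):
    # Stale once the time since creation exceeds the timeout period.
    return now - info["created"] > info["timeout"]


def walnut_msg_stale(info, now, existing_files):
    # Never collected automatically; dropped only by an explicit Forget.
    return False


STALE_RULES = {
    "TiogaVoice": tioga_voice_stale,
    "Timeout": timeout_stale,
    "WalnutMsg": walnut_msg_stale,
}


def sweep(interests, now, existing_files):
    """Return the interests that survive one class-specific collection pass."""
    return [
        entry for entry in interests
        if not STALE_RULES[entry["class"]](entry["info"], now, existing_files)
    ]
```

Voice objects left with no surviving interest after such a pass would then be candidates for collection.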

(scenario on sending voice mail goes here.)

Distributed Systems Issues in the Voice Project

· Communication

Real-time voice protocol

Ethernet (or internet) transmission

64 Kbits/sec = 50 packets/sec

multicast for conference calls

Control via RPC

scheduled requests

multicast for reports

· Voice management

voice manipulated by reference

voice files vs. voice objects

· Security

voice DES encrypted during transmission

remains encrypted on Voice File Server

key distribution by secure RPC

· Fault-tolerance

Voice Control Server not replicated

rely on existing phone system for backup

· Naming and binding

by name, feep name, location, etc.
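
As a worked check on the Communication figures (64 Kbits/sec carried in 50 packets/sec), the payload per packet follows by simple division. The sketch assumes 1 Kbit = 1000 bits and ignores packet headers; the function name is hypothetical.

```python
def packet_budget(bits_per_sec=64_000, packets_per_sec=50):
    """Return (voice bytes per packet, milliseconds of voice per packet)."""
    bytes_per_sec = bits_per_sec // 8                     # 8000 bytes of voice per second
    bytes_per_packet = bytes_per_sec // packets_per_sec   # 160 bytes in each packet
    ms_per_packet = 1000 / packets_per_sec                # 20.0 ms of voice per packet
    return bytes_per_packet, ms_per_packet
```

Under these assumptions each packet carries 160 bytes, or 20 ms, of voice.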