Subject: Voice videotape (in progress)
To: VoiceProject^
Cc: PolleZ
Due date in Boston: 24 Feb. Committee meets 26 Feb. A rough cut is ok at that time, but we need at least that.
How much emphasis on:
user interface
functionality
how it works (reasonably high-level)
Concepts
integration
programmability
telephone system
voice messages (with text and pictures)
Items to cover
items in [] will require hacking
scientific system description
include diagrams, possibly with (dynamic?) highlighting, of the system architecture. Show distributed system etc.
Mention voice encrypted?
show hardware
Lark box, phone, speaker, workstation
server?
size of community?
inside of Lark box?
dial by name from keypad
how to show? just narrate / show Finch tool
ring tunes / harmonized ring tunes / callee ring tune
Finch icons: ringing with caller / conversation icon with caller/callee
[ais pictures of conversants for more visual interest? -- in icons or log or ???]
command tool: phone foo
use Finch to phone number/name selected from Walnut message or elsewhere
would call/answer/disconnect, as direct icon actions or as icon popup menus, help here?
directories (Doug's new ones with querying and merging?)
office number, home number
use to place call
redial last caller/callee
use Finch to answer/disconnect calls
note: could use speakerphone, as long as don't use audio from speakerphone callee
show some speakerphone use and some handset use => less viewer worry about noisy officeplace
conversation log while out of office
=> redial last one
[would like to be able to redial by selecting conv log entry]
feeping??
text-to-speech
from Tioga document (Finch doc? Adding voice b&w?)
from program (eg Remember)
from breakpoint (too technical?)
[would like control panel for synthesizer params]
[would like pause]
[would like to be able to play to others in conversation]
[read electronic mail]
voice messages in Walnut
play selected/all/out of order
[visual feedback would be nice]
have a picture in the message also
Etherphone logo? Xerox spiral? PARC?
Talk about voice file server -- voice not sent with message?
poaching
how to show? How interesting is it?
visiting
reasonably easy to show: issue visit command; different ring tune. Implies that most calls should be to the same person so that the viewers will note the difference? Maybe don't use harmonized tunes, use recognizable tunes instead?
(interaction with icons -- not enough space to say callee + caller)
future plans
conference calling
answering machine
large amounts of user programmability
user interface open issues
screen real estate, ....
[integration of conversation log with Walnut]
placing a call to someone pops open a Walnut sender, with appropriate fields filled in
may want to mail to self or callee
self as record of call
callee especially if no answer, possibly also as record
whitepages could include mail address for non-Xeroids also
receiving a call similarly
could have a ConstructCall button/cmd in directory, which will create a sender to call from
for filtering, fill in subject field and then initiate call
some access to standard/callee's set of filters?
with immediate message delivery, could be used as a focus for discussion in the conversation
may want to merge messages (caller & callee(s)) if both appear in a single Walnut db
Would a linear conversation log as now plus "automatic" Walnut messages be better?
Entries that have Walnut messages could have a pointer from the log to the message.
User profile entry could specify whether a Walnut sender would automatically open for each call, or whether you had to click somewhere to get one.
What if user workstation is off?
Great if receptionist could fill in and mail messages.
Outside voice message server would want to be integrated with same system.
[conferencing]
visual representation
"call from PolleZ (Swinehart Terry)"
pictures
program to place a call to someone?
[voice in Tioga documents]
visual cues
[voice editing]
[answering machine functions]
Script
this part is still quite sketchy....
opening: Etherphone logo with PARC picture
a) zoom in to workstation/transform to real workstation
b) text on screen is credits
Title: Voice Applications in Cedar (possibly on same view as logo instead)
select text & click SpeakText
=> spoken credits
could do different voices, use color monitor for credits
easy method: define cmdtool aliases for invoking different voices
[need to use translate procedures so that the spoken items sound tolerable (names, Cedar=Sehdar, etc)]
ordering 1: technical overview and description (see Jan Walker's expository purpose)
system overview with goals (using diagram)
hardware & software
functionality (using connected script of some kind)
summary & future directions/open problems
ordering 2: demonstration/advertisement
brief system overview with goals (spoken only)
functionality (using connected script of some kind)
hardware & software (?)
connected script to show off functionality
interaction 1: `phone' button, icons, ring tunes, conversation logs
get electronic message re voice system from some Etherphone owner
select their name and click `phone' (using speakerphone, can hear tune)
cmdtool button or Finch?
is Finch icon open or closed to start with?
split screen to show mouse?
show conversation log entry
show callee's phone ringing (tune and icon)
callee picks up phone (or speakerphone?), says hi whoever
split screen view of conversants? 4-way split for people & screens?
show icon change to conversation
open to show log entry?
talk about something about voice system
terminate call
show icon change back to telephone?
........
directories
voice messages
speaktext (inherent in reading credits?)
visiting
script, new version:
opening: Etherphone logo with PARC picture
a) zoom in to workstation/transform to real workstation
b) text on screen is credits
Title: Voice Applications in Cedar (possibly on same view as logo instead)
select text & click SpeakText
=> spoken credits
Here in the Computer Science Laboratory at the Xerox Palo Alto Research Center, we are developing an experimental voice and telephone management system. Our system is in daily use by each of the approximately fifty members of our lab.
Each office is equipped with a modified telephone instrument, a speaker, and a microphone, which are connected to a piece of custom hardware called an Etherphone.
The Etherphone serves all of the telephone needs of our users. Calls to other Etherphones are managed by a centralized switching server {workstation/computer}. The Etherphone hardware digitizes the voice, encrypts it (for security reasons), and transmits it over an Ethernet to the other Etherphone in a conversation. Each Etherphone is also connected to a commercial, direct-dial, office telephone line. This line is used for calls to telephones outside the Etherphone system. Furthermore, if the {experimental} Etherphone hardware or software fails, a deadman relay connects the telephone directly to the outside telephone line, so users are assured of at least ordinary telephone service.
Show Etherphone hardware.
<show diagram of Etherphone system here?>
Mention workstation (non-)connection? also Cedar, integration.
<All of this could be voice-over panning over screen, hardware, diagram>
0. in Polle's office: outside line => only phone in office; my ring tune; incoming icons
incoming call from outside line: Jock or Ken
screen: show icon ringing `from outside line'
<open icon and answer? don't really like as well. use speakerphone?>
Polle: Hello, Polle Zellweger here. (cut to person view?)
Jock: Hi this is Jock. I'm at home and I just wanted to wish you good luck on your videotape.
screen: show icon as conversation with outside line
Polle: Thanks. I should be home by 7. See you then.
Jock: Bye.
hang up
show icon becoming phone again
1. in Polle's office
<see Finch icon as phone>
electronic message from Doug
"ready to demo new directory stuff"
select sender name and click `phone' cmdtool button
using speakerphone, can hear tune
explain: consults public phone directory of Xerox employees in Palo Alto, located on server (need to know about server)
show icon as conversation with Terry
open and show conversation log entry (interaction w/ talking to Doug could be tricky)
grayed log entry may be totally illegible -- do reverse video instead
can also see entry from previous call, not active; maybe other calls as well
Doug answers (hi Polle - to motivate this, need to show his office? or just say how it works? previous call doesn't absolutely help because ringing was from outside line)
set up time for demo, say 2pm
hang up
show icon returning to phone
2. in Polle's office
"Let's have the system remind us to go to the demo"
screen: Remember to see Doug's demo at 2 pm. /audible
"The /audible switch will cause the remind program to speak the reminder, in addition to flashing a reminder window at us. The spoken reminder will use the same commercial text-to-speech synthesizer that you heard speaking the credits earlier. We have two synthesizers attached to our voice network that are available as servers: that is, they are timeshared by the user community."
<could demonstrate programmability by showing use of BreakTool to insert call to FinchSmarts.TextToSpeech[message]; probably too long, also too complex for this audience.>
3. in Polle's office
get electronic message from Stephen (or Dan?) re voice prints
text explaining the voice prints
voice ID 1: some sentence(s)
voice print of above
voice ID 2: some more sentences (music?)
either voice print of above or transformed print
would like color in this - his analyzed prints have grey levels now
have small telephone with smiley face near signature (color?)
[this requires some simulation: maybe use 2 mics or tape recorder]
4. in Polle's office (respond to Stephen/Dan's message)
send electronic response with voice message also
"Aren't integrated text, voice, and graphics wonderful? I've been waiting years for this!"
(want to see receipt of this message? use speakerphone or handset?)
4.2 in Polle's office
demonstrate use of directories:
call Time
or "flower shop": order flowers for Valentine's Day
could be busy the first time, then use redial
4.5 in Polle's office
time passes: zoom in on clock, fade to later time, zoom out from clock
or zoom in on clock, show clock advancing quickly, zoom out from clock
[can set clock: red=hr yel=min blu=sec shift=fwd non-shift=back ctrl=restore
can write small program to call Clock.ChangeOffset to advance to near 2pm;
doesn't help make the reminder go off. set remember icons!]
<maybe this will work best if advance really to 2pm? then can have issued a >
5. in Polle's office, near 2pm
(what am I doing? reading mail, working on a program, reading FinchDoc, editing personal directory, sitting at my desk reading?)
audio reminder happens
show screen reminder
5.5 in Polle's office, after 2pm
call Dan to let him know about the demo
a. use command tool: phone Swinehart
b. use directory: then can delete 4.2. But we're just about to see new directories -- confusing?
no answer
"oh well, I guess he'll have to see it another time"
6. in Doug's office
[simulate visiting: I am logged in @ Doug's machine, but it looks like he is...]
Polle enters.
"Hi Doug, I'm here to see the new version of the telephone directory that you've implemented. But first, since I'll be here in your office a while, why don't I make sure that my calls reach me here." {forward my calls to your office/tell the Etherphone system that I'm visiting your office}
"Some of the features we've implemented are aimed at improved and automated phone forwarding functions."
screen: Visit PolleZ password
response: PolleZ is visiting Terry at 2:03pm. PolleZ's calls will now ring at Terry's office and PolleZ's office.
Doug: Okay, Polle, I've built a database system for managing public and private telephone directories <simultaneously>. The public directory contains names and office phone numbers of all Xerox employees in Palo Alto. Private directories contain information about people I'm likely to call, including their name, office phone, home phone, and other remarks. I can add attributes to this database if I like, to include home address, electronic mail address, and so on.
Doug gets a call from someone. (Doug's tune)
"I'm busy right now, can I call you back later?"
Polle gets a call from Dan. (Polle's tune)
Polle: Oh, that's my tune; I'll get it. {Excuse me, they're playing my song; it must be for me.}
Polle: Hi Dan.
Dan: Hi Polle, I saw an entry in my conversation log saying that you called.
Polle: Yeah, I'm in Doug's office looking at a demo of his new directory package. Why don't you come join us?
Dan: I'd like to, but I have to go to a meeting. I'll come by and see it later.
now look at a few directory queries
Dan or Doug
Suppose I'm looking for someone whose first name is Dan, but I don't know his last name. I perform the query like this:
<*, Dan / match as pattern>
Soundex of "Peters"
Each database entry also has a Soundex key. Soundex encoding is essentially a simplified phonetic spelling that was developed at the turn of the century. Names that sound similar usually have the same encoding. Soundex has been used in many different applications, such as airline reservation systems.
In my directories, you don't even have to specify the full name. Let's say I'm looking for a person whose name begins with Peters, but I don't know how it's spelled.
<perform query>
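A minimal sketch of the Soundex lookup described above. It follows the standard American Soundex rules, which is an assumption: these notes do not say which variant Doug's directory uses. The `soundex` and `lookup` functions and the sample directory entries are hypothetical and for illustration only.

```python
# Standard American Soundex (assumed variant; not taken from Doug's code).
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character Soundex key for a name ('' if it has no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]                 # first letter is kept as-is
    prev = CODES.get(letters[0], "")    # but its code still suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal codes
        code = CODES.get(ch, "")        # vowels and Y have no code
        if code and code != prev:
            result += code
        prev = code                     # a vowel resets prev, allowing a repeat
    return (result + "000")[:4]         # pad or truncate to 4 characters


def lookup(partial, directory):
    """Return the directory names whose Soundex key matches the query's key."""
    key = soundex(partial)
    return [name for name in directory if soundex(name) == key]


# Hypothetical directory entries.
directory = ["Peterson", "Pieters", "Petersen", "Stone", "Terry"]
print(soundex("Peters"))            # P362
print(lookup("Peters", directory))  # ['Peterson', 'Pieters', 'Petersen']
```

Because the key is truncated to four characters, a query for "Peters" also finds longer names such as "Peterson", which matches the claim that the full name need not be given.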
Of course, once you have a set of query responses, you can place calls using them in the same way that we saw earlier.
<place a call?>
"Well Doug, it really looks like your system provides flexible capabilities for storing and searching through information about people. Thanks for the demo."
7. in Polle's office
screen: visit PolleZ password
"Now I'll tell the system that I've returned to my office. If I had forgotten, my phone would ring in both places, so I wouldn't miss calls. Since my calls would ring with my tune, Doug would know not to answer them."
<But: a) what if I stopped somewhere on my way back to my office [Unvisit while still at Doug's], and b) how do I remember that it's also ringing in Doug's office [some indicator]??>
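A small sketch of the visiting behavior described in scenes 6 and 7: while a visit is in effect, the visitor's calls ring at both offices, always with the visitor's own tune. The `VisitRegistry` class, its method names, and the tune names are hypothetical; nothing here is taken from the actual Etherphone server.

```python
class VisitRegistry:
    """Tracks who is visiting whom and decides where a call should ring."""

    def __init__(self, tunes):
        self.tunes = dict(tunes)   # user -> that user's ring tune
        self.visiting = {}         # visitor -> host whose office they are in

    def visit(self, visitor, host):
        """Record that `visitor` is now in `host`'s office."""
        self.visiting[visitor] = host

    def unvisit(self, visitor):
        """Cancel a visit; harmless if no visit was registered."""
        self.visiting.pop(visitor, None)

    def ring_targets(self, callee):
        """Return (office, tune) pairs for an incoming call to `callee`."""
        offices = [callee]                    # the callee's own office always rings
        host = self.visiting.get(callee)
        if host is not None and host != callee:
            offices.append(host)              # the host's office rings too
        tune = self.tunes.get(callee, "default")
        return [(office, tune) for office in offices]


reg = VisitRegistry({"PolleZ": "polle-tune", "Terry": "terry-tune"})
reg.visit("PolleZ", "Terry")
print(reg.ring_targets("PolleZ"))  # [('PolleZ', 'polle-tune'), ('Terry', 'polle-tune')]
print(reg.ring_targets("Terry"))   # [('Terry', 'terry-tune')]
reg.unvisit("PolleZ")
print(reg.ring_targets("PolleZ"))  # [('PolleZ', 'polle-tune')]
```

Keeping the tune tied to the callee rather than to the office is what lets Doug tell that a call ringing in his office is not for him.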
8. in Polle's office
Although having a logged-in workstation next to an Etherphone allows all of the fancy screen interactions you've seen so far, the workstation isn't needed for the basic Etherphone functions of making and receiving calls. In fact, the Etherphone is not physically connected to the workstation at all; instead, there is a switching database that records which Etherphone is adjacent to which workstation. To demonstrate this, I'll now turn my workstation off and place a call. Incoming calls will still ring with my distinctive tune, and calls from my telephone will still be identified as being from me.
Turn off machine, pick up handset.
Since I now use directories all the time, I don't remember anyone's phone number. Luckily, I don't need to, because directories are available from the telephone as well as from the workstation. You just punch in the person's name using the letters printed on the keys. If you're very familiar with the keypad, you know that Q and Z have been left out; we just put Q on key 7 with PRS and Z on key 9 with WXY. I'll call my exercise partner, Maureen Stone.
call * 0 S T O N E #
Hi Maureen, I'll meet you in the gym in about 15 minutes.
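A minimal sketch of dial-by-name from the keypad, using the letter layout described above with Q added to key 7 and Z added to key 9. The `*0` prefix and `#` terminator are copied from the keystroke line in this scene; treating them as a fixed prefix and terminator is an assumption. The function names and the sample directory are hypothetical.

```python
# Keypad letters as described in the script: standard layout plus Q on 7, Z on 9.
KEYPAD = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
          "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def name_to_keys(name):
    """Translate a name to the digits a caller would punch; non-letters are skipped."""
    return "".join(LETTER_TO_KEY[c] for c in name.upper() if c in LETTER_TO_KEY)


def dial_sequence(name):
    """Full keystroke string, assuming the '*0' prefix and '#' terminator from the script."""
    return "*0" + name_to_keys(name) + "#"


def matches(keys, directory):
    """Directory names whose key translation equals the digits entered."""
    return [name for name in directory if name_to_keys(name) == keys]


# Hypothetical directory entries.
directory = ["Stone", "Terry", "Swinehart"]
print(name_to_keys("Stone"))        # 78663
print(dial_sequence("Stone"))       # *078663#
print(matches("78663", directory))  # ['Stone']
```

Each key stands for three or four letters, so different names can translate to the same digits; a real directory would need some way to resolve such collisions.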
9. in Polle's office
Closing.
Future stuff: conferencing, collection of voice input data for computer-human interaction studies, voice annotation of documents for tutorials.
Summary:
ways to place calls (could send this as electronic message)
1. select name/number & click `phone'
2. phone name/number in Command Tool
3. click directory entry (home/office)
4. type name/number into Finch called party field & click
5. if last person you called, click Finch called party field
6. if last person to call you, click Finch calling party field
7. dial name/number on keypad
could look at FinchDoc online
Notes:
tiny Etherphone logos on all hardware components?
change tunes to singletons: Close Encounters, some other easily recognized (Ades?)
Centrex forward all calls to somewhere else, Dealer/msg announce no calls during demo.
change Finch to reverse video active conversations in log.
change Finch to remove ".." for node breaks; "[+]" instead?
how to deal with DECtalk fan hum? I think there's enough cable to roll it outside my office. want to show DECtalk, server??
hack breakpoint into Remember
edit reminder to remove the /audible
set up profile for fancier Remember icons
make special personal directory to ensure security for my friends
Phone Slave tape
copyright 1984, shown at CHI 85, 5 minutes
no lecturing: all explanations of the system appear in messages left by callers.
interaction 1: Jeff Kiest calls in
shows only Jeff in his office
PhoneSlave answers
Jeff leaves long message
"I understand you are doing research at the Architecture Machine group on a telecommunication system that uses speech recognition to identify callers. I would like copies of any reports you have on this topic."
interaction 2: Chris Schmandt at his workstation reading his mail
color touch screen, integrated voice and electronic messages
touch screen means less trouble for videotape to see what's being selected
form shows Time Name Subject Phone Message
touch bars in the form encode the length of the message
animated playback in bar for voice messages, changes color to indicate read
one (electronic) message is from Barry Arons re DTMF decoders
Chris opens his directory (visual form: Rolodex) to look Barry up
screen representation of touch pad, handset
phone Barry = off-hook, auto-dial selected number, beeps are audible
Barry's phone is busy, so Chris leaves a voice message to be played when he calls next
"Thanks for the DTMF decoder specs, Barry"
interaction 3: Barry calls in
view of voice recognizer LED panel: "Barry Arons"
PS: "Chris got your last message, and he left you this reply"
plays the voice message that Chris just recorded
view of screen playing message
Barry records new message for Chris re dinner
interaction 4: Chris calls in from a remote (pay) phone
voice and command recognition
PS: "This is Chris's telephone. Who's calling?"
Chris: "It's me."
view of voice recognizer LED panel: "Chris Schmandt"
voice message (from Barry?)
t-t-s: "You have a message from x"
Chris: "What did he say?"
view of voice recognizer LED panel: "What did he say?"
"I understand you're using text-to-speech synthesis to read electronic mail over the phone"
Chris: "What time was it?"
view of voice recognizer LED panel
Chris: "Take a reply"
view of voice recognizer LED panel
Chris: "dinner on Monday would be fine"
....
Chris: "next message"
view of voice recognizer LED panel
electronic message from someone
t-t-s: "You have a message from x. It's a long message."
Chris: "What's it about?"
view of voice recognizer LED panel
Chris: "next message"
view of voice recognizer LED panel
voice message from someone
"Hope things are going well with your videotape. Looking forward to your presentation at the conference."
interaction 5: Jeff Kiest calls back
view of voice recognizer LED panel: "Stranger 1"
PS: "Chris got your last message, and he left you this reply"
"I'm working on a videotape today. I'll be back in the office tomorrow."
leaves new message
"Good luck with your videotape."
Videotape guidelines
from Jan Walker
1. Clear sense of purpose.
Several possible purposes for making a tape (pick one and try to satisfy its requirements primarily):
- demonstration/existence proof
- advertisement
- didactic (tutorial)
- technical communication (expository)
I think these conference tapes want to be technical communications but
many of the tapes have mixed purposes. Many are a funny mixture of
existence proof and tutorial!
2. Establish context.
- who/where/what/why/machine
- problem or issues being addressed
- unusual assumptions or terminology
3. Production basics.
- adequate lighting
- in focus
- sound quality
- adequate levels
- no background music during technical discussion
- no sotto voce instructions to the camera person
4. Production fancies (i.e. contribute substantially to communication)
- "split" screen
  ----------------
  |    screen    |
  |              |
  ----------------
  | keybd | mouse|
  ----------------
- "subtitles" (very effective summary and focus mechanism)
5. Communications principles, much like standard scientific papers
- purpose
- principles on which work is based
- how each principle is reflected in the system
- final integration and summary
6. Pacing. Too slow is as common as too fast.
7. Exposition style. Boils down to "focus".
- always say what you are going to do before doing it
- tell viewer where to look and what to look for
- draw attention to significant events if several things happen
- make the point once effectively; avoid theme & variations that drag
8. Presentation style
- speak instead of reading (or maybe use professional actors)
- keep voice topic "in synch" with visual activities
- minimize talking head
- seek variety: face <-> screen <-> hands <-> "slides"
- keep asking "why a tape instead of a talk?"
from Maureen & Ken
have one person drive the mouse while another person talks