By: L. Stewart
Date: June 8, 1981 5:17 PM
I. Ethernet KTS
1. What
We propose to build a two- to five-station Ethernet key telephone system. Each station (an Etherphone I) would consist of an Alto I with an Auburn audio interface but no disk or display. The audio hardware would connect both to local audio devices (a telephone handset, a speakerphone, a mike and speaker) and to the telephone company office telephone line.
There would be a mechanical switch to bypass the Etherphone and connect the telephone handset directly to the line. This switch would normally connect the line to the Etherphone, but could be thrown in case of panic or debugging to leave the telephone working.
The basic idea of the Ether KTS operation would be to place telephone calls between Etherphones by transmitting the voice over the Ethernet, but to place calls to other, non-Etherphone locations by dialing out over the standard office line. Once the basic transmission facilities are in place, we can begin exploring the voice universe by slowly incorporating value-added functions into the system.
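The routing rule described above can be sketched in a few lines. This is only an illustration: the function name, the path labels, and the extension numbers are made up for the example and are not part of the proposal.

```python
def route_call(destination, etherphone_stations):
    """Pick the transmission path for a call placed from an Etherphone.

    Hypothetical helper: returns "ethernet" when the destination is another
    Etherphone station, and "office_line" when the call must be dialed out
    over the standard office telephone line.
    """
    if destination in etherphone_stations:
        return "ethernet"
    return "office_line"


# Made-up 4-digit extensions standing in for a small set of stations.
stations = {"4401", "4402", "4403"}

print(route_call("4402", stations))  # another Etherphone
print(route_call("4999", stations))  # a non-Etherphone location
```

The caller dials the same number in either case; only the path chosen by the Etherphone differs.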
A key ingredient of the design is 100% availability of telephone service.
2. Why
Why the Ethernet KTS? This question is really a very deep one, and involves both long range choice of research goals and shorter range pragmatics.
There are two sub-areas to CSL’s involvement with voice: telephony and system integration. Roughly speaking, telephony has to do with the transmission of voice data and with the elementary control functions of placing calls. Integration has to do with advanced control facilities, such as use of a data base to store numbers, and with the manipulation of voice data in cooperation with our other activities: voice Laurel messages and voice annotation of documents. Telephony does not greatly depend on integration, but integration depends a great deal on telephony. In order to explain our choice of the KTS, let us briefly examine three scenarios for audio in CSL: use of the centrex system, control of a PBX, and ``Etherphone.’’
Centrex
Centrex refers to our existing telephone system. It provides a single direct dial number for each office and has a few value-added features, such as call-forwarding and an attendant console. (One can forward calls to another number, and one’s phone transfers to the attendant automatically after three rings.)
Several years ago, a device called the Ross Box (after Bill Ross) was available. This device connected to the Diablo port of an Alto and permitted the machine to pick up and dial one’s office phone. In addition, an early version of the Alto audio board was used to build a simple voice message system. One could call a special number, tap out someone’s initials, and leave them a message. Put together, these components, or more modern versions thereof, provide quite a bit of functionality. Basically, the Ross Box equivalent provides a way of using all of our data base knowledge to look up phone numbers and dial them for us. (Instead of selecting ``Answer’’ to a Laurel message, select ``Call back.’’) The audio board device, in small numbers, provides a way of getting voice on and off the Ethernet, where our programs can work with it, file it, annotate with it, and so on. The A/D and D/A conversions are provided by a server, rather than by one’s own machine.
This system provides the capability for a number of really impressive applications: voice messages, voice annotation of documents, semi-automatic call placement, and so on. There are also some crippling disadvantages: our control of the operation of the voice transmission is limited and uncertain, and an important resource, one’s telephone, is tied up for extended periods. The second problem is really a consequence of the first.
In the centrex system, the only control available over the telephone system is obtained by electrically picking up the office telephone and electronically generating beeps. This can be made to work, but it is a slow and uncertain process. The progress of an attempted call is determined by the return of various noises over the voice path: ringing, dial-tone, busy, reorder, etc. It is quite difficult for a machine to sort out these noises, and they cannot be ignored, because calls do not always get through: a line is busy, the network fails, etc. In addition, the placing of a call requires several seconds: one or two to dial the call, perhaps one for the system to connect a local call, and as many as six seconds for ringing to be detected at the destination.
This means that for applications such as annotation of a document, one’s office phone is effectively tied up. Callers will get a busy signal! Placing and taking down calls every few seconds to try to avoid this circumstance might make the phone company very unhappy. Telephone exchanges allocate common equipment based on human frequencies of call placement. Machine speeds would upset the phone company’s ``traffic engineering.’’ We might obtain ``call-waiting’’ or similar functions from the phone company, but again, it would be hard for our machines to recognize the associated ``beep’’ and almost as hard to handle the situation in a reasonable manner.
Of course, we could buy a second telephone line for everyone, but the slowness and uncertainty of call placement would still be with us.
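The call-placement delays quoted above can be added up to show how long a machine-placed centrex call takes before any voice is exchanged. The sketch below uses only the figures in the text; the zero lower bound for ring detection and the five-second annotation length are assumptions made for the example.

```python
# (minimum, maximum) seconds for each stage of placing a centrex call.
DIAL_SECONDS = (1, 2)         # "one or two to dial the call"
CONNECT_SECONDS = (1, 1)      # "perhaps one for the system to connect a local call"
RING_DETECT_SECONDS = (0, 6)  # "as many as six seconds"; the 0 minimum is assumed


def setup_time_range(*stages):
    """Return the (best, worst) total setup time over the given stages."""
    best = sum(stage[0] for stage in stages)
    worst = sum(stage[1] for stage in stages)
    return best, worst


best, worst = setup_time_range(DIAL_SECONDS, CONNECT_SECONDS, RING_DETECT_SECONDS)
print(best, worst)  # 2 9

# Assumed example: a 5-second voice annotation recorded over such a call.
ANNOTATION_SECONDS = 5
print(worst + ANNOTATION_SECONDS)  # 14 seconds with the office line held
```

Under these assumptions, setup alone can take longer than the annotation itself, which is why the office phone ends up effectively tied up.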
Control of a PBX
In this scenario, we replace our present telephone system with a commercial PBX, and use a computer to control the operation of an otherwise standard telephone switch. In this system, the D/A and A/D conversion functions for manipulation of voice-as-data are done by a server, and call placement may be done either by manually dialing one’s telephone or by having one’s workstation instruct the PBX (possibly through another server) to place a call.
In this system, while calls outside the PBX would still be slow and uncertain, calls inside would be very fast and we would have easy access to the state of the system. If we want to know whether a number is busy, we just (digitally) ask the PBX. It would be possible to connect to a server for just a few seconds to record a voice annotation, while still remaining open for incoming calls. We could instruct the switch to do just about anything, such as forwarding calls after leaving one’s office, rather than before. At base, these functions are available because the phone system would be entirely under our control. We would still require an adequate number of servers to meet our D/A and A/D needs.
The key disadvantages of this system are that we do not have such a controllable PBX (getting one would cost quite a bit), and that before installing such a system, we would have to negotiate control of the switch with the vendor.
Etherphone
The basic premise of the Etherphone approach is that actual transmission of the voice data is done in digital form over the Ethernet. There are many variations, but the eventual system might provide each CSL member with a 20-40 chip microcomputer based telephone interfaced to the Ethernet. Connections to the outside telephone world would be done by servers with trunks to the phone company. This scenario has the advantage of complete control over the telephone transmission system. We benefit by the natural multiplexing of the ether and by direct access to voice-as-data. Control of the system is distributed; negotiation for a call might take place directly between the source and destination Etherphones.
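As an illustration of distributed control, the sketch below simulates call negotiation directly between two stations, with no central switch. It is an in-process toy, not a protocol design: the class, the method names, and the "ACCEPT" and "BUSY" replies are all hypothetical, and no Ethernet traffic is involved.

```python
class Etherphone:
    """Toy model of a station that negotiates calls directly with its peers."""

    def __init__(self, number):
        self.number = number
        self.peer = None  # number of the station we are talking to, if any

    def handle_offer(self, caller):
        """Destination side: answer a call offer with an explicit reply."""
        if self.peer is not None:
            return "BUSY"
        self.peer = caller.number
        return "ACCEPT"

    def place_call(self, destination):
        """Source side: offer a call to the destination station itself."""
        if self.peer is not None:
            return "BUSY"  # this station is already in a call
        reply = destination.handle_offer(self)
        if reply == "ACCEPT":
            self.peer = destination.number
        return reply

    def hang_up(self, other):
        """Take down the call at both ends."""
        self.peer = None
        other.peer = None


a, b, c = Etherphone("4401"), Etherphone("4402"), Etherphone("4403")
print(a.place_call(b))  # ACCEPT
print(c.place_call(b))  # BUSY, learned from a reply rather than a tone
a.hang_up(b)
print(c.place_call(b))  # ACCEPT
```

The point of the sketch is that the caller learns the outcome of the attempt from an explicit digital reply, rather than by interpreting noises on the voice path.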
The disadvantages of this system are the uncertainties of Ethernet voice (not too serious) and the major fact that a 20-40 chip Etherphone cannot be built, at least until a single-chip Ethernet controller is available.
Discussion
Well, where did the KTS proposal come from? It is a first step on the route to a full Etherphone system. We feel that the centrex option, control of the existing phone system, is unacceptable because it does not offer sufficient reliable functionality. We feel that the PBX route, control of a commercial telephone switch, is impracticable for us, because we do not have one. (However, someone ought to do it!) The third alternative, Etherphone, is difficult to pursue now because it is expensive to build an Etherphone today (although that will change).
The Ethernet KTS proposal is a combination of the centrex and Etherphone scenarios. By building a few expensive Etherphones now, we can give a few people all the benefits of the Etherphone, develop all the required protocols, and work on applications, while at the same time working towards our true goal of the 20-40 chip Etherphone for everyone. In addition, the KTS idea, wherein all the clients retain their original phone lines, avoids the problem that not everyone has an Etherphone. No one’s view of the phone system need change; the same 4-digit number still works, but for those with Etherphones, many value-added functions become available.
What about slight variations of this scheme?
One obvious way to avoid the expensive separate Etherphone is to place audio hardware in our workstations. This approach is probably fine for annotation of documents, but our workstations are not designed for 100% availability (You can’t get calls while you are in the debugger.) and they are not designed for real-time performance (Your call to your friend breaks up because the collector starts running.) Basically, if we want to use the system while a special program is running then workstation audio hardware is fine, but we can’t build a telephone system that way -- it has to work all the time.
One way to avoid using up Alto I’s is to construct stand-alone Etherphones out of commercial 16-bit microcomputers. At the present time, both the processor and the Ethernet interface would be full boards, the audio hardware would be a few extra chips, and a fairly bulky power supply and cabinet would be needed. On top of that, we would probably not have a very good development environment. Our early efforts would be greatly diverted by hardware and software development struggles.
Providing workstation audio hardware may still be a good idea. The proposed auxiliary board for the D-machines could include simple audio hardware for a small additional cost. While users of such hardware would not have full integration of all audio functions, they would be able to use voice messages, annotation, and so on by running special programs.