Voice vs. Data: Internet Issues
DRAFT - DRAFT - DRAFT - DRAFT
Inter-Office Memorandum
To: File
Date: July 24, 1981
From: L. Stewart
Location: Palo Alto
Subject: Voice vs. Data: Internet Issues
Organization: PARC/CSL
XEROX
Filed on: [Ivy]<Audio>LyDoc>VoiceVsData.bravo
Abstract
Real-time communications, and packet voice in particular, place substantially different demands on a datagram-based internet than data traffic does. Relevant facts about the requirements of voice communications are presented. The differing needs of real-time (voice) users and data users are discussed. Some possible ways of managing the operation of a combined voice and data internet are described. A concrete proposal for incorporating voice and other real-time applications into OISCP is made.
Facts about voice and telephony
Medium-bandwidth
Telephone-quality voice can be achieved with rates of 8000 bits per second or so, but at this writing the required techniques are computationally expensive. Intermediate bit rates are a possibility, but 64,000 bits per second represents the present telephone industry standard. For this reason, we restrict our attention to 64 Kbps telephone-industry-compatible speech. Such digital voice signals are produced by sampling the voice 8000 times per second and representing each sample as an 8-bit encoding of its amplitude.
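The 64 Kbps figure follows directly from the sampling parameters just given. A minimal sketch of the arithmetic follows; the function names are illustrative only, and the 160-byte payload is the packet size suggested later in this memo.

```python
SAMPLES_PER_SECOND = 8000  # sampling rate stated in the memo
BITS_PER_SAMPLE = 8        # 8-bit amplitude encoding stated in the memo


def voice_bit_rate(samples_per_second=SAMPLES_PER_SECOND,
                   bits_per_sample=BITS_PER_SAMPLE):
    """One-way bit rate of the sampled voice signal, in bits per second."""
    return samples_per_second * bits_per_sample


def packet_fill_time_ms(payload_bytes, bit_rate=None):
    """Milliseconds of speech needed to fill a packet payload of the given size."""
    if bit_rate is None:
        bit_rate = voice_bit_rate()
    return payload_bytes * 8 * 1000 / bit_rate


if __name__ == "__main__":
    print(voice_bit_rate())          # 64000
    print(packet_fill_time_ms(160))  # 20.0
```

A 160-byte payload therefore holds 20 milliseconds of speech, which is the packetization delay the sender adds before the packet can even leave.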
Real-time
Voice communication from human to human (telephony) is a real-time communications problem. The perceived delay must be fairly small and constant. Tolerable delays are generally below 100 milliseconds [Notes on the Network].
TASI advantage
While a conversation between people is usually full duplex (both people can talk at once), usually only one participant at a time is speaking. In addition, when a person is speaking, there are often gaps between words or sentences. On the other hand, both participants occasionally speak at once. Over conversations in general, something like 47% of the full-duplex channel is used.
The laws of large numbers apply to these statistics. Useful data points derive from the telephone industry's use of Time Assigned Speech Interpolation (TASI), in which a certain number of trunk circuits (e.g. transoceanic cable circuits) are overcommitted. If 24 full-duplex trunks are available, usually 36 conversations can be supported, for a ratio of 1.5. If 150 circuits are available, 300 conversations can usually be supported, for a ratio of 2.0. [Notes on the Network, BSTJ] These statistical effects are usually referred to as the TASI advantage.
Error-tolerant
To a certain extent, the human ear is tolerant of distortion in speech. For digital speech this means that, to a certain extent, the ear is tolerant of errors in the digital representation of speech.
Consistent service
Consistent service is perhaps the wrong title for this concept. Once set up, a voice connection should maintain an adequate quality. In the presence of network overloading it would be better to reject (block) connection attempts altogether than to offer poor quality. (A corollary to this is that it is certainly better to block new calls than to degrade old ones.)
Other possible real-time applications
Connection to non-flow controlled data circuits
Suppose a medium-bandwidth (9600 bits per second to 56,000 bits per second) asynchronous (start-stop) serial line with no flow control is connected to an OIS gateway. If the internet side of the gateway cannot keep up, the gateway buffers will eventually overflow and data will be lost. This example exists today with the Research Internet Data Line Scanner (DLS); if a user of the dial-out services cannot keep up with the speed of the DLS line, data is lost.
This example is not strictly real-time. There are no particular delay requirements, but there is a bandwidth requirement. As a somewhat contrived example, suppose an internet is used to link two such non-flow controlled circuits (of the same speed). The delay introduced by the internet (by buffering and transit delay) can only increase. Once capacity on the outgoing circuit is left idle, the time lost can never be made up.
Connection to slow speed printers
It might be desired to transmit the bit-map for a raster printer in "real-time" through an internet, with some finite (less than full page) buffering at the exit from the internet.
The general idea of a real-time protocol
Suppose there is a producer of data for the real-time application that delivers data at a constant rate. The data is collected at the originating end until a full packet is accumulated. The packets are sent (at a constant rate) to the receiver, where the data is doled out (at a constant rate) to the consumer of data. Some amount of data, perhaps less than a packet’s worth, perhaps more, is buffered at the receiver to smooth out jitter in the arrival of packets.
Naturally there must be adequate average bandwidth to support the application. There must also be sufficiently low variation in the rate at which packets arrive at the receiver so that the receiver’s buffer never becomes empty.
For a printer application, an empty buffer might mean a missed scan line. For a voice application, an empty buffer might mean a momentary hiccup in the conversation. Probably neither case is absolutely catastrophic, but such hiccups must not occur at more than some acceptable rate.
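The trade-off between receiver buffering and hiccups can be sketched as follows. This is a simplified model under stated assumptions, not a protocol definition: packets are sent at a constant interval, each suffers its own network delay, and the receiver begins consuming a fixed time (`playout_delay_ms`, a hypothetical parameter) after the first packet arrives. A packet that arrives after the moment it is needed counts as one hiccup.

```python
def count_hiccups(network_delays_ms, packet_interval_ms, playout_delay_ms):
    """Count packets that arrive after the receiver needs them.

    Packet i is sent at i * packet_interval_ms and arrives network_delays_ms[i]
    later. The receiver starts consuming playout_delay_ms after packet 0
    arrives and needs packet i exactly i intervals after that.
    """
    if not network_delays_ms:
        return 0
    start = network_delays_ms[0] + playout_delay_ms
    hiccups = 0
    for i, delay in enumerate(network_delays_ms):
        arrival = i * packet_interval_ms + delay
        deadline = start + i * packet_interval_ms
        if arrival > deadline:
            hiccups += 1  # the buffer was empty when this packet was needed
    return hiccups


if __name__ == "__main__":
    delays = [10, 12, 35, 11, 50]  # made-up delays in milliseconds
    for playout in (0, 30, 40):
        print(playout, count_hiccups(delays, 20, playout))
```

More buffering at the receiver absorbs more jitter, at the cost of adding directly to the end-to-end delay that the listener perceives.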
To the extent that the types and numbers of errors in the internet are tolerable to the application, a real-time protocol does not need acknowledgements or retransmissions.
A voice protocol, in order to benefit from the TASI advantage, might detect periods of silence and utilize reduced bandwidth while silence is present.
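One simple way to detect silence is an energy threshold on each packet's worth of samples. The sketch below is illustrative only: it assumes the samples have already been decoded to signed linear amplitudes (telephone encodings are not linear), and the threshold value and the ("talk" / "silence") tuple format are hypothetical, not part of any protocol described here.

```python
def is_silent(frame, threshold=4.0):
    """Treat a frame as silence if its mean absolute amplitude is below threshold."""
    return sum(abs(sample) for sample in frame) / len(frame) < threshold


def encode_frames(frames, threshold=4.0):
    """Replace silent frames with a short marker; pass talking frames through.

    Returns a list of ("talk", samples) or ("silence", sample_count) tuples.
    """
    encoded = []
    for frame in frames:
        if is_silent(frame, threshold):
            # Only the duration of the silence needs to cross the network.
            encoded.append(("silence", len(frame)))
        else:
            encoded.append(("talk", list(frame)))
    return encoded


if __name__ == "__main__":
    frames = [[0, 1, -1, 0], [40, -35, 50, -42], [1, 0, 0, -2]]
    for kind, payload in encode_frames(frames):
        print(kind, payload)
```

A practical detector would need some hangover time so that the ends of words are not clipped; that refinement is omitted here.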
The needs of real-time vs. data users
Some general principles:
1) Data users must make forward progress.
2) Voice users must get the bandwidth they need, or none. It is better to block a phone call than to offer a bad connection. (Similar principles hold for other real-time applications.)
To illustrate the operation of the first principle, we offer two examples, the Ethernet and point-to-point links between gateways. On the Ethernet, everyone is equal. Generally speaking, when the load is low, everyone gets the bandwidth they need and the issue is moot. When the offered load exceeds the bandwidth of the net, the contending users share equally [Hupp & Shoch]. (In fact, the contending users share the actual bandwidth in proportion to the rate at which they become ready, see elsewhere.) For the point-to-point line case, the current gateway program allocates the line first come first served, and maintains a queue of packets to transmit on the line. If the queue exceeds a certain limit, packets are dropped. By symmetry, everyone's packets are dropped with equal probability. (Actually, gateways promote small packets; by generating great numbers of small packets, a client could get 100% of a phone line, locking out other users. I think this is a bug.)
The effect of all this, from the standpoint of a data-only internet, is that even when the communications capacity is greatly overcommitted, all users get at least some of it; thus all users make forward progress (if you wait long enough, your file transfer will finish).
Real-time communications (including voice) are fundamentally different. Once a connection is set up, it should get the bandwidth it needs. It is better to refuse service altogether than to use internet bandwidth providing poor service. Consider what would happen if an "equal-sharing" network were slowly loaded with telephone calls. As the first users pile on, everything works fine; there is enough capacity for all. At some point, the demand exceeds the supply and the sharing property of the network allocates the available bandwidth equally to all contending users. All the telephone calls fail at once! It would have been better to refuse service to the "last-straw" phone call, thus limiting the outrage.
These matters can be interpreted as optimizing an objective function. The objective function for data users might be the sum of the logarithms of the bandwidths per user: more bandwidth is better, the channel is shared equally, and getting zero bandwidth is infinitely bad. The objective function for real-time users might be the sum of step functions with the various jumps at the required bandwidths for the various users: more than a certain amount is OK, less than that amount produces nothing. The maximum of this function is achieved by allocating the requested bandwidth to each user until capacity is reached; other users get nothing (but may try again later). It is not clear how to combine voice and data users within this model without adding information on the relative worths of data and voice.
There is an interesting analogy here between data vs. voice users of communications and time sharing vs. personal computers. The capacity of a time-shared computer is allocated equally (usually) among contending users; the capacity of a collection of personal computers is allocated in "sufficient size" chunks up to the limit of the number of computers and none thereafter. In one case the advantage of adding capacity is that more people can work; in the other, everyone can still work, but their work gets done faster.
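The two objective functions can be written down directly. The sketch below is one reading of the paragraph above, with hypothetical function names and with bandwidths in Kbit per second; the 1000 Kbit capacity and the 20 calls of 64 Kbit each are example figures chosen to match numbers used elsewhere in this memo.

```python
import math


def data_objective(bandwidths):
    """Sum of log bandwidths: zero bandwidth for any user is infinitely bad."""
    if any(b <= 0 for b in bandwidths):
        return float("-inf")
    return sum(math.log(b) for b in bandwidths)


def realtime_objective(allocations, requirements):
    """Sum of step functions: a user counts only if fully satisfied."""
    return sum(1 for a, r in zip(allocations, requirements) if a >= r)


def admit_first_come(requirements, capacity):
    """Grant each request in full while it still fits; others get nothing."""
    allocations, remaining = [], capacity
    for r in requirements:
        if r <= remaining:
            allocations.append(r)
            remaining -= r
        else:
            allocations.append(0)  # blocked; may try again later
    return allocations


if __name__ == "__main__":
    capacity = 1000            # Kbit per second (example figure)
    calls = [64] * 20          # twenty 64 Kbit voice calls
    equal = [capacity / len(calls)] * len(calls)
    print(realtime_objective(equal, calls))                              # 0
    print(realtime_objective(admit_first_come(calls, capacity), calls))  # 15
    print(data_objective([500, 500]) > data_objective([900, 100]))       # True
```

Equal sharing gives every call 50 Kbit and satisfies none of them, while blocking the excess calls satisfies 15. For data users the logarithm has the opposite effect and favors the even split.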
Traffic Engineering
In the telephone industry, there is the notion of probability of blocking. Given a certain number of physical trunks between A and B, and certain statistics of the numbers and durations of calls placed, there will be a certain probability that all the trunks will be busy when a call arrives. Traffic engineering is the business of providing enough trunks so that the probability of blocking is acceptably low, subject to economic constraints. (Typically, users are charged more during periods of high load than at other times. This tends to even out the loading and raise the average utilization of the trunks.)
So far in our construction of internets, traffic engineering has amounted to persuading an organization to invest in capacity when its data communications become too slow, rather than resting on any objective grounds.
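One classical model for the probability of blocking is the Erlang B formula. This memo does not name a model, so the following is offered only as an illustration: it assumes calls arrive as a Poisson process, blocked calls are simply lost, and the offered load is expressed in erlangs (call arrival rate times mean call duration). The function names are illustrative.

```python
def erlang_b(offered_erlangs, trunks):
    """Blocking probability for the given offered load and number of trunks.

    Uses the standard recursion B(0) = 1, B(n) = A*B(n-1) / (n + A*B(n-1)).
    """
    blocking = 1.0
    for n in range(1, trunks + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def trunks_needed(offered_erlangs, target_blocking):
    """Smallest number of trunks whose blocking probability meets the target."""
    trunks = 0
    while erlang_b(offered_erlangs, trunks) > target_blocking:
        trunks += 1
    return trunks


if __name__ == "__main__":
    print(erlang_b(1.0, 2))          # 0.2
    print(trunks_needed(10.0, 0.01))
```

Traffic engineering in this model consists of measuring the offered load and choosing the trunk count (or, for an internet, the reserved real-time capacity) that keeps blocking below the chosen target.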
The advantage of combined real-time (voice) and data networks
How can the differing needs of voice users and data users be reconciled? In general, the system must recognize that the classes of users have differing needs and apply different "objective functions".
Proposals
Managing the bandwidth of a point to point line
Managing the bandwidth of an Ethernet
Principles
Load control -- statistical, explicit
Hints
Internet Issues
OISCP issues
Class of Service proposal for OISCP
2. Class of service. We will need some bits to use for internet voice. Consider the type-case of two 10 Mbit Ethernets connected by a point-to-point 1.5 Mbit link. There is plenty of bandwidth around, but it is not infinite. My proposal is for a "real-time" bit in the class of service field, with a few bits for "how much". The how-much field might be the log of the required bandwidth or something similar. (We might want a special bit for 64 Kbit voice; see "TASI" below.)
A pair of routers connected by a 1.5 Mbit line would have a parameter indicating that up to 1 Mbit of line capacity may be used for voice (or other real-time traffic), with the remainder reserved for data. When there is less than 1 Mbit of real-time traffic flowing, the idle capacity can be used for data datagrams (and the data queue empties faster), but when there is real-time traffic around, it gets reserved capacity. The routers keep an eye on packets coming in. Suppose the router sees a real-time, how-much=64 Kbit packet for a new source-destination pair. The router takes this as a hint that a new "stream" is being set up and makes a table entry "reserving" capacity for the connection. By using the how-much field together with the packet length, the router can predict when the next packet of the connection is expected. The table entry can be deleted (timed out) if the next packet doesn’t show up. (Thus there is no "stream setup" protocol; it is all done with hints.) When the reserved capacity is already committed to n streams and an (n+1)-th apparent stream shows up, the router drops the packet and sends an error reply, "no capacity now". What must happen in a system like this is that:
1) Data users must make forward progress -- some capacity must be left for the data users.
2) Voice (or other real-time) users get the capacity they need or none at all.
3) Already set-up real-time connections do not suffer as more users pile on. It is much better to refuse a "connection" as above than to screw up everybody else's phone calls by overloading the link.
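The hint mechanism might be sketched as the following router table. This is an illustration under stated assumptions, not a specification: the class and method names are hypothetical, the how-much field is taken as already decoded to bits per second, and an entry is timed out after three expected packet intervals (the multiplier is an arbitrary choice).

```python
class StreamTable:
    """Soft-state reservations for real-time streams on one outgoing line."""

    def __init__(self, realtime_capacity_bps, timeout_intervals=3.0):
        self.capacity = realtime_capacity_bps
        self.timeout_intervals = timeout_intervals
        self.streams = {}  # (source, destination) -> (rate_bps, expires_at)

    def _purge(self, now):
        """Delete entries whose next packet failed to show up in time."""
        expired = [k for k, (_, exp) in self.streams.items() if exp < now]
        for key in expired:
            del self.streams[key]

    def on_packet(self, source, destination, rate_bps, length_bytes, now):
        """Return True to forward the packet, False for "no capacity now"."""
        self._purge(now)
        key = (source, destination)
        others = sum(rate for k, (rate, _) in self.streams.items() if k != key)
        if others + rate_bps > self.capacity:
            return False  # refuse rather than degrade established streams
        # Packet length and rate predict when the next packet is expected.
        interval = length_bytes * 8 / rate_bps
        self.streams[key] = (rate_bps, now + self.timeout_intervals * interval)
        return True


if __name__ == "__main__":
    table = StreamTable(realtime_capacity_bps=128_000)
    print(table.on_packet("a", "b", 64_000, 160, now=0.00))  # True
    print(table.on_packet("c", "d", 64_000, 160, now=0.00))  # True
    print(table.on_packet("e", "f", 64_000, 160, now=0.00))  # False
    print(table.on_packet("e", "f", 64_000, 160, now=1.00))  # True
```

Because every entry expires on its own, a stream that simply stops sending releases its capacity without any teardown message, which is what makes a setup protocol unnecessary.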
It may be necessary to have an explicit voice bit (with "64-Kbit" subscript) rather than just "real-time". A typical phone call uses each half-duplex path slightly under 50% of the time. Overseas cable is overcommitted for this reason. A Time Assigned Speech Interpolation unit assigns you a trunk only when you are talking. Thus 24 actual 2-way trunks can carry 36 conversations, or 150 actual trunks can carry 300 conversations. (The laws of large numbers are not fully in gear with only 24 trunks.) I envision that a speech connection would send 50 160-data-byte packets per second while talking and would also send small packets at a lesser rate during silence in order to let the routers know (via the hint mechanism) that the "connection" was still there. The router could actually get away with allowing, say, 20 "connections" over the 1 Mbit of capacity rather than only 16. Only for brief periods would the offered load from the 20 conversations exceed 1 Mbit.
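A rough estimate of how often the overcommitted link would be overloaded can be made with a binomial model. The assumptions are mine, not measurements: each connection is talking independently with probability 0.5 (a round-up of "slightly under 50%"), and the small packets sent during silence are ignored.

```python
from math import comb


def overload_probability(connections, full_rate_slots, talk_probability):
    """Probability that more connections are talking than the link can carry.

    Assumes each connection talks independently with the same probability.
    """
    p = talk_probability
    return sum(
        comb(connections, k) * p**k * (1 - p) ** (connections - k)
        for k in range(full_rate_slots + 1, connections + 1)
    )


if __name__ == "__main__":
    # 20 admitted connections sharing capacity for 16 simultaneous talkers.
    print(overload_probability(20, 16, 0.5))
```

Under these assumptions, more than 16 of the 20 connections are talking at once about 0.13% of the time. Real conversations are not independent of one another in this simple way, so the figure indicates the order of magnitude only.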