EtherVP.bravo
Stewart September 9, 1981 5:49 PM
Ethernet Transmission
Precis: While we do not fully understand the details of Ethernet behavior under very high load, we are confident that the network behaves well (low delay) up to a load sufficient to build a usable Etherphone system. We intend to incorporate load management into the system to ensure that the Ether does not become overloaded.
Ethernet considerations
The access delay of an Ethernet depends on the number of stations waiting to transmit at the end of some other transmission, be it a successful packet or a collision. The delay (including the average time remaining of a transmission in progress when a new station becomes ready) is complicated, but it is very small if there is only one station desiring to transmit. In this case, the delay is only that of waiting for a previous transmission to finish. Obviously, if an Ethernet were perfectly scheduled, this condition would persist up to 100% load. Since an Ethernet is not scheduled, statistical fluctuations and retransmission delays result in several stations waiting to transmit on occasion even though the average is less than one. Measurements by T. Gonsalves of Stanford, while not exactly applicable to Etherphone, seem to show that the delays remain very low until the load reaches 60% to 80% of capacity. The exact position of the "knee" in the curve depends on the packet size in use.
Interactive voice (telephony) requires low delay, under 100 milliseconds and preferably well under. In a datagram system, this requirement may be translated into packet size. Shorter packets mean lower end-to-end delays since less time is spent in accumulating a packet. In an Ethernet, shorter packets are less efficient than longer ones for two reasons: shorter packets have relatively more overhead and less data, and shorter packets have a lower "knee" in the delay curve.
Interactive voice considerations
As was mentioned, interactive voice requires low delay, but not necessarily perfect reliability. The ear will not notice occasional garbles. Telephone quality voice is readily achieved by sampling the voice waveform 8000 times per second and encoding each sample in 8 bits, for a data rate of 64,000 bits per second. We are planning a system design of roughly 50 packets per second (160 voice bytes per packet). This figure gives 30 to 40 milliseconds end-to-end delay. The delay is made up of 20 milliseconds of packet assembly time, plus a minimum transmission delay, plus some "anti-jitter" buffering delay at the receiver to cover up momentary longer delays.
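The arithmetic behind these figures can be checked with a short sketch. The sampling rate, sample size, and packet rate come from the text above; the function name is ours and purely illustrative.

```python
# Figures taken from the text above.
SAMPLE_RATE_HZ = 8000       # voice samples per second
BITS_PER_SAMPLE = 8         # bits used to encode each sample
PACKETS_PER_SECOND = 50     # planned packet rate


def voice_packet_figures(sample_rate_hz, bits_per_sample, packets_per_second):
    """Return (bit rate, voice bytes per packet, packet assembly time in ms)."""
    bit_rate = sample_rate_hz * bits_per_sample
    bytes_per_packet = bit_rate // 8 // packets_per_second
    assembly_ms = 1000.0 / packets_per_second
    return bit_rate, bytes_per_packet, assembly_ms


bit_rate, bytes_per_packet, assembly_ms = voice_packet_figures(
    SAMPLE_RATE_HZ, BITS_PER_SAMPLE, PACKETS_PER_SECOND
)
print(f"{bit_rate} bits/s, {bytes_per_packet} voice bytes/packet, "
      f"{assembly_ms:.0f} ms assembly time")
```

The remaining 10 to 20 milliseconds of the end-to-end figure would be transmission and anti-jitter buffering, which this sketch does not model.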
The figure of 64,000 bits per second is a compromise in many ways. Higher rates would give higher voice quality at the expense of extra Ethernet bandwidth per conversation and higher voice file storage costs. Lower rates could be achieved by speech compression, but would not greatly improve the capacity of an Ethernet. The number of packets per second needed for an interactive conversation must be kept high to minimize delays. Required Ethernet bandwidth would be reduced by compression, but the efficiency would drop as well, due to the shorter packets.
Silence detection provides an opportunity to improve the voice capacity of an Ethernet. Usually only one of the two parties of a phone call is speaking. The peak capacity must be sufficient to allow both parties to speak at once, but sometimes neither party will be speaking. The average is around 50% utilization of the full duplex channel.
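A rough sketch of why compression buys less than it seems to: at a fixed packet rate, a lower voice bit rate shrinks only the payload, not the per-packet overhead. The 40-byte overhead below is our assumption for illustration (the memo gives no header size), and 8000 bits per second is used as an example compressed rate.

```python
PACKETS_PER_SECOND = 50        # planned packet rate, from the text

# ASSUMPTION: per-packet overhead (headers, checksum, and so on).
# The memo does not state this number; 40 bytes is a placeholder.
ASSUMED_OVERHEAD_BYTES = 40


def payload_efficiency(voice_bits_per_second,
                       packets_per_second=PACKETS_PER_SECOND,
                       overhead_bytes=ASSUMED_OVERHEAD_BYTES):
    """Fraction of each packet that is voice data."""
    payload_bytes = voice_bits_per_second / 8 / packets_per_second
    return payload_bytes / (payload_bytes + overhead_bytes)


def wire_bits_per_second(voice_bits_per_second,
                         packets_per_second=PACKETS_PER_SECOND,
                         overhead_bytes=ASSUMED_OVERHEAD_BYTES):
    """Bits per second actually placed on the Ether, overhead included."""
    payload_bytes = voice_bits_per_second / 8 / packets_per_second
    return (payload_bytes + overhead_bytes) * 8 * packets_per_second


for rate in (64_000, 8_000):
    print(f"{rate:>6} bits/s voice: "
          f"efficiency {payload_efficiency(rate):.2f}, "
          f"{wire_bits_per_second(rate):.0f} bits/s on the wire")
```

Under this assumed overhead, an eightfold reduction in voice rate cuts the wire rate by much less than eightfold, and the shorter packets also sit lower on the delay curve.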
Bottom line:
10 Mb Ethernet -- ~ 400 telephones
3 Mb Ethernet -- ~ 150 telephones
Why:
Only 22% of telephones busy during busiest hour (This number via Lynch)
Silence Detection -- 50% advantage (A little more, really)
Ether utilization selected to maintain low delay (Short packets mean lower delay but more packet overhead. Also, Ethernet starts to behave like slotted Aloha for short packets.)
Assumptions:
(More would give higher quality)
Anyway:
Nothing wrong with different Ethernets for different communities. Internet!
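The shape of the capacity estimate can be sketched as follows. The busy-hour fraction, silence detection figure, packet rate, and voice bytes per packet come from this memo; the per-packet overhead and the target Ether utilization are our assumptions, since the memo does not list the values it used (we pick 70%, inside the 60% to 80% range where delays are reported to stay low).

```python
# Figures taken from the memo.
VOICE_BYTES_PER_PACKET = 160
PACKETS_PER_SECOND = 50
BUSY_FRACTION = 0.22           # telephones busy during the busiest hour
DUPLEX_TALK_FRACTION = 0.50    # average use of the full duplex channel

# ASSUMPTIONS, not from the memo: placeholders for illustration only.
ASSUMED_OVERHEAD_BYTES = 40        # per-packet headers and checksum
ASSUMED_TARGET_UTILIZATION = 0.70  # load kept below the delay "knee"


def telephones_supported(ether_bits_per_second,
                         overhead_bytes=ASSUMED_OVERHEAD_BYTES,
                         target_utilization=ASSUMED_TARGET_UTILIZATION):
    """Estimate how many telephones one Ethernet can serve."""
    # Wire rate of one direction of a call while that party is talking.
    wire_bits_one_way = ((VOICE_BYTES_PER_PACKET + overhead_bytes)
                         * 8 * PACKETS_PER_SECOND)
    # Two directions per call, reduced by silence detection.
    bits_per_call = 2 * wire_bits_one_way * DUPLEX_TALK_FRACTION
    simultaneous_calls = (ether_bits_per_second * target_utilization
                          / bits_per_call)
    # Only a fraction of telephones are in use at the busiest hour.
    return simultaneous_calls / BUSY_FRACTION


for name, rate in (("10 Mb", 10_000_000), ("3 Mb", 3_000_000)):
    print(f"{name} Ethernet: about {telephones_supported(rate):.0f} telephones")
```

These particular guesses land near the 10 Mb figure but below the 3 Mb figure, so the values behind the bottom line evidently differ from ours; treat the sketch as the structure of the estimate, not a reconstruction of it.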
Transmission Issues
Compression: -- We don’t need it.
State of the art: 8000 bits/second, but hard
Doesn’t help Ethernet utilization much -- short packets are mostly overhead.
Plenty of disk capacity:
One T-300 => 9.5 hours of voice.
( 50 people, 11 minutes each! )
Packets per second limits usually get you before bits per second limits
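The disk figures above can be cross-checked with a little arithmetic. This sketch uses only numbers from the memo (64,000 bits per second, 9.5 hours, 50 people); the function names are ours, and the byte count it prints is simply what 9.5 hours of uncompressed voice implies, not a stated drive capacity.

```python
VOICE_BITS_PER_SECOND = 64_000   # uncompressed voice rate, from the memo


def storage_bytes(hours, bits_per_second=VOICE_BITS_PER_SECOND):
    """Bytes needed to store the given number of hours of voice."""
    return hours * 3600 * bits_per_second / 8


def minutes_per_person(hours, people):
    """Share of the recorded time available to each person, in minutes."""
    return hours * 60 / people


print(f"9.5 hours of voice: {storage_bytes(9.5) / 1e6:.1f} million bytes")
print(f"Shared by 50 people: {minutes_per_person(9.5, 50):.1f} minutes each")
```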
Echo suppression:
Some outside calls may need echo suppression/cancellation
LSI echo cancellers exist.