Page Numbers: Yes X: 530 Y: 10.5" First Page: 22
Columns: 1 Edge Margin: .6" Between Columns: .4"
Margins: Top: 1.3" Bottom: 1"
Line Numbers: No Modulus: 5 Page-relative
Heading:
PUP: AN INTERNETWORK ARCHITECTURE
4. Evolution, actual experience, and performance
The Pup architecture emerged against a background of Arpanet protocols. Many of its important ideas—and those of its key relative, TCP—first appeared during the course of a series of meetings of the International Network Working Group (IFIP TC-6 WG6.1) during 1973. Pup and TCP share a number of important principles, most notably that of reliable end-to-end transmission through an internet. Pup subsequently diverged from TCP as the desire for implementation within Xerox required decoupling it from TCP’s long and sometimes painful standardization process.
The fundamentals of the Pup design crystallized in 1974 and have remained essentially unchanged since then. During this interval many higher-level protocols have been developed, the implementations have evolved considerably, and the internetwork system has grown to include approximately 1000 hosts, attached to 25 networks of 5 different types, using 20 internetwork gateways. The system is in regular use, is quite stable, and requires little regular maintenance or attention.
From a functional point of view, this internetwork architecture has been able to fulfill the needs of a very diverse community. While the bulk of all traffic is carried by means of a few standard protocols, it has proven extremely valuable to be able to define new protocols—aiming at different points in the space of performance, cost, and functionality—and to fit them into the internet protocol hierarchy at any of several levels.
In terms of performance, the internetwork gateways impose very little overhead because they are so simple. In regions of the internet where multiple high-bandwidth local networks are interconnected directly by a single minicomputer-based gateway, there is almost no noticeable difference between intranet and internet performance. Total throughput in an individual gateway is high, ranging from 400 to 1000 kilobits per seconds (depending on the particular implementation), and the typical delay experienced by maximum-length Pups in the case just mentioned is 2 to 5 milliseconds.
These figures don’t represent limits to what is achievable—even with the relatively low-powered machines now being used as gateways—because the gateway software has not been highly tuned for this application but rather is based on general-purpose software packages that are also used in many other hosts. But the current performance is adequate because the internetwork traffic load is typically only a tiny fraction of the capacity of the underlying local network channels. There exists one Alto-based gateway that interconnects three 3-megabit per second Ethernet channels as well as several 9.6-kilobit per second leased lines and a Packet Radio interface. In general the bottlenecks are not the gateways but rather the slower communication channels; discard of Pups due to congestion in gateways is almost exclusively due to overload of the 9.6-kilobit per second lines.
As might be expected, most of the traffic in our local networks is intranetwork—that is, consisting of Pups whose source and destination are on the same network. For example, measurement of one such network has shown a typical volume of 2.2 million packets per day, 72 percent of which are intranetwork packets [Shoch & Hupp, 1979]. Furthermore, of the remaining 28 percent, more than half consist of traffic to or from another nearby local network connected via a single gateway. (This site is served by multiple local networks because it is too large to cover with a single one using existing Ethernet technology, and also because it would exhaust a single network’s address space.) The rest of the traffic—some 250,000 packets per day—is transported to or from other campuses in the internet, mostly via the leased line network.
The higher-level protocols, such as the Byte Stream and FTP, are generally limited in performance by the processor capacity or the secondary storage bandwidth at the source and destination. For example, our BCPL implementation of BSP can maintain a data stream at the rate of about 500 kilobits per second between end processes running on Alto minicomputers, at which point both machines are CPU-bound. While it is certainly adequate for most applications, we find this performance somewhat disappointing, and we view it as an indication that BSP—though substantially simpler than, say, TCP—is still too complicated a protocol for high-performance communication.
The Pup architecture allows individual networks to be added to the internet system on an ad hoc basis, with no need for central control or coordination except to assign new network numbers. Users sharing a local network can assemble gateways and lease lines to other nearby gateways; they are encouraged to make multiple intergateway connections to provide alternate routes and thereby reduce the probability of being isolated. The gateway software has evolved to the point where if one starts a copy of it on a host having at least one connection to the existing internet, it will automatically obtain the files and other information it needs, announce its availability to the rest of the internet, and begin forwarding Pups.
5. A retrospective critique, possible improvements, and future research
While the architecture works extremely well, there are some lessons to be learned from this experience.
5.1. Addressing and routing
The size of address fields is a question of continuing controversy. An eight-bit network number supports up to 256 nets; that is fine for now, but eventually it should be made larger. To date, 256 hosts per net has not been a problem, though it is likely to become one (for example, when the Arpanet’s new 24-bit addressing convention starts to receive wide use). We have avoided variable-length address fields in the Pup design because they increase per-packet processing costs.
If an internetwork system becomes extremely large, the number of networks becomes so great that it is no longer practical for all hosts to keep routing table entries for all possible destination networks. Area routing strategies may be employed to attack this problem [McQuillan, 1974]. Alternatively, one may adopt a scheme in which the local routing table becomes a cache of recently used routing information, with routes to specific networks computed and maintained as needed. The problem of locating routes to distant parts of the internet is an area of current research.
One could consider revising the entire notion of a hierarchical address space. Under the current design, it is sometimes necessary to change the host number of a machine which is moved from one net to another—an operational annoyance. It is conceivable that every host could be given a unique address within a flat address space; a more sophisticated mechanism would then be needed to map addresses into routes, since there would no longer be a network number as part of the address (except perhaps as a hint, to improve performance).
We view with some disfavor non-hierarchical organizations in which internet addresses consist of a concatenation of network-specific addresses [Sunshine, 1977b]. Such arrangements have the effect of fixing the path to a given destination and blur the distinction between addressing and routing.
Socket numbers, which are now 32 bits wide, could easily shrink to 16. Originally, 32 bits were assigned to allow inclusion of a unique subfield to distinguish among multiple instantiations of a connection; we now recognize that a better approach is to use a distinct connection identifier at the time a connection is established, as mentioned earlier in the presentation of the Rendezvous & Termination Protocol.
Using hop counts as the metric for routing decisions has worked remarkably well. An obvious drawback, however, is that it considers a hop through a 9.6-kilobit per second phone line equally as good as a hop through a 3-megabit per second Ethernet link. As the topology becomes more richly connected, this will increasingly become a problem. We intend eventually to change the routing algorithms to reflect some consideration of bandwidth and delay, using known techniques based on research into adaptive distributed routing algorithms in the Arpanet and elsewhere.
We have given little consideration to source routing or other forms of advice (e.g., class of service) provided to the internet routing procedures by source processes. In providing such facilities, one must take great care not to compromise the simplicity of the basic internet datagrams or violate the layering of protocols.
5.2. Congestion control and utilization of low-bandwidth channels
The current congestion control techniques must be regarded as primitive. Discarding Pups and (where possible) notifying the source process when congestion occurs has the virtue of simplicity, and we believe it is a good general approach; but the present design has several defects. Insufficient information is returned to the source process to enable it to make an informed decision about how to proceed; further, the discard of Pups is haphazard, and no provision is made for fairness. Congestion occurs most often at the entry to slow channels, and under overload conditions the perceived performance of paths through those channels is highly variable.
This is a situation in which it would be appropriate to perform a relatively large amount of computation per packet in order to optimize the utilization of the communication bandwidth. For example, the network-specific driver for a leased telephone circuit could examine the source and destination addresses of Pups to deduce the existence of ‘‘conversations’’, and use this information to share the slow channel more effectively. (The Arpanet IMPs deduce conversations in precisely this way, though for purposes having to do primarily with flow control rather than congestion control.)
In the same vein, techniques such as code compression, elimination and regeneration of identical internet headers in successive packets, etc., may be implemented in the network-specific drivers for the slow channels, with minimal impact on the end-to-end protocols. Such techniques are used widely in virtual circuit designs, and their applicability is sometimes cited as an advantage of virtual circuits over datagrams [Roberts, 1978]. But there is no reason they can’t be employed in a datagram-based internet, so long as the necessary additional computation is done in the right place.
The important point is that optimizing the utilization of the communication channel is appropriate only when the channel bandwidth is scarce compared to the computation required to perform such optimization. Where the processing capacity of the end machines is itself the scarce resource—as we have observed in the local network environment—such techniques are highly inappropriate.
5.3. Pup types in the internet header
The distinction between registered and unregistered Pup types at the level of internet datagrams has not turned out to be particularly useful, except in a few cases: Pups of type ‘‘error’’ and ‘‘trace’’ may be generated from within the internet without knowledge of the higher-level protocols being employed by the end processes.
5.4. Performance of reliable end-to-end protocols
Present implementations of the Byte Stream Protocol include fairly sophisticated adaptive flow control heuristics that also try to take note of any packets lost due to internet congestion. This approach has worked reasonably well in enabling a source to adapt to the conditions encountered along the path to a particular destination. However, use of networks with highly variable behavior—such as the wide-ranging delays experienced when using the Packet Radio network—can confound these heuristics. Under unusual circumstances, the flow control procedures have been observed to move suddenly into very unfavorable operating regions. The difficulty involving the Radionet has since been solved, but the general design of simple, effective flow control and congestion control procedures is just a very hard problem—particularly procedures intended to adapt dynamically to and make good use of different networks whose performance may vary by nearly three orders of magnitude.
The step from raw Pups to a byte stream may be too large. The Byte Stream Protocol does too much for many applications; it is complex enough that few systems have ever implemented the entire specification. As discussed previously, performance of the BSP—when compared to some other systems—is reasonable; but it does not give a user the full capacity of the underlying networks. In a high-bandwidth local network environment, paying attention to per-packet processing overhead is of extreme importance.
We have considered—but have not yet implemented—a proposal for an intermediate level of functionality: a Reliable Packet Protocol (RPP) that takes care of connection establishment and processes flow control information, but tries not to dictate how a client program should do buffer management. It ensures reliable delivery (i.e., each packet once and only once), but may deliver packets to the client out of order, and does not deliberately attempt to hide packet boundaries. A BSP connection—where that is what is desired—may then be re-implemented as a veneer on top of an RPP connection.
5.5. Access to the internet
The present Pup architecture can be characterized as ‘‘open’’: users and applications are permitted—and indeed encouraged—to take advantage of the internet for routine communication. Access to the internet is uncontrolled; as in many network designs, responsibility for access control rests with the host systems, and whatever accounting is performed is for the services rendered by individual servers. In our research and development environment this is ideal, but obviously in some other environments it might not be.
5.6. Conclusions
The success of Pup as an internetwork architecture depends on a number of important principles. Key among these is the layering of function in such a way that applications may make use of the internet at any of several levels, with the ability to choose among alternative protocols at each level or to develop new ones where necessary. Simple internetwork datagrams constitute the level at which media independence (through encapsulation) is achieved; they are also the unit of direct process-to-process communication. This is crucial both to flexibility and to performance, particularly in an internetwork environment dominated by relatively lightweight hosts and high-bandwidth local networks.
During 1976 the Pup internet reached a level of functionality roughly equivalent to that provided by the standard Arpanet protocols—byte streams, Telnet, and FTP. From that time to the present we have concentrated on building servers and constructing applications to access them through the internet. We are just beginning to explore that area of interprocess communication traditionally considered the domain of multiprocessors. Some interesting opportunities arise from the availability of 100 or so minicomputers interconnected by a 3-megabit per second broadcast channel, and by 10 or so similar clusters, all interconnected by a store-and-forward network. We believe that the Pup architecture serves as a good foundation for such investigations.
Acknowledgments
A large systems effort such as the development of Pup reflects the efforts of many different participants. Other people who have implemented parts of Pup and contributed ideas include Will Crowther, Yogen Dalal, Hal Murray, Bob Sproull, Larry Stewart, Jim White, and Graeme Williams.
We also wish to thank Danny Cohen, Dave Crocker, Bob Kahn, Jon Postel, and Carl Sunshine for their careful reading of an earlier draft of this paper.
References
[Cerf & Kahn, 1974]
Vinton Cerf and Robert Kahn, A Protocol for Packet Network Interconnection, IEEE Transactions on Communications, vol. Com-22 no. 5, May 1974.
[Cerf & Kirstein, 1978]
Vinton Cerf and Peter Kirstein, Issues in Packet-Network Interconnection, Proceedings of the IEEE, vol. 66 no. 11, November 1978.
[Cohen, 1977]
Dan Cohen, Issues in Transnet Packetized Voice Communication, 5th Data Communications Symposium, Snowbird, Utah, September 1977.
[Crocker et al., 1972]
S. D. Crocker, J. F. Heafner, R. M. Metcalfe, and J. B. Postel, Function-oriented Protocols for the ARPA Computer Network, AFIPS Conference Proceedings (SJCC), vol. 40, 1972.
[Dalal, 1977]
Yogen K. Dalal, Broadcast Protocols in Packet Switched Computer Networks, Technical Report 128, Stanford University Digital Systems Laboratory, April 1977.
[Dalal & Metcalfe, 1978]
Yogen K. Dalal and Robert M. Metcalfe, Reverse Path Forwarding of Broadcast Packets, Communications of the ACM, vol. 21 no. 12, December 1978.
[Data General, 1971]
Data General Corp., Type 4038 Multiprocessor Communications Adapter, Technical Reference 014-000002-01, September 1971.
[Davidson et al., 1977]
J. Davidson, W. Hathaway, J. Postel, N. Mimno, R. Thomas, and D. Walden, The Arpanet Telnet Protocol: Its Purpose, Principles, Implementation, and Impact on Host Operating System Design, 5th Data Communications Symposium, Snowbird, Utah, September 1977.
[Feinler & Postel, 1978]
E. Feinler and J. Postel, eds., Telnet Protocol Specification, Arpanet Protocol Handbook, January 1978.
[Kahn et al., 1978]
R. E. Kahn, S. A. Gronemeyer, J. Burchfiel, and R. C. Kunzelman, Advances in Packet Radio Technology, Proceedings of the IEEE, vol. 66 no. 11, November 1978.
[Kay, 1977]
Alan C. Kay, Microelectronics and the Personal Computer, Scientific American, vol. 237 no. 3, September 1977.
[McQuillan, 1974]
John M. McQuillan, Adaptive Routing Algorithms for Distributed Computer Networks, Harvard Ph.D. thesis, Report no. 2831, Bolt Beranek and Newman, May 1974.
[McQuillan and Walden, 1977]
John M. McQuillan and David C. Walden, The ARPANET Design Decisions, Computer Networks, vol. 1 no. 5, August 1977.
[Metcalfe, 1973]
Robert M. Metcalfe, Packet Communication, Harvard Ph.D. thesis, M.I.T. Project MAC TR-114, December 1973.
[Metcalfe & Boggs, 1976]
Robert Metcalfe and David Boggs, Ethernet: Distributed Packet Switching for Local Computer Networks, Communications of the ACM, vol. 19 no. 7, July 1976.
[Needham & Schroeder, 1978]
Roger Needham and Michael Schroeder, Using Encryption for Authentication in Large Networks of Computers, Communications of the ACM, vol. 21 no. 12, December 1978.
[Postel, 1980]
Jon Postel, Internetwork Protocols, IEEE Transactions on Communications, this issue.
[Roberts, 1978]
Lawrence G. Roberts, The Evolution of Packet Switching, Proceedings of the IEEE, vol. 66 no. 11, November 1978.
[Shoch, 1978]
John F. Shoch, Internetwork Naming, Addressing, and Routing, 17th IEEE Computer Society International Conference (CompCon), September 1978.
[Shoch, 1979a]
John F. Shoch, Packet Fragmentation in Internetwork Protocols, Computer Networks, vol. 3 no. 1, February 1979.
[Shoch, 1979b]
John F. Shoch, Design and Performance of Local Computer Networks, Stanford Ph.D. thesis, University Microfilms, August 1979.
[Shoch & Hupp, 1979]
John F. Shoch and Jon A. Hupp, Performance of an Ethernet Local Network—a Preliminary Report, Local Area Network Symposium, Boston, May 1979.
[Shoch & Stewart, 1979]
John F. Shoch and Lawrence Stewart, Interconnecting Local Networks via the Packet Radio Network, 6th Data Communications Symposium, Pacific Grove, November 1979.
[Sproull & Cohen, 1978]
Robert F. Sproull and Dan Cohen, High-Level Protocols, Proceedings of the IEEE, vol. 66 no. 11, November 1978.
[Sproull & Thomas, 1974]
Robert Sproull and Elaine Thomas, A Network Graphics Protocol, Computer Graphics, vol. 8 no. 3, Fall 1974.
[Sunshine, 1977a]
Carl Sunshine, Interconnection of Computer Networks, Computer Networks, vol. 1, 1977.
[Sunshine, 1977b]
Carl Sunshine, Source Routing in Computer Networks, ACM Computer Communications Review, vol. 7 no. 1, January 1977.
[Sunshine & Dalal, 1978]
Carl Sunshine and Yogen Dalal, Connection Management in Transport Protocols, Computer Networks, vol. 2, 1978.
[Swinehart et al., 1979]
Dan Swinehart, Gene McDaniel, and David Boggs, WFS: A Simple Shared File System for a Distributed Environment, Operating Systems Review, vol. 13 no. 5, November 1979.
[Teitelman, 1977]
Warren Teitelman, A Display-Oriented Programmer’s Assistant, Fifth International Joint Conference on Artificial Intelligence, Cambridge, Mass., August 1977; also available as Xerox PARC technical report CSL-77-3.
[Thacker et al., 1979]
C. P. Thacker, E. M. McCreight, B. W. Lampson, R. F. Sproull, and D. R. Boggs, Alto: A Personal Computer, Computer Structures: Readings and Examples (Siewiorek, Bell, and Newell, eds.), 1979.
[Zimmermann, 1980]
H. Zimmermann, The ISO Reference Model, IEEE Transactions on Communications, this issue.