Copyright Xerox Corporation 1979

Inter-Office Memorandum

To: Communication Protocols
Date: July 1, 1978
From: Ed Taft
Location: Palo Alto
Subject: Naming and Addressing Conventions for Pup
Organization: PARC/CSL

XEROX

Filed on: [Maxc1]PupName.press

This is a revision to a memo with the same title dated October 16, 1975. A small number of minor changes have been made, but the principal motivation for this revision is to re-issue the memo in Bravo and Press formats.

This memo still contains the original justifications for the design that led to the present implementation. Remarks based on our experience with this design are presented in a small font.

Introduction

The questions to be touched on in this memo are:

1. How are networks, hosts, and processes to be named? That is, what form of identification is to be used by someone attempting to establish contact with some service or resource? It is desirable that the naming convention we adopt require little or no user knowledge of host or process numbering conventions or (inter)network topology.

2. How are hosts and processes to be addressed? At the lowest level, a port address is merely a (network, host, socket) triple (a 48-bit number). However, a single host may be connected to more than one network or to the same network more than once, and a single process (or multiple identical instances of a process) may be associated with multiple ports on the same or different networks, possibly even on different hosts.

3. How does a host discover what network it is on and what hosts are gateways? How do we handle the degenerate cases in which a host does not know what network it is on and/or there are no accessible gateways?
(These questions are of particular importance in the PARC environment because it is impractical to attempt to store the necessary information permanently in each host.)

This memo discusses these questions in some detail and describes the conventions that have been adopted.

Host Names

Most internetworking protocols, if they discuss naming conventions at all, specify a simple hierarchical scheme in which each network has its own name space, independent of the name spaces of other networks. In order to reference an internetwork process address, it is necessary to specify (1) the network name, (2) the target host name (whatever it is called on that network), and (3) the name (or number) of the process, port, or whatever on that host.

In the case of the PARC networks, it seems desirable to adopt a somewhat different approach: that a single name space encompasses all networks. This really says two things:

1. A given host or process name refers to one and only one host or process (or possibly to a set of identical instances of a host or process).

2. For any given host or process that has more than one network address, all such addresses are associated with a single name.

Such a scheme has several benefits. First, the single name space allows users to specify a target host or process by means of a simple identifier rather than by a full pathname that assumes user knowledge of (inter)network topology. For example, to refer to the Ivy file server, it is necessary to specify only a single name (e.g. "Ivy") rather than a full pathname such as "Parc-Net3+Ivy+3".

Second, even if a host resides on more than one network, or a process listens on several ports on the same or different networks, it is unnecessary for the user to make a decision (specified through a full pathname) distinguishing between these choices.
Since this is nothing more than a special case of message routing decisions that hosts and gateways must make anyway, it seems quite reasonable to leave such a choice completely in the hands of host and gateway software (unless there is a special reason the user wants to specify a full pathname).

Finally, in the case where there exists more than one identical instance of a particular host or process, it might be advantageous to give all such instances the same name. For example, if several hosts provide a "telephone directory" service, or several hosts are prepared to accept error statistics from Alto memory diagnostics, there is no reason not to give all instances of such a service the same name. One must take care in application of the "identicalness" criterion, however; e.g. it would be poor form if printer output for a resident of Building 35 were actually sent to a "printer" server in Building 34.

As our inter-network has grown, the obvious drawbacks of a universal non-hierarchical name space have made themselves felt. We now favor a hierarchical name space of at least two levels, but based on organizational or geographical considerations rather than on network topology.

With these considerations in mind, we have adopted the following naming conventions. Name-to-address translation is performed using a universal network directory. This directory is currently maintained as a file on Maxc (see the memo "The Pup Network Directory and the PUPNM JSYS") and is distributed automatically to all hosts (principally Gateways) that provide Name Lookup servers (see Miscellaneous Services).

Each entry in the directory consists of a list of names, a list of addresses, and a list of attributes.

A name is a string consisting of alphanumerics plus the characters '-' and '/' (others may be admitted by popular demand). Both upper and lower case alphabetics may be used; two names differing only in the case of their letters are considered identical.
If more than one name appears in an entry's name list, all the names are synonyms and may be used interchangeably.

An address is a triple consisting of a network number, a host number, and a socket number, as defined in the Pup specification. In the case that all three of these numbers are nonzero, the address completely specifies a Pup port. However, if one or more of the numbers is zero (i.e. unspecified), the address represents a subset of all possible ports.

The classes of addresses that we expect to find useful in practice are the following:

1. Network addresses, in which only the network number is specified.

2. Host addresses, in which the network and host numbers are specified.

3. Port addresses, in which all three numbers are specified.

4. Well-known sockets, addresses in which only the socket number is specified.

The addresses in a directory entry's address list describe alternative ways of accessing the entity associated with the entry (or multiple, identical instances of that entity). Accordingly, we require that all addresses in an address list have corresponding patterns of specified and unspecified elements.

An attribute is merely a pair consisting of an attribute name and an attribute value. The attribute name is composed according to the same rules as the names in a directory entry's name list, while the attribute value may be an arbitrary string.

In addition to the attribute list associated with an entire directory entry, individual addresses in the entry's address list may have attribute lists associated with them.

In practice, we have not used attributes in any interesting way. The present network directory records Alto locations and owners as attributes, but even these are not always up-to-date.
This is another unfavorable effect of utilizing a single, centrally-maintained data base.

A port name expression is composed of name strings and address constants joined by the operator '+'.

A name is one of the name strings defined in the network directory, as described above. Its value is the associated list of addresses.

An address constant is in the form network#host#socket, where the numbers are specified in octal. An element of this constant may be left unspecified by supplying zero or by leaving it out entirely. Leading '#'s may be omitted. For example, "0#0#3", "##3", and "3" all denote an address constant with network and host numbers unspecified and socket number 3, while "3##" denotes network number 3.

Names and address constants may be combined by means of the '+' operator, which is, roughly speaking, an intersection operator. Its effect is to make an expression whose value is more specific (i.e. contains fewer unspecified elements) than either of its operands. For example, the value of "3##+##123" is "3##123". If a particular element is specified in both operands but with conflicting values, the intersection is empty.

When either of the operands is a name whose value is a list of addresses, the resulting value is also potentially a list. For example, the value of "Maxc1" in the network directory is the list 1#1#, 2#1#, 3#200#, 4#40#. Hence the value of the expression "Maxc1+123" is the list 1#1#123, 2#1#123, 3#200#123, 4#40#123. However, the value of the expression "3##+Maxc1" (or "Parc-Net3+Maxc1") is a single address, 3#200#, since the intersections of the given address constant with the other addresses in the list are empty.

Port Addresses and Packet Destinations

Given a naming convention such as that just described, one might argue that a Pup should be allowed to have a name as a destination, or (an equivalent and more practical alternative) a list of addresses rather than a single address.
Hosts and gateways would be free to route packets to any one of the listed destinations, with the choice being made dynamically as network load or topology changed over time.

Such a scheme is unfortunately too cumbersome to be practical. Among other things, it would require the already over-long Pup header to be further extended. Furthermore, anomalies could arise in the case where a name referred, not to multiple addresses for the same process, but rather to multiple instances of a process.

It is the intent of this proposal, therefore, that the choice between alternative addresses be made at the time of the initial connection or rendezvous and remain fixed for the life of a "connection" (whatever we mean by that; e.g. as characterized by the Byte Stream Protocol). All packets are marked as arising from a single, specified source port and are delivered to a single, specified destination port.

Of course, the foregoing statement does not prohibit hosts from implementing ports (process connections) that will accept Pups from any of several sources or directed to any of several destinations. All existing implementations support such a "wildcard" capability.

Nor do we restrict the manner in which Pups may be routed between hosts and gateways. For example, if two hosts are connected to each other by two networks, it is perfectly reasonable to transmit over one network a Pup whose destination port specifies the other network. This capability permits communications to continue without interruption should one of the networks go down.

In practice, the existence of hosts with multiple inter-network addresses has led to significant complications in Pup addressing and routing.
We presently favor a scheme in which each host has a single, unique ID that is independent of the identity of the network(s) to which the host is connected. Pups directed to that host carry a destination network number as a hint as to how the host may be reached, but the hint does not participate in the identification of the host itself and may be changed at any time.

Degenerate Cases

An important consideration in the PARC environment is that no hosts besides gateways should need initially to know what network they are on. Given the portability of Alto disk packs and even of entire machines, there is no practical way to permanently build into each host the network number(s) of the connected network(s).

This consideration leads to two requirements:

1. Intra-network communication should be possible without any of the hosts involved knowing what network they are connected to (though they must somehow know that they are connected to the same network). Such a situation might arise if the network were in fact not connected to any other network, or if all available gateways were down.

2. For the purposes of initiating inter-network communication, it should be possible for a host to (a) discover what network it is on and (b) discover what hosts on that network are gateways, and to what other networks.

To meet these requirements, we first set down the convention that no network may have a network number of zero. A zero value in the source network number field of a Pup header is defined to mean that the sending host does not know what network it is connected to. A zero value in the destination network field indicates that the Pup's destination is on the same network as its source.

A user or user process desiring to initiate intra-network communication may transmit Pups to a specific host on that network, setting both source and destination network number fields to zero. In this case, the assumption is that somehow the initiator of this communication knows that the two hosts are on the same network.
Such "knowledge" might be obtained in various ways. For example, the (human) user might simply direct his Alto user process to establish contact with Ethernet host 123, which is a specific machine that he knows is on the same Ethernet as his own machine.

Note that the network directory described earlier is (strictly speaking) not usable by a host that does not know what network it is on, even if it is able to access that data base. That is, having looked up "Maxc2" and discovered that its possible addresses are 3#5# and 4#240#, we cannot do anything with this information since we don't know whether we are connected to network 3 or network 4 (or neither), and hence have no way of deciding whether to transmit to host 5 or host 240 on our "own" network, or even whether we can get there from here at all.

This difficulty would be eliminated by the unique host ID scheme mentioned earlier.

Network Determination and Gateway Location

All the above problems go away if the host somehow discovers what network(s) it is connected to. Additionally, to perform inter-network communication, it is necessary for a host to know both what network it is on and what hosts on that network are gateways. Accordingly, all hosts should, at initialization time (e.g. when "booted"), make use of the protocol about to be outlined. The actual specification of the protocol is given in the memo "Gateway Information Protocol".

We first observe that all gateways must know what networks they are connected to.
We then provide each gateway host with a gateway information service on a standard socket.

A host desiring to discover its local network number and gateways broadcasts a Gateway Information Request Pup to this socket at all (or a selected subset of) the hosts on its own network (i.e., setting the source and destination network fields to zero). A destination host number of zero is specifically reserved to distinguish a Pup that is being broadcast. All the operating gateways on that network then reply with a Gateway Information Reply Pup that includes the actual source and destination network numbers in the Pup header (since the gateway host always knows the identity of the network over which the request Pup arrived). The body of the message includes routing information indicating what other networks are accessible via that gateway.

The connected network number should now be saved by the requesting host for inclusion in the source network field of all subsequent outgoing Pups, and the routing information should be used for determining where to send packets addressed to hosts on other networks.
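
The discovery exchange just described can be illustrated with a small sketch in Python. This is only an illustration under stated assumptions, not the Gateway Information Protocol itself: the Port, Pup, and Host classes, the "request"/"reply" kind strings, the flat list of reachable networks in the reply body, and the value of GATEWAY_INFO_SOCKET are all hypothetical stand-ins (the actual socket number and packet formats are given in the protocol memo). What the sketch does take from the text is the use of zero network fields in the request, destination host zero for broadcast, and the rule that the host learns its network number from the reply header.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

UNKNOWN_NETWORK = 0        # network 0: "unknown" as a source, "same network" as a destination
BROADCAST_HOST = 0         # destination host 0 marks a Pup that is being broadcast
GATEWAY_INFO_SOCKET = 0o2  # placeholder value (assumption); see the protocol memo


@dataclass(frozen=True)
class Port:
    network: int
    host: int
    socket: int


@dataclass(frozen=True)
class Pup:
    kind: str                      # "request" or "reply"; stand-in for real Pup types
    source: Port
    dest: Port
    routes: Tuple[int, ...] = ()   # reply body: other networks reachable via the gateway


def make_request(my_host: int, my_socket: int) -> Pup:
    """Build a broadcast request; both network fields are zero."""
    return Pup("request",
               Port(UNKNOWN_NETWORK, my_host, my_socket),
               Port(UNKNOWN_NETWORK, BROADCAST_HOST, GATEWAY_INFO_SOCKET))


def gateway_reply(request: Pup, net: int, gateway_host: int,
                  reachable: Iterable[int]) -> Pup:
    """A gateway knows which network the request arrived on and fills it in."""
    return Pup("reply",
               Port(net, gateway_host, GATEWAY_INFO_SOCKET),
               Port(net, request.source.host, request.source.socket),
               tuple(reachable))


class Host:
    def __init__(self, host_number: int) -> None:
        self.host = host_number
        self.network = UNKNOWN_NETWORK
        self.routes: Dict[int, int] = {}   # remote network -> local gateway host

    def handle_reply(self, reply: Pup) -> None:
        # Save the connected network number and remember the first gateway
        # heard from for each remote network.
        self.network = reply.dest.network
        for net in reply.routes:
            self.routes.setdefault(net, reply.source.host)

    def next_hop(self, dest: Port) -> int:
        """Return the local host number to which a Pup for dest is transmitted."""
        if dest.network in (UNKNOWN_NETWORK, self.network):
            return dest.host               # destination is on our own network
        return self.routes[dest.network]   # KeyError if no gateway is known


# Example: an Alto (host 123 octal) boots and hears from a gateway on network 3.
alto = Host(0o123)
request = make_request(alto.host, 0o1000)
alto.handle_reply(gateway_reply(request, net=3, gateway_host=0o200,
                                reachable=[1, 2, 4]))
```

After the reply is handled, the host can resolve the "Maxc2" ambiguity described under Degenerate Cases: knowing that it is on network 3, it transmits directly to host 5 for the address 3#5#, and would send a Pup addressed to 4#240# by way of the gateway.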