\boldhead{0.2 Related Work}
The algorithms in this paper are intended to maintain a widely-replicated directory, or name look-up, database. Rather than using transaction-based mechanisms that attempt to achieve ``one-copy serializability'' (for example, [Gi]), we use mechanisms that drive the replicas towards eventual agreement. Such mechanisms were apparently first proposed by Johnson et al. [Jo] and have been used in Grapevine [Bi] and Clearinghouse [Op]. Experience with these systems has suggested that some problems remain; in particular, some updates (with low probability) fail to reach all sites. Lampson [La] proposes a hierarchical data structure that avoids high replication, but still requires some replication of each component, say across six to a dozen servers. Primary-site update algorithms for replicated databases have been proposed that synchronize updates by requiring them all to be applied at a single site; the update site then takes responsibility for propagating updates to all replicas. The DARPA domain system, for example, employs an algorithm of this sort [Mo]. Primary-site update avoids the update-distribution problems addressed by the algorithms in this paper, but suffers from centralized control.
Two features distinguish our algorithms from previous mechanisms. First, the previous mechanisms depend on various guarantees from underlying communications protocols and on maintaining consistent distributed control structures. For example, in Clearinghouse the initial distribution of updates depends on an underlying guaranteed mail protocol, which in practice fails from time to time due to physical queue overflow, even though the mail queues are maintained on disk storage. Sarin and Lynch [Sa] present a distributed algorithm for discarding obsolete data that depends on guaranteed, properly ordered, message delivery, together with a detailed data structure at each server (of size $O(n^2)$) describing all other servers for the same database. Lampson et al. [La] envision a sweep moving deterministically around a ring of servers, held together by pointers from one server to the next. These algorithms depend upon various mutual consistency properties of the distributed data structure, e.g., in Lampson's algorithm the pointers must define a ring. The algorithms in this paper merely depend on eventual delivery of repeated messages and do not require data structures at one server describing information held at other servers.
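To make this last point concrete, the following sketch (ours, not the paper's pseudocode) simulates one round of the anti-entropy exchange described earlier; the representation of each server's database as a dictionary mapping keys to timestamped values is an assumption for illustration. Note that each server reconciles with a uniformly chosen partner, no server stores any description of the state held elsewhere, and a round lost to message failure is simply retried in a later round.
\begin{verbatim}
import random

def anti_entropy_round(databases):
    """One simulated anti-entropy round.

    databases: a list of dicts, one per server, each mapping
    key -> (value, timestamp).  This flat membership list is the
    only shared structure; no server holds a data structure
    describing what the other servers contain.
    """
    for db in databases:
        # Independent, uniform random choice of partner.
        partner = random.choice([d for d in databases if d is not db])
        for key in set(db) | set(partner):
            entries = [d[key] for d in (db, partner) if key in d]
            # Push-pull exchange: both sides keep the newest entry.
            # A round lost to message failure is retried next round.
            db[key] = partner[key] = max(entries, key=lambda e: e[1])
\end{verbatim}
In the paper each server runs such an exchange periodically and independently; the loop above merely steps every server once for simulation.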
Second, the algorithms described in this paper are randomized; that is, there are points in the algorithm at which each server makes an independent random choice [Ra, Be85]. In contrast, the previous mechanisms are deterministic. For example, in both the anti-entropy and the rumor mongering algorithms, a server randomly chooses a partner. In some versions of the rumor mongering algorithm, a server makes a random choice to either remain infective or become removed. The use of random choice prevents us from making such claims as: ``the information will converge in time proportional to the diameter of the network.'' The best that we can claim is that, in the absence of further updates, the probability that the information has not converged decreases exponentially with time. On the other hand, we believe that the use of randomized protocols makes our algorithms straightforward to implement correctly using simple data structures.
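As an illustration of that simplicity, here is a minimal sketch (again ours; the push variant and the loss parameter $k$ below describe one possible configuration, not the paper's definitive algorithm) of a single rumor mongering step, showing both random choices: the choice of partner and the choice to remain infective or become removed.
\begin{verbatim}
import random

def rumor_step(server, servers, k):
    """One push rumor-mongering step for one server.

    server, servers: dicts with keys 'value' and 'infective'.
    k: after contacting a partner that already knows the rumor,
    the server becomes removed with probability 1/k (assumed
    loss rule for this sketch).
    """
    if not server['infective']:
        return
    # First random choice: a uniformly chosen partner.
    partner = random.choice([s for s in servers if s is not server])
    if partner['value'] == server['value']:
        # Second random choice: lose interest with probability 1/k.
        if random.random() < 1.0 / k:
            server['infective'] = False   # removed
    else:
        partner['value'] = server['value']
        partner['infective'] = True      # susceptible -> infective
\end{verbatim}
A larger $k$ keeps servers infective longer, trading extra traffic for a smaller chance that some site never hears the rumor.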