\boldhead{1.3 Anti-entropy}
The Grapevine designers recognized that handling failures of direct mail in a large network would be beyond people's ability. They proposed {\it anti-entropy} as a mechanism that could be run in the background to recover automatically from such failures [Bi]. Anti-entropy was not implemented as part of Grapevine, but the design was adopted essentially unchanged for the Clearinghouse. In its most basic form anti-entropy is expressed by the following algorithm periodically executed at each site $s$:
\vskip 20pt
\settabs 20 \columns
\+&FOR SOME $s^\prime \in S$ DO\cr
\+&&$\hbox{ResolveDifference}[s, s^\prime]$\cr
\+&&ENDLOOP\cr
\vskip 10pt
The procedure $\hbox{ResolveDifference}[s, s^\prime]$ is carried out by the two servers in cooperation. Depending on its design, its effect may be expressed in one of three ways, called {\it push}, {\it pull} and {\it push-pull}:
\vskip 20pt
\+&$\hbox{ResolveDifference}: \hbox{PROC}[s, s^\prime] = \{$ {\it --\null-- push}\cr
\+&&IF $s.\hbox{ValueOf}.t > s^\prime.\hbox{ValueOf}.t$ THEN\cr
\+&&&$s^\prime.\hbox{ValueOf} \leftarrow s.\hbox{ValueOf}$\cr
\+&&$\}$\cr
\vskip 20pt
\+&$\hbox{ResolveDifference}: \hbox{PROC}[s, s^\prime] = \{$ {\it --\null-- pull}\cr
\+&&IF $s.\hbox{ValueOf}.t < s^\prime.\hbox{ValueOf}.t$ THEN\cr
\+&&&$s.\hbox{ValueOf} \leftarrow s^\prime.\hbox{ValueOf}$\cr
\+&&$\}$\cr
\vskip 20pt
\+&$\hbox{ResolveDifference}: \hbox{PROC}[s, s^\prime] = \{$ {\it --\null-- push-pull}\cr
\+&&SELECT TRUE FROM\cr
\+&&&$s.\hbox{ValueOf}.t > s^\prime.\hbox{ValueOf}.t \Rightarrow s^\prime.\hbox{ValueOf} \leftarrow s.\hbox{ValueOf}$;\cr
\+&&&$s.\hbox{ValueOf}.t < s^\prime.\hbox{ValueOf}.t \Rightarrow s.\hbox{ValueOf} \leftarrow s^\prime.\hbox{ValueOf}$;\cr
\+&&&ENDCASE $\Rightarrow$ NULL;\cr
\+&&$\}$\cr
\vskip 10pt
For the moment we assume that site $s^\prime$ is chosen uniformly at random from the set $S$, and that each site executes the anti-entropy algorithm once per period. It is a basic result of epidemic theory that simple epidemics, of which anti-entropy is one, eventually infect the entire population. The theory also shows that starting with a single infected site this is achieved in expected time proportional to the log of the population size. The constant of proportionality is sensitive to which ResolveDifference procedure is used. For {\it push}, the exact formula is $\log_2 n + \ln n + O(1)$ for large $n$ [Pi].

It is comforting to know that even if mail fails to spread an update beyond a single site, anti-entropy will eventually distribute it throughout the network. Normally, however, we expect anti-entropy to distribute updates to only a few sites, assuming most sites receive them by direct mail. Thus, it is important to consider what happens when only a few sites remain susceptible. In that case the big difference in behavior is between {\it push} and {\it pull}, with {\it push-pull} behaving essentially like {\it pull}. A simple deterministic model predicts the observed behavior. Let $p_i$ be the probability of a site's remaining susceptible after the $i^{\rm th}$ cycle of anti-entropy. For {\it pull}, a site remains susceptible after the $(i+1)^{\rm st}$ cycle if it was susceptible after the $i^{\rm th}$ cycle and it contacted a susceptible site in the $(i+1)^{\rm st}$ cycle. Thus, we obtain the recurrence
$$ p_{i+1} = (p_i)^2 $$
which converges very rapidly to 0 when $p_i$ is small.
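To make the three variants concrete, the following minimal Python sketch (ours, not part of Grapevine or the Clearinghouse) implements {\it push}, {\it pull} and {\it push-pull} for sites whose database is reduced to a single timestamped value, runs synchronous anti-entropy cycles, and reports the fraction of sites still susceptible after each cycle. The names {\tt Site} and {\tt run} and the driver parameters are all hypothetical.
\vskip 10pt
import random

class Site:
    def __init__(self):
        self.value = None    # ValueOf
        self.t = 0           # ValueOf.t, the timestamp

def push(s, s_prime):
    # s_prime adopts s's value if s's is newer
    if s.t > s_prime.t:
        s_prime.value, s_prime.t = s.value, s.t

def pull(s, s_prime):
    # s adopts s_prime's value if s_prime's is newer
    if s.t < s_prime.t:
        s.value, s.t = s_prime.value, s_prime.t

def push_pull(s, s_prime):
    # whichever value is newer wins on both sides
    if s.t > s_prime.t:
        s_prime.value, s_prime.t = s.value, s.t
    elif s.t < s_prime.t:
        s.value, s.t = s_prime.value, s_prime.t

def run(resolve, n=1000, cycles=15):
    sites = [Site() for _ in range(n)]
    sites[0].value, sites[0].t = "update", 1       # one initially infective site
    residues = []
    for _ in range(cycles):
        for s in sites:                            # each site runs once per cycle
            resolve(s, random.choice(sites))       # partner chosen uniformly at random
        residues.append(sum(1 for s in sites if s.t == 0) / n)
    return residues

if __name__ == "__main__":
    random.seed(0)
    for name, variant in (("push", push), ("pull", pull), ("push-pull", push_pull)):
        print(name, [round(r, 4) for r in run(variant)])
\vskip 10pt
One would expect {\it pull} and {\it push-pull} to drive the residue of susceptible sites to zero within a couple of cycles once most sites are infected, while {\it push} leaves a residue that decays by roughly a factor of $e$ per cycle, in line with the recurrences in the text; the {\it push} recurrence is derived next.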
For {\it push}, a site remains susceptible after the $(i+1)^{\rm st}$ cycle if it was susceptible after the $i^{\rm th}$ cycle and no infectious site chose to contact it in the $(i+1)^{\rm st}$ cycle. Thus, the analogous recurrence relation for {\it push} is
$$ p_{i+1} = p_i \left( 1 - {1\over n}\right)^{n(1-p_i )} $$
which also converges to 0, but much less rapidly, since for very small $p_i$ (and large $n$) we have $\left(1 - {1\over n}\right)^{n(1-p_i)} \approx e^{-(1-p_i)} \approx e^{-1}$, so the recurrence is approximately
$$ p_{i+1} = p_i e^{-1} $$
This strongly suggests that in practice, when anti-entropy is used as a backup for some other distribution mechanism such as direct mail, either {\it pull} or {\it push-pull} is greatly preferable to {\it push}, which behaves poorly in the expected case.

As expressed here the anti-entropy algorithm is very expensive, since it involves a comparison of two complete copies of the database, one of which is sent over the network. Normally the copies of the database are in nearly complete agreement, so most of the work is wasted. Given this observation, a possible performance improvement is for each site to maintain a checksum of its database contents, recomputing the checksum incrementally as the database is updated. Sites performing anti-entropy first exchange checksums, comparing their full databases only if the checksums disagree. This scheme saves a great deal of network traffic, assuming the checksums agree most of the time. Unfortunately, it is common for a very recent update to be known by some but not all sites. Checksums at different sites are likely to disagree unless the time required for an update to be sent to all sites is small relative to the expected time between new updates. As the size of the network increases, the time required to distribute an update to all sites increases, so the naive use of checksums described above becomes less and less useful.

A more sophisticated approach to using checksums defines a time window $\tau$ large enough that updates are expected to reach all sites within time $\tau$. As in the naive scheme, each site maintains a checksum of its database. In addition, the site maintains a {\it recent update list}, a list of all entries in its database whose ages (measured by the difference between their timestamp values and the site's local clock) are less than $\tau$. Two sites $s$ and $s^\prime$ perform anti-entropy by first exchanging recent update lists, using the lists to update their databases and checksums, and then comparing checksums. Only if the checksums disagree do the sites compare their entire databases.

Exchanging recent update lists before comparing checksums ensures that if one site has received a change or delete recently, the corresponding obsolete entry does not contribute to the other site's checksum. Thus, the checksum comparison is very likely to succeed, making a full database comparison unnecessary. In that case, the expected traffic generated by an anti-entropy comparison is just the expected size of the recent update list, which is bounded by the expected number of updates occurring on the network in time $\tau$. Note that the choice of $\tau$ to exceed the expected distribution time for an update is critical; if $\tau$ is chosen poorly, or if growth of the network drives the expected update distribution time above $\tau$, checksum comparisons will usually fail and network traffic will rise to a level slightly higher than what would be produced by anti-entropy without checksums.
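To illustrate the windowed checksum scheme, here is a hypothetical Python sketch (ours; the text specifies neither a checksum function nor a data layout). Each database maps keys to (value, timestamp) pairs; the checksum is an order-independent XOR of per-entry hashes, so it can be recomputed incrementally on every update; {\tt TAU} plays the role of $\tau$.
\vskip 10pt
import hashlib
import time

TAU = 60.0   # seconds; must exceed the expected update-distribution time

def entry_hash(key, value, timestamp):
    # deterministic 64-bit hash of one entry, so checksums are comparable across sites
    digest = hashlib.sha1(repr((key, value, timestamp)).encode()).digest()
    return int.from_bytes(digest[:8], "big")

class Database:
    def __init__(self):
        self.entries = {}      # key -> (value, timestamp)
        self.checksum = 0

    def update(self, key, value, timestamp):
        old = self.entries.get(key)
        if old is not None and old[1] >= timestamp:
            return                                     # already have a newer entry
        if old is not None:
            self.checksum ^= entry_hash(key, *old)     # retract the old contribution
        self.entries[key] = (value, timestamp)
        self.checksum ^= entry_hash(key, value, timestamp)

    def recent_updates(self, now):
        # entries whose age, by this site's clock, is less than TAU
        return [(k, v, t) for k, (v, t) in self.entries.items() if now - t < TAU]

def anti_entropy(db_s, db_s_prime, now=None):
    now = time.time() if now is None else now
    # 1. exchange recent update lists and fold them in (this also updates checksums)
    for k, v, t in db_s.recent_updates(now):
        db_s_prime.update(k, v, t)
    for k, v, t in db_s_prime.recent_updates(now):
        db_s.update(k, v, t)
    # 2. compare checksums; only on disagreement fall back to a full comparison
    if db_s.checksum != db_s_prime.checksum:
        for k, (v, t) in list(db_s.entries.items()):
            db_s_prime.update(k, v, t)
        for k, (v, t) in list(db_s_prime.entries.items()):
            db_s.update(k, v, t)
\vskip 10pt
The XOR-of-hashes checksum is only one convenient choice that supports incremental maintenance; any checksum that can be adjusted as entries are added and superseded would serve, and deletions (death certificates) are not modeled in this sketch.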
A simple variation on the above scheme, which does not require {\it a priori} choice of a value for $\tau$, can be used if each site can maintain an inverted index of its database by timestamp. Two sites perform anti-entropy by exchanging updates in reverse timestamp order, incrementally recomputing their checksums, until agreement of the checksums is achieved. While it is nearly ideal from the standpoint of network traffic, this scheme (which we will hereafter refer to as {\it peel back}) may not be desirable in practice because of the expense of maintaining an additional inverted index at each site.
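The peel-back variant might be sketched as follows, reusing the hypothetical {\tt Database} class above together with an index of entries ordered by timestamp; this is our interpretation of the scheme, not code from the Clearinghouse. The two sites stream entries to each other newest-first, each received entry is applied immediately (which adjusts the receiver's checksum), and the exchange stops as soon as the checksums agree.
\vskip 10pt
def peel_back(db_s, db_s_prime):
    def newest_first(db):
        # the inverted index by timestamp, most recent entries first
        return iter(sorted(((t, k, v) for k, (v, t) in db.entries.items()),
                           reverse=True))

    stream_s, stream_s_prime = newest_first(db_s), newest_first(db_s_prime)
    while db_s.checksum != db_s_prime.checksum:
        sent_something = False
        for stream, receiver in ((stream_s, db_s_prime), (stream_s_prime, db_s)):
            t, k, v = next(stream, (None, None, None))
            if t is not None:
                receiver.update(k, v, t)      # incrementally recomputes the checksum
                sent_something = True
        if not sent_something:
            break   # safety net: after a full exchange the copies (and checksums) agree
\vskip 10pt
A real implementation would presumably batch entries rather than send them one at a time; the cost noted in the text is that of keeping the timestamp-ordered index up to date at every site.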