\boldhead{2. Deletion and Death Certificates}
Using either anti-entropy or rumor mongering, we cannot delete an item from the database simply by removing a local copy of the item, expecting the {\it absence} of the item to spread to other sites. Just the opposite will happen: the propagation mechanism will spread old copies of the item from elsewhere in the database back to the site where we have deleted it. Unless we can simultaneously delete all copies of an obsolete item from the database, it will eventually be ``resurrected'' in this way.
To remedy this problem we replace deleted items with {\it death certificates}, which carry timestamps and spread like ordinary data. During propagation, when old copies of deleted items meet death certificates, the old items are removed. If we keep a death certificate long enough it eventually cancels all old copies of its associated item. Unfortunately, this does not completely solve the problem. We still must decide when to delete the death certificates themselves or they will ultimately consume all available storage at the sites.
One strategy is to retain each death certificate until it can be determined that every site has received it. Sarin and Lynch [Sa] describe a protocol for making this determination, based on the distributed snapshot algorithm of Chandy and Lamport [Ch]. Separate protocols are needed for adding and removing sites (Sarin and Lynch do not describe the site addition protocol in any detail). If any site fails permanently between the creation of a death certificate and the completion of the distributed snapshot, that death certificate cannot be deleted until the site removal protocol has been run. For a network of several hundred sites this fact can be quite significant. In our experience, there is a fairly high probability that at any time some site will be down (or unreachable) for hours or even days, preventing the distributed snapshot or site deletion algorithm from completing.
A much simpler strategy is to hold death certificates for some fixed time, such as 30 days, and then discard them. With this scheme, we run the risk of obsolete items older than the threshold being resurrected, as described above. There is a clear tradeoff between the amount of space devoted to death certificates and the risk of obsolete items being resurrected: increasing the time threshold reduces the risk but increases the amount of space consumed by death certificates.