Epidemic Algorithms for Replicated Database Maintenance


August 4, 1987 | Alan Demers, Dan Greene, Carl Hauser, Wes Irish, John Larson, Scott Shenker, Howard Sturgis, Dan Swinehart, and Doug Terry
This paper addresses the problem of maintaining mutual consistency among the replicas of a database in a large, heterogeneous, and slowly changing network. The authors propose several randomized algorithms for distributing updates and driving the replicas toward consistency, emphasizing simplicity and minimal guarantees from the underlying communication system. The algorithms are analogous to epidemics, and their behavior is analyzed using results from epidemiology. One of them, "rumor mongering," has been implemented in the Clearinghouse servers of the Xerox Corporate Internet, where it resolved long-standing problems of high traffic and database inconsistency.

The paper introduces three main methods for spreading updates:

1. **Direct Mail**: Each new update is immediately mailed to every other site. This is timely but unreliable, since messages can be lost and the originating site may not know all of its peers.
2. **Anti-entropy**: Every site regularly chooses another site at random and resolves differences with it by exchanging database contents. This is extremely reliable but expensive, because it examines the entire database (see the anti-entropy sketch below).
3. **Rumor Mongering**: A site that receives a new update treats it as a "hot rumor" and periodically chooses another site at random to share it with; once enough recipients have already seen the rumor, the site stops spreading it. This generates far less traffic than direct mail, at a small risk that an update misses some sites (see the simulation sketch below).

The authors analyze these algorithms in terms of the time required for an update to propagate to all sites and the network traffic generated. They also explore nonuniform spatial distributions that favor nearby sites in order to reduce network traffic, and they evaluate the algorithms through simulation and practical experience.

The motivation for the work comes from the Clearinghouse servers on the Xerox Corporate Internet, where high traffic and persistent inconsistency were serious problems. The authors describe the difficulties encountered with direct mail and anti-entropy, and propose remedies such as rumor mongering and the use of checksums to detect inconsistency cheaply and reduce network load.

The paper concludes with design criteria for "good" epidemics, namely residue (the fraction of sites that never receive the update), traffic, and delay, and it explores variations of rumor spreading. It also introduces the idea of backing up a complex epidemic with anti-entropy so that every update eventually reaches every site, and it emphasizes that combining rumor mongering with anti-entropy yields efficient and reliable consistency for replicated databases.
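To make the anti-entropy exchange concrete, here is a minimal Python sketch, not the paper's notation: it assumes each replica maps keys to (value, timestamp) pairs and that the newer timestamp wins, corresponding to a push-pull exchange in which both partners end up with the later entry for every key. The `Site` class and `resolve_difference` name are illustrative.

```python
import random

class Site:
    """Illustrative replica: a database mapping each key to a (value, timestamp) pair."""
    def __init__(self, name):
        self.name = name
        self.db = {}  # key -> (value, timestamp)

    def update(self, key, value, timestamp):
        self.db[key] = (value, timestamp)

    def resolve_difference(self, other):
        # Push-pull exchange: for every key, both replicas keep the entry
        # with the later timestamp.
        for key in set(self.db) | set(other.db):
            mine = self.db.get(key)
            theirs = other.db.get(key)
            if mine is None or (theirs is not None and theirs[1] > mine[1]):
                self.db[key] = theirs
            elif theirs is None or mine[1] > theirs[1]:
                other.db[key] = mine


def anti_entropy_round(sites):
    """One round: every site reconciles with one partner chosen uniformly at random."""
    for site in sites:
        partner = random.choice([s for s in sites if s is not site])
        site.resolve_difference(partner)
```

Running `anti_entropy_round` repeatedly drives all replicas to the same state; the paper's point is that this converges reliably but is too costly to run frequently, which is why it is paired with cheaper mechanisms.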
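The following is a small simulation sketch of the push variant of rumor mongering with the "lose interest with probability 1/k" rule described in the paper; the function name and return values are illustrative, and the Clearinghouse implementation differs in detail.

```python
import random

def simulate_rumor_mongering(num_sites, k=1, seed=None):
    """
    Push rumor mongering: site 0 starts with the update as a hot rumor.
    Each cycle, every site holding a hot rumor contacts one random site;
    if the contacted site already knows the rumor, the sender loses
    interest with probability 1/k.  Returns (cycles, residue), where
    residue is the fraction of sites that never received the update.
    """
    rng = random.Random(seed)
    knows = [False] * num_sites   # which sites have the update
    hot = [False] * num_sites     # which sites still treat it as a hot rumor
    knows[0] = hot[0] = True

    cycles = 0
    while any(hot):
        cycles += 1
        active = [s for s in range(num_sites) if hot[s]]  # snapshot for this cycle
        for s in active:
            partner = rng.randrange(num_sites - 1)
            if partner >= s:                 # choose a partner other than s
                partner += 1
            if knows[partner]:
                if rng.random() < 1.0 / k:   # recipient already knew the rumor
                    hot[s] = False           # sender loses interest
            else:
                knows[partner] = True        # partner learns the rumor
                hot[partner] = True          # ...and spreads it in later cycles

    residue = knows.count(False) / num_sites
    return cycles, residue
```

For example, `simulate_rumor_mongering(1000, k=2)` typically leaves only a few percent of sites without the update, which is why the paper backs such complex epidemics with occasional anti-entropy to guarantee that every site is eventually reached.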