Workload Analysis of a Large-Scale Key-Value Store

June 11-15, 2012 | Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, Mike Paleczny
This paper presents an in-depth analysis of the workload characteristics of Facebook's Memcached deployment, the world's largest key-value (KV) store. The study collects and analyzes more than 284 billion requests from five Memcached use cases over several days. The analysis covers request composition, size, and rate; cache efficacy; temporal patterns; and application use cases. It finds that Memcached's performance depends heavily on its workload and that the hit rate is influenced by factors such as the size of the cache pool. The GET/SET ratio is higher than previously assumed, and some applications behave more like persistent storage than a cache. Strong locality metrics do not always ensure a high hit rate, and there is still room for efficiency and hit-rate improvements in Memcached's implementation. The workload also shows diurnal and weekly variations. To help the community generate more realistic synthetic workloads, the paper provides a simple analytical model of the most representative workload, in which key and value sizes follow power-law distributions. The paper concludes that these findings offer valuable insight into the behavior of KV stores in large-scale systems and that further research is needed to improve their performance, scalability, reliability, cost, and power consumption.
The study also suggests that alternative replacement policies and improvements in memory allocation models could enhance the performance of KV stores.
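
To make the points about synthetic workloads and cache-pool size concrete, the following is a minimal Python sketch. It is not the paper's model. It assumes a Zipf-like key popularity, Pareto-distributed value sizes, and a plain item-count LRU cache, whereas Memcached manages memory in bytes with its own allocation scheme. The function names (generate_requests, lru_hit_rate) and all parameter values (get_fraction, alpha, num_keys, the 16-byte scale) are hypothetical placeholders chosen for illustration.

```python
import random
from collections import OrderedDict


def generate_requests(n, num_keys=10_000, get_fraction=0.97, alpha=1.2, seed=0):
    """Return n synthetic (op, key, value_size) requests.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Zipf-like popularity: the key with rank r is drawn with weight 1/r.
    weights = [1.0 / rank for rank in range(1, num_keys + 1)]
    keys = rng.choices(range(num_keys), weights=weights, k=n)
    requests = []
    for key in keys:
        op = "GET" if rng.random() < get_fraction else "SET"
        # Heavy-tailed value size in bytes; paretovariate() returns values >= 1.
        value_size = int(rng.paretovariate(alpha) * 16)
        requests.append((op, key, value_size))
    return requests


def lru_hit_rate(requests, capacity):
    """Replay requests through an LRU cache holding `capacity` items.

    Returns the fraction of GETs that hit. A GET miss is followed by a
    fill, as if the client fetched the value from a backing store.
    """
    cache = OrderedDict()
    hits = gets = 0
    for op, key, size in requests:
        if op == "GET":
            gets += 1
            if key in cache:
                hits += 1
                cache.move_to_end(key)  # mark as most recently used
                continue
        # SET, or fill after a GET miss: insert and evict the oldest if full.
        cache[key] = size
        cache.move_to_end(key)
        if len(cache) > capacity:
            cache.popitem(last=False)
    return hits / gets if gets else 0.0


if __name__ == "__main__":
    reqs = generate_requests(100_000)
    for capacity in (100, 1_000, 5_000):
        print(capacity, round(lru_hit_rate(reqs, capacity), 3))
```

Running the script prints the GET hit rate for a few cache capacities. Under these assumptions the hit rate never decreases as capacity grows, because an LRU cache of a given size always holds a subset of what a larger one would hold. Swapping in a different eviction rule in lru_hit_rate is one way to experiment with the alternative replacement policies the study mentions.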