Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud

2012 | Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein
Distributed GraphLab is a framework for machine learning and data mining in the cloud. It extends the GraphLab abstraction to a distributed setting while preserving strong data consistency guarantees. The framework introduces graph-based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effects of network latency. Fault tolerance is achieved using the Chandy-Lamport snapshot algorithm, which is implemented within the GraphLab abstraction. The framework is evaluated on a large Amazon EC2 deployment, showing 1-2 orders of magnitude performance gains over Hadoop-based implementations. The GraphLab abstraction consists of three main parts: the data graph, the update function, and the sync operation. The data graph represents user modifiable program state and stores both the mutable user-defined data and encodes the sparse computational dependencies. The update function represents the user computation and operates on the data graph by transforming data in small overlapping contexts called scopes. The sync operation concurrently maintains global aggregates. The framework supports asynchronous iterative computation, dynamic computation, and serializability. It provides two substantially different approaches to implementing the new distributed execution model: the Chromatic Engine, which uses graph coloring to achieve efficient sequentially consistent execution for static schedules, and the Locking Engine, which uses pipelined distributed locking and latency hiding to support dynamically prioritized execution. Fault tolerance is achieved through two snapshotting schemes. The framework is evaluated on three state-of-the-art MLDM applications: collaborative filtering for Netflix movie recommendations, Video Co-segmentation (CoSeg), and Named Entity Recognition (NER). The results show that GraphLab outperforms Hadoop by 20-60x and matches the performance of tailored MPI implementations. 
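The update-function model above can be illustrated with a minimal sketch. The class and function names below (`Scope`, `pagerank_update`) are hypothetical stand-ins, not the actual GraphLab API; the serial loop stands in for the distributed engines and the returned vertex set models dynamic rescheduling:

```python
# Hypothetical sketch of a GraphLab-style update function; names are
# illustrative, not the real GraphLab API. An update function may only
# read and write its scope: the vertex, adjacent edges, and neighbors.

class Scope:
    """The small overlapping context one update function may touch."""
    def __init__(self, graph, ranks, vertex):
        self.vertex = vertex
        self.graph = graph
        self.ranks = ranks
        # In-neighbors: vertices with an edge pointing at `vertex`.
        self.in_neighbors = [u for u, outs in graph.items() if vertex in outs]

def pagerank_update(scope, damping=0.85):
    """PageRank as a GraphLab-style update: recompute one vertex's rank
    from its in-neighbors, and return the set of vertices to reschedule
    when the change is significant (dynamic computation)."""
    total = sum(scope.ranks[u] / len(scope.graph[u]) for u in scope.in_neighbors)
    new_rank = (1 - damping) + damping * total
    delta = abs(new_rank - scope.ranks[scope.vertex])
    scope.ranks[scope.vertex] = new_rank
    # Reschedule out-neighbors only if this vertex changed noticeably.
    return scope.graph[scope.vertex] if delta > 1e-3 else set()

# Tiny example graph: vertex -> set of out-neighbors.
graph = {0: {1, 2}, 1: {2}, 2: {0}}
ranks = {v: 1.0 for v in graph}

# A simple serial scheduler loop standing in for the distributed engines.
pending = set(graph)
while pending:
    v = pending.pop()
    pending |= pagerank_update(Scope(graph, ranks, v))
```

The key point of the model is visible here: computation is expressed per-vertex against a scope, and the schedule evolves dynamically as updates trigger further updates, rather than running in fixed bulk-synchronous rounds.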
The framework's performance scaling improves with higher computation-to-communication ratios, and the GraphLab abstraction more compactly expresses the Netflix, NER, and CoSeg algorithms than MapReduce or MPI.
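The idea behind the Chromatic Engine can be sketched briefly. This is an illustrative toy, not GraphLab's implementation: greedily color the graph so that no two adjacent vertices share a color, then execute each color class as one parallel phase, since same-colored vertices have non-adjacent (hence conflict-free) scopes:

```python
# Illustrative sketch of the Chromatic Engine's core idea (not actual
# GraphLab code): no two adjacent vertices share a color, so all
# vertices of one color can be updated simultaneously without their
# scopes conflicting on shared edges.

def greedy_color(adjacency):
    """Assign each vertex the smallest color unused by its neighbors."""
    colors = {}
    for v in adjacency:
        used = {colors[u] for u in adjacency[v] if u in colors}
        colors[v] = next(c for c in range(len(adjacency)) if c not in used)
    return colors

# Undirected toy graph: vertex -> set of neighbors (symmetric).
adjacency = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
colors = greedy_color(adjacency)

# Group vertices into color classes; each class is one parallel phase,
# and phases run one after another (a static schedule).
phases = {}
for v, c in colors.items():
    phases.setdefault(c, []).append(v)
```

Running the phases in sequence, with all vertices in a phase updated in parallel, yields a sequentially consistent execution without per-vertex locking; the trade-off, as the summary notes, is that this only supports static schedules, which is why the Locking Engine exists for dynamically prioritized execution.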