GraphLab: A New Framework For Parallel Machine Learning


Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph Hellerstein
GraphLab is a parallel framework for machine learning (ML) that addresses the challenges of designing and implementing efficient, provably correct parallel ML algorithms. Existing high-level abstractions such as MapReduce are insufficiently expressive for many ML workloads, while low-level tools such as MPI and Pthreads force ML experts to repeatedly solve the same design challenges. GraphLab improves on both by compactly expressing asynchronous iterative algorithms with sparse computational dependencies while ensuring data consistency and achieving high parallel performance. It targets patterns common in ML, notably sparse data dependencies and asynchronous iterative computation, and provides a high-level data representation that insulates users from the complexities of synchronization, data races, and deadlocks, letting ML experts design and implement efficient, scalable parallel algorithms by composing problem-specific computation, data dependencies, and scheduling.

The framework combines a graph-based data model that simultaneously represents data and computational dependencies, a set of concurrent access models that provide a range of sequential-consistency guarantees, a modular scheduling mechanism, and an aggregation framework for managing global state. Using an efficient shared-memory implementation, the authors build parallel versions of several popular ML algorithms: belief propagation, Gibbs sampling, Co-EM, Lasso, and compressed sensing.

GraphLab's data model consists of a directed data graph and a shared data table. The data graph encodes both the problem-specific sparse computational structure and the directly modifiable program state; the shared data table is an associative map between keys and arbitrary blocks of data. Update functions, the core element of computation, operate on the data associated with small neighborhoods in the graph, as the sketch below illustrates.
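To make the data-graph and update-function abstraction concrete, here is a minimal C++ sketch of a PageRank-style computation. All names (vertex_data, iscope, icallback, pagerank_update) are hypothetical, modeled on the concepts described in the paper rather than taken from the actual GraphLab API.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical vertex and edge state for a PageRank-style computation.
struct vertex_data { double value = 1.0; };
struct edge_data   { double weight = 1.0; };  // assumed pre-normalized

// A scope exposes a vertex plus its adjacent edges and neighbors -- the
// small neighborhood an update function may read and modify.
struct iscope {
  virtual vertex_data& vertex() = 0;
  virtual const std::vector<std::size_t>& in_neighbors() const = 0;
  virtual const std::vector<std::size_t>& out_neighbors() const = 0;
  virtual const vertex_data& neighbor(std::size_t id) const = 0;
  virtual const edge_data& in_edge(std::size_t from) const = 0;
  virtual ~iscope() = default;
};

// A callback through which the update function schedules further work.
struct icallback {
  virtual void add_task(std::size_t vertex_id) = 0;
  virtual ~icallback() = default;
};

// The core element of computation: read the scope, recompute the vertex
// value, and reschedule downstream vertices whose values may now be stale.
void pagerank_update(iscope& scope, icallback& scheduler) {
  double sum = 0.0;
  for (std::size_t nbr : scope.in_neighbors())
    sum += scope.in_edge(nbr).weight * scope.neighbor(nbr).value;

  vertex_data& v = scope.vertex();
  const double old_value = v.value;
  v.value = 0.15 + 0.85 * sum;

  // Asynchronous, dependency-driven scheduling: only reschedule
  // dependents if this vertex changed significantly.
  if (std::abs(v.value - old_value) > 1e-5)
    for (std::size_t nbr : scope.out_neighbors())
      scheduler.add_task(nbr);
}
```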
The sync mechanism aggregates data across all vertices in the graph, in a manner analogous to the Fold and Reduce operations in functional programming; a sketch of this pattern appears after the summary.

GraphLab provides three data consistency models: full consistency, edge consistency, and vertex consistency. These models let users trade parallel performance against the strength of the consistency guarantee.

The framework also supplies a rich collection of parallel schedulers, including a synchronous scheduler for Jacobi-style algorithms and a round-robin scheduler for Gauss-Seidel-style algorithms, along with a scheduler construction framework, the set scheduler, which enables users to safely and easily compose custom update schedules. A configuration sketch showing how consistency and scheduling are composed closes this summary.

GraphLab has been evaluated on large real-world problems, including retinal image denoising, Gibbs sampling, Co-EM, and Lasso. The results demonstrate that GraphLab achieves excellent parallel performance on large-scale problems.

GraphLab also has the potential to serve as an interface between the ML and systems communities: parallel ML algorithms built around the GraphLab API automatically benefit from developments in parallel data structures; new locking protocols and parallel scheduling primitives incorporated into the API become immediately available to the ML community; and systems experts can port ML algorithms to new parallel hardware simply by porting the GraphLab API.
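The sync mechanism mentioned above can be pictured as a Fold over all vertex data followed by an Apply that writes the aggregate into the shared data table. The sketch below reuses the hypothetical vertex_data type from the previous example and simplifies the shared-data-table entry to a single double; the real GraphLab interface differs in detail.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical accumulator for a global quantity: here, the total and
// count of all vertex values, from which we derive their mean.
struct accumulator { double sum = 0.0; std::size_t count = 0; };

// Fold: called once per vertex, combining vertex data into the accumulator.
// (vertex_data is the type from the previous sketch.)
void sum_fold(const vertex_data& v, accumulator& acc) {
  acc.sum += v.value;
  acc.count += 1;
}

// Apply: called once at the end, writing the aggregate into the
// shared-data-table entry.
void mean_apply(double& sdt_entry, const accumulator& acc) {
  sdt_entry = acc.count ? acc.sum / acc.count : 0.0;
}

// One sync pass, of the kind the engine might run periodically in the
// background: fold over every vertex, then apply the result.
double run_sync(const std::vector<vertex_data>& vertices) {
  accumulator acc;
  for (const vertex_data& v : vertices) sum_fold(v, acc);
  double mean = 0.0;
  mean_apply(mean, acc);
  return mean;
}
```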
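Finally, the consistency models and schedulers described in the summary amount to independent knobs on the execution engine. The enums and engine_options type below are invented for illustration, mirroring the concepts in the text rather than GraphLab's actual configuration interface.

```cpp
#include <cstddef>

// The three consistency models, strongest to weakest:
// full   -- exclusive access to the vertex, its edges, and its neighbors;
// edge   -- exclusive vertex/edge access, read-only neighbor access;
// vertex -- exclusive access to the vertex alone.
enum class consistency_model { full, edge, vertex };

// A few of the schedulers described in the summary.
enum class scheduler_type {
  synchronous,  // Jacobi-style: all vertices updated from the old state
  round_robin,  // Gauss-Seidel-style: fixed order, always-fresh values
  fifo          // dynamic: tasks run in the order they were added
};

// Hypothetical engine configuration: the user composes computation
// (the update function), consistency, and scheduling independently.
struct engine_options {
  consistency_model consistency = consistency_model::edge;
  scheduler_type    scheduler   = scheduler_type::fifo;
  std::size_t       num_threads = 8;
};

int main() {
  // Edge consistency is often a good trade-off: it serializes updates
  // that share an edge while letting distant vertices run in parallel.
  engine_options opts;
  opts.consistency = consistency_model::edge;
  opts.scheduler   = scheduler_type::round_robin;
  // engine.run(graph, pagerank_update, opts);  // hypothetical call
  return 0;
}
```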