June 2011 | Feng Niu, Benjamin Recht, Christopher Ré and Stephen J. Wright
The paper "HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent" by Feng Niu, Benjamin Recht, Christopher Ré, and Stephen J. Wright introduces a novel parallelization scheme for Stochastic Gradient Descent (SGD) called HOGWILD!. This scheme allows processors to access shared memory without locking, which is a common requirement for parallel SGD implementations. The authors argue that when the optimization problem is sparse, meaning that most gradient updates only modify small parts of the decision variable, HOGWILD! achieves a nearly optimal rate of convergence. They provide theoretical guarantees and experimental results demonstrating that HOGWILD! outperforms alternative locking-based schemes by an order of magnitude on various machine learning tasks. The paper also discusses the conditions under which the algorithm converges and presents a robust $1/k$ convergence rate for constant stepsize SGD schemes. The authors conclude by highlighting the practical benefits of HOGWILD! and suggesting future directions for improving parallel SGD algorithms.The paper "HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent" by Feng Niu, Benjamin Recht, Christopher Ré, and Stephen J. Wright introduces a novel parallelization scheme for Stochastic Gradient Descent (SGD) called HOGWILD!. This scheme allows processors to access shared memory without locking, which is a common requirement for parallel SGD implementations. The authors argue that when the optimization problem is sparse, meaning that most gradient updates only modify small parts of the decision variable, HOGWILD! achieves a nearly optimal rate of convergence. They provide theoretical guarantees and experimental results demonstrating that HOGWILD! outperforms alternative locking-based schemes by an order of magnitude on various machine learning tasks. The paper also discusses the conditions under which the algorithm converges and presents a robust $1/k$ convergence rate for constant stepsize SGD schemes. The authors conclude by highlighting the practical benefits of HOGWILD! and suggesting future directions for improving parallel SGD algorithms.