A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning

November 13, 1990 | Martin F. Møller
This paper introduces a supervised learning algorithm, Scaled Conjugate Gradient (SCG), with a superlinear convergence rate. SCG is based on the conjugate gradient methods well known from numerical analysis. Unlike standard backpropagation (BP), SCG uses second-order information from the neural network, yet requires only O(N) memory, where N is the number of weights. In the reported benchmarks, SCG converges at least an order of magnitude faster than BP, CGB, and BFGS. The algorithm is fully automated: it has no user-dependent parameters and avoids the time-consuming line search that other conjugate gradient methods rely on. It also handles the ravine phenomena in weight space that are common in neural network error surfaces.
SCG is tested on several problems, including the parity problem, the logistic map, and the =1 problem, and shows superior performance compared to BP. Each iteration costs roughly 3N² operations, making the algorithm suitable for large-scale problems. Its handling of ravine phenomena and its automated nature make SCG a more effective learning algorithm than BP, and the paper concludes that SCG is a promising alternative to BP for supervised learning in neural networks.
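The key idea summarized above — replacing the line search with a Levenberg-Marquardt-style scaling term λ|p|² added to a finite-difference curvature estimate — can be sketched in a few dozen lines. The following is a reconstruction based on common presentations of Møller's published pseudocode, not the paper's reference implementation; the function names (`scg`, `f`, `grad`) and the defaults (σ = 10⁻⁴, initial λ = 10⁻⁶, the 0.25/0.75 trust thresholds) are illustrative choices:

```python
import numpy as np

def scg(f, grad, w, sigma0=1e-4, lam=1e-6, max_iter=200, tol=1e-8):
    """Scaled Conjugate Gradient sketch (after Møller's pseudocode).

    f, grad : objective and its gradient; w : initial weight vector.
    Hessian-vector products are approximated by a finite difference of
    gradients, so only O(N) memory is needed -- no explicit Hessian.
    """
    N = len(w)
    lam_bar = 0.0
    r = -grad(w)              # negative gradient (residual)
    p = r.copy()              # initial search direction
    success = True
    for k in range(1, max_iter + 1):
        if r @ r < tol:       # converged
            break
        p_norm2 = p @ p
        if success:           # recompute curvature only after a good step
            sigma = sigma0 / np.sqrt(p_norm2)
            s = (grad(w + sigma * p) - grad(w)) / sigma   # ~ H p
            delta = p @ s
        # scale the curvature estimate instead of doing a line search
        delta += (lam - lam_bar) * p_norm2
        if delta <= 0:        # force the scaled Hessian positive definite
            lam_bar = 2.0 * (lam - delta / p_norm2)
            delta = -delta + lam * p_norm2
            lam = lam_bar
        mu = p @ r
        alpha = mu / delta    # step size from the quadratic model
        # comparison parameter: quality of the quadratic approximation
        Delta = 2.0 * delta * (f(w) - f(w + alpha * p)) / mu**2
        if Delta >= 0:        # error was reduced: accept the step
            w = w + alpha * p
            r_new = -grad(w)
            lam_bar, success = 0.0, True
            if k % N == 0:    # periodic restart in steepest-descent direction
                p = r_new.copy()
            else:             # Polak-Ribiere-style conjugate direction
                beta = (r_new @ r_new - r_new @ r) / mu
                p = r_new + beta * p
            r = r_new
            if Delta >= 0.75:
                lam *= 0.25   # model is good: reduce the scaling
        else:
            lam_bar, success = lam, False
        if Delta < 0.25:      # model is poor: increase the scaling
            lam += delta * (1.0 - Delta) / p_norm2
    return w
```

On a quadratic objective the finite-difference product is exact, so the sketch behaves like classical conjugate gradients with a small damping term; on a neural network error surface the λ adjustments take over the role of the line search, which is where the per-iteration savings come from.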