This paper introduces cyclical learning rates (CLR) as a method for training deep neural networks. Unlike traditional approaches that use a fixed or monotonically decreasing learning rate, CLR lets the learning rate vary cyclically between a minimum and maximum bound. This practically eliminates the need for extensive experimentation to find the best learning rate and schedule, and it often yields improved classification accuracy in fewer iterations. The paper also describes a simple way to estimate reasonable bounds for the learning rate by increasing it linearly over a few epochs.
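A minimal sketch of such a policy is shown below. It follows the triangular form (a linear ramp up to the maximum and back down); the base_lr, max_lr, and step_size values are illustrative placeholders, not recommendations for any particular network.

```python
import numpy as np

def triangular_clr(iteration, base_lr=1e-3, max_lr=6e-3, step_size=2000):
    """Triangular cyclical learning rate.

    base_lr / max_lr bound the cycle and step_size is the half-cycle
    length in iterations (placeholder values, not recommendations).
    """
    cycle = np.floor(1 + iteration / (2 * step_size))
    x = np.abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

# The rate climbs from base_lr to max_lr over step_size iterations,
# then falls back to base_lr, and the cycle repeats.
for it in (0, 1000, 2000, 3000, 4000):
    print(it, round(triangular_clr(it), 5))
```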
The CLR method is demonstrated on several datasets and architectures, including CIFAR-10, CIFAR-100, and ImageNet, using ResNets, Stochastic Depth networks, DenseNets, AlexNet, and GoogLeNet. The results show that CLR achieves near-optimal classification accuracy, often with fewer iterations than traditional methods. The method is computationally efficient and can be combined with adaptive learning rate methods.
The paper also discusses how to estimate the minimum and maximum learning rate boundaries using an "LR range test," in which the learning rate is increased linearly over a few epochs while training progress is monitored; the results of this test are used to set the bounds for the CLR policy. The paper concludes that CLR is a practical and effective method for training neural networks, offering improvements in performance while reducing the need for manual tuning of learning rates.
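As a rough illustration of what such a test can look like in practice, the sketch below assumes a PyTorch-style model, optimizer, loss function, and data loader; the start/end rates and iteration count are arbitrary placeholders. The loss is recorded at each step so it can be plotted against the learning rate, with the bound estimates taken from where the loss begins to fall and where it stops improving or diverges.

```python
def lr_range_test(model, optimizer, loss_fn, data_loader,
                  start_lr=1e-5, end_lr=1.0, num_iters=1000):
    """Increase the learning rate linearly each step and record the loss.

    Assumes a PyTorch-style API (optimizer.param_groups, loss.backward());
    all default values here are placeholders for illustration.
    """
    history = []
    lr_step = (end_lr - start_lr) / num_iters
    lr = start_lr
    data_iter = iter(data_loader)
    for _ in range(num_iters):
        try:
            inputs, targets = next(data_iter)
        except StopIteration:                 # restart the loader if it runs out
            data_iter = iter(data_loader)
            inputs, targets = next(data_iter)
        for group in optimizer.param_groups:  # apply the current learning rate
            group["lr"] = lr
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        history.append((lr, loss.item()))
        lr += lr_step
    return history  # plot loss vs. lr to choose base_lr and max_lr
```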