MINIMIZING ENERGY COSTS IN DEEP LEARNING MODEL TRAINING: THE GAUSSIAN SAMPLING APPROACH

11 Jun 2024 | Challapalli Phanindra Revanth, Sumohana S Channappayya, C Krishna Mohan
This paper addresses the significant energy consumption associated with backpropagation in deep learning (DL) model training. The authors propose a novel approach called *GradSamp* that efficiently computes gradients by sampling them from a Gaussian distribution. *GradSamp* updates model parameters at specific epochs by perturbing the previous epoch's parameters with Gaussian noise, whose statistics are estimated from the error between the two previous epochs' parameter values. This method not only streamlines gradient computation but also enables skipping entire epochs, enhancing overall efficiency. The authors validate their hypothesis across various standard and non-standard CNN and transformer-based models, on tasks such as image classification, object detection, and image segmentation. Additionally, they explore the efficacy of *GradSamp* in out-of-distribution scenarios like Domain Adaptation (DA) and Domain Generalization (DG), and in decentralized settings like Federated Learning (FL). Experimental results demonstrate that *GradSamp* achieves notable energy savings without compromising performance, highlighting its versatility and potential impact in practical DL applications.
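
To make the skipped-epoch update concrete, here is a minimal PyTorch-style sketch of the idea as described in the abstract: at an epoch where backpropagation is skipped, Gaussian noise is fitted to the difference between the two most recent epochs' parameters and used to perturb the latest parameters. The function name `gradsamp_update` and the per-tensor Gaussian fit are illustrative assumptions, not the authors' implementation.

```python
import torch

def gradsamp_update(params_prev, params_prev2):
    """Sketch of a GradSamp-style update for a skipped epoch.

    params_prev:  list of parameter tensors from epoch t-1
    params_prev2: list of parameter tensors from epoch t-2
    Returns a list of new parameter tensors for epoch t, produced without
    backpropagation by perturbing params_prev with sampled Gaussian noise.
    """
    new_params = []
    for p_prev, p_prev2 in zip(params_prev, params_prev2):
        delta = p_prev - p_prev2                      # epoch-to-epoch parameter change
        mu, sigma = delta.mean(), delta.std()         # fit a Gaussian to that change (assumed per-tensor fit)
        noise = torch.randn_like(p_prev) * sigma + mu # sample a gradient-like perturbation
        new_params.append(p_prev + noise)             # update without computing true gradients
    return new_params
```

In a training loop, such an update could replace the forward/backward pass at the designated epochs, with regular backpropagation used at all other epochs so that fresh parameter differences are available for the next Gaussian estimate.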