Energy and Policy Considerations for Deep Learning in NLP

5 Jun 2019 | Emma Strubell, Ananya Ganesh, Andrew McCallum
The paper "Energy and Policy Considerations for Deep Learning in NLP" by Emma Strubell, Ananya Ganesh, and Andrew McCallum from the University of Massachusetts Amherst addresses the significant computational and environmental costs associated with training large neural networks in Natural Language Processing (NLP). The authors highlight that recent advancements in hardware and methodology have led to the development of highly accurate models, but these models require substantial computational resources and energy, which are both financially and environmentally costly. They quantify the financial and environmental costs of training various NLP models, including Transformer, ELMo, BERT, and GPT-2, and estimate the carbon emissions and electricity costs. The paper also presents a case study on the computational resources required to develop the Linguistically-Informed Self-Attention (LISA) model, which took 172 days and involved extensive hyperparameter tuning. Based on these findings, the authors propose recommendations for reducing costs and improving equity in NLP research and practice, such as reporting training times and hyperparameter sensitivity, ensuring equitable access to computational resources, and prioritizing computationally efficient hardware and algorithms.The paper "Energy and Policy Considerations for Deep Learning in NLP" by Emma Strubell, Ananya Ganesh, and Andrew McCallum from the University of Massachusetts Amherst addresses the significant computational and environmental costs associated with training large neural networks in Natural Language Processing (NLP). The authors highlight that recent advancements in hardware and methodology have led to the development of highly accurate models, but these models require substantial computational resources and energy, which are both financially and environmentally costly. They quantify the financial and environmental costs of training various NLP models, including Transformer, ELMo, BERT, and GPT-2, and estimate the carbon emissions and electricity costs. The paper also presents a case study on the computational resources required to develop the Linguistically-Informed Self-Attention (LISA) model, which took 172 days and involved extensive hyperparameter tuning. Based on these findings, the authors propose recommendations for reducing costs and improving equity in NLP research and practice, such as reporting training times and hyperparameter sensitivity, ensuring equitable access to computational resources, and prioritizing computationally efficient hardware and algorithms.