Activation Functions: Comparison of Trends in Practice and Research for Deep Learning

8 Nov 2018 | Chigozie Enyinna Nwankpa, Winifred Ijomah, Anthony Gachagan, and Stephen Marshall
This paper provides a comprehensive survey of activation functions (AFs) used in deep learning applications, highlighting recent trends in their usage. It compiles various AFs, including Sigmoid, Hyperbolic Tangent, Softmax, Softsign, ReLU and its variants (LReLU, PReLU, RReLU, SReLU), Exponential Linear Units (ELUs) and their variants (PELU, SELU), Maxout, Swish, and ELiSH. The paper discusses the evolution of these functions, their applications, and the challenges they address, such as vanishing and exploding gradients. It also compares the performance of different AFs in practical deep learning deployments against state-of-the-art research results. The findings suggest that while newer AFs like ELUs and their variants show promise, ReLU remains the most widely used and effective activation function in deep learning, even though some newer functions outperform it on specific tasks. The paper concludes by emphasizing the importance of choosing the most suitable AF for a given application and the need for further research to explore the potential of newer AFs.
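For reference, the activation functions compiled in the survey have well-known closed forms. The sketch below is not taken from the paper; it is a minimal NumPy illustration using the standard textbook definitions, with commonly used default constants assumed (e.g. alpha = 0.01 for LReLU, alpha = 1.0 for ELU, and the fixed SELU scaling constants).

```python
import numpy as np

def sigmoid(x):
    # Sigmoid: squashes inputs to (0, 1); saturates (and gradients vanish) for large |x|.
    return 1.0 / (1.0 + np.exp(-x))

def softsign(x):
    # Softsign: polynomial alternative to tanh, range (-1, 1).
    return x / (1.0 + np.abs(x))

def softmax(x):
    # Softmax: maps a vector of logits to a probability distribution.
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / np.sum(e)

def relu(x):
    # ReLU: zero for negative inputs, identity for positive inputs.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # LReLU: small fixed slope alpha on the negative side instead of zero.
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # ELU: smooth exponential saturation for negative inputs.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def selu(x, lam=1.0507, alpha=1.6733):
    # SELU: scaled ELU with fixed constants chosen for self-normalising networks.
    return lam * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def swish(x):
    # Swish: x * sigmoid(x), a smooth, non-monotonic alternative to ReLU.
    return x * sigmoid(x)

def elish(x):
    # ELiSH: ELU-style negative branch combined with the sigmoid gating of Swish.
    return np.where(x >= 0, x * sigmoid(x), (np.exp(x) - 1.0) * sigmoid(x))

if __name__ == "__main__":
    z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    for fn in (sigmoid, np.tanh, softsign, relu, leaky_relu, elu, selu, swish, elish):
        print(fn.__name__, np.round(fn(z), 4))
    print("softmax", np.round(softmax(z), 4))
```

Parametric and randomized variants discussed in the paper (PReLU, RReLU, SReLU, PELU, Maxout) follow the same pattern but make the slopes or thresholds learnable or sampled per unit, so they are omitted from this sketch.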