21 Nov 2023 | Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, Felix A. Wichmann
Deep learning has become a cornerstone of artificial intelligence, but its limitations are increasingly apparent. This perspective identifies and addresses the underlying issue of *shortcut learning*, where models perform well on standard benchmarks but fail in more challenging real-world scenarios. Shortcuts are decision rules that exploit superficial correlations in the training data, leading to poor generalization and unexpected failures. The authors draw parallels between shortcut learning in artificial and biological neural networks, highlighting analogous issues known from comparative psychology, education, and linguistics. They propose a taxonomy of decision rules, emphasizing the distinction between intended solutions and shortcut solutions. The paper discusses the origins of shortcuts, including dataset biases and the way discriminative features are combined, and explores their impact on deep learning applications such as computer vision, natural language processing, agent-based learning, and fairness. To diagnose and understand shortcut learning, the authors recommend careful interpretation of results, testing on out-of-distribution (o.o.d.) data, and designing models that are robust to shortcut opportunities. The paper concludes by outlining research directions for overcoming shortcut learning, including the use of domain-specific prior knowledge and insights from adversarial examples.
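
To make the recommended o.o.d. diagnosis concrete, here is a minimal, hypothetical sketch (not code from the paper): a linear classifier is trained on synthetic data containing a weak "intended" feature and a spurious "shortcut" feature that is perfectly correlated with the label only in the training distribution. The model scores highly on i.i.d. test data, but its accuracy drops toward chance once the spurious correlation is broken.

```python
# Toy illustration of shortcut learning and o.o.d. evaluation.
# Assumptions: synthetic data, feature names, and noise scales are invented
# for illustration; they do not come from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, shortcut_correlated):
    y = rng.integers(0, 2, size=n)
    # "Intended" feature: weakly predictive of the label (noisy signal).
    signal = y + rng.normal(scale=2.0, size=n)
    # "Shortcut" feature: almost perfectly predictive during training,
    # pure noise once the spurious correlation is removed.
    if shortcut_correlated:
        shortcut = y + rng.normal(scale=0.1, size=n)
    else:
        shortcut = rng.normal(scale=1.0, size=n)
    return np.column_stack([signal, shortcut]), y

X_train, y_train = make_data(5000, shortcut_correlated=True)
X_iid, y_iid = make_data(1000, shortcut_correlated=True)    # same distribution
X_ood, y_ood = make_data(1000, shortcut_correlated=False)   # correlation broken

clf = LogisticRegression().fit(X_train, y_train)
print("i.i.d. accuracy:", clf.score(X_iid, y_iid))  # high: the shortcut still works
print("o.o.d. accuracy:", clf.score(X_ood, y_ood))  # drops toward chance: the shortcut no longer transfers
```

The gap between the two scores is the diagnostic signal: benchmark (i.i.d.) accuracy alone cannot distinguish the intended solution from a shortcut, whereas a test set that breaks the spurious correlation exposes it.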