Vol. 109, No. 5, May 2021 | By BERNHARD SCHÖLKOPF, FRANCESCO LOCATELLO, STEFAN BAUER, NAN ROSEMARY KE, NAL KALCHBRENNER, ANIRUDH GOYAL, AND YOSHUA BENGIO
This article reviews fundamental concepts of causal inference and explores their relevance to key challenges in machine learning, such as transfer learning and generalization. It argues for the importance of causality in modern machine learning research, particularly in addressing current limitations in robustness, in the reuse of learned mechanisms across tasks, and in reasoning about how data are generated. The authors contrast statistical and causal models, emphasizing that causal models must capture the underlying physical mechanisms that generate the observed statistical dependencies. They introduce the Independent Causal Mechanisms (ICM) principle, which posits that a causal generative process is composed of autonomous modules that neither inform nor influence one another. From this principle they derive the sparse mechanism shift (SMS) hypothesis: distribution shifts, such as those arising from interventions, tend to change only one or a few of these mechanisms while leaving the rest intact. The article also reviews existing approaches to learning causal relations from data, ranging from classical methods to modern approaches based on deep neural networks. Finally, it discusses the implications of causality for practical machine learning tasks, including robustness, generalization, and the integration of causal and statistical learning.
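To make the ICM principle and the SMS hypothesis concrete, the sketch below restates them in the standard causal-factorization notation the article builds on; treat it as an illustrative summary rather than a quotation, with PA_i denoting the causal parents of X_i, as is conventional.

% Causal (disentangled) factorization: each conditional is one mechanism.
\[
  p(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} p\left(X_i \mid \mathrm{PA}_i\right)
\]
% ICM: the factors p(X_i | PA_i) neither inform nor influence one another,
% so each mechanism can, in principle, be changed independently of the rest.
% SMS: an intervention or distribution shift replaces only a few factors,
% e.g., a single mechanism j, leaving all the others intact:
\[
  \tilde{p}(X_1, \dots, X_n) \;=\; \tilde{p}\left(X_j \mid \mathrm{PA}_j\right) \prod_{i \neq j} p\left(X_i \mid \mathrm{PA}_i\right)
\]

In an arbitrary statistical (entangled) factorization, by contrast, the same shift would typically perturb many factors at once, which is why the causal factorization is the natural representation for transfer and generalization.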