21 February 2014 | Alexandre Abraham, Fabian Pedregosa, Michael Eickenberg, Philippe Gervais, Andreas Mueller, Jean Kossaifi, Alexandre Gramfort, Bertrand Thirion and Gaël Varoquaux
This paper explores the application of statistical machine learning methods, particularly those provided by the scikit-learn Python library, to neuroimaging data analysis. The authors aim to bridge the gap between machine learning and neuroimaging by demonstrating how scikit-learn can be used to perform key analysis steps. They highlight its versatility in handling high-dimensional datasets, such as activation images or resting-state time series, and its support for both supervised and unsupervised learning. Supervised learning is used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images or find sub-populations in large cohorts. The paper discusses various applications of statistical learning to common neuroimaging needs, detailing the corresponding code, the choice of methods, and the underlying assumptions. It also emphasizes the interpretability of results and the internal model of each method. The paper is organized into sections covering data preparation, decoding, encoding, and functional connectivity analysis. It also discusses the use of scikit-learn in combination with other Python libraries such as nilearn to facilitate neuroimaging analysis. The authors conclude that scikit-learn provides a versatile tool for studying the brain and that the integration of machine learning with neuroimaging data analysis can lead to new scientific advances.
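The decoding setting described in the summary, relating brain images to behavioral labels, can be sketched in a few lines of scikit-learn. This is a minimal illustration on synthetic data, not the authors' code: the helper names `make_synthetic_scans` and `decode`, the array sizes, and the choice of univariate feature selection followed by a linear SVM are all assumptions made for the example. Real data would first need to be masked and flattened into a samples-by-voxels array.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def make_synthetic_scans(n_samples=80, n_voxels=500, n_informative=20, seed=0):
    """Hypothetical stand-in for masked brain images: one row per scan,
    one column per voxel. Only the first `n_informative` voxels carry signal."""
    rng = np.random.RandomState(seed)
    y = np.repeat([0, 1], n_samples // 2)      # two experimental conditions
    X = rng.randn(n_samples, n_voxels)         # Gaussian noise in every voxel
    X[y == 1, :n_informative] += 1.5           # condition effect in a few voxels
    return X, y


def decode(X, y, k=50, cv=5):
    """Cross-validated accuracy of feature selection + linear SVM.
    Wrapping both steps in one pipeline means voxel selection is refit
    on each training fold, so the test fold never influences it."""
    pipeline = make_pipeline(SelectKBest(f_classif, k=k), SVC(kernel="linear"))
    return cross_val_score(pipeline, X, y, cv=cv)


X, y = make_synthetic_scans()
scores = decode(X, y)
print("mean cross-validated accuracy: %.2f" % scores.mean())
```

Keeping feature selection inside the pipeline is the main design choice here: selecting voxels on the full dataset before cross-validation would leak label information into the test folds and inflate the reported accuracy.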