January 30, 2024 | J.M. Gorriz, F. Segovia, J. Ramírez, A. Ortiz, John Suckling
The paper "Is K-fold Cross Validation the Best Model Selection Method for Machine Learning?" by J.M. Gorriz, E. Segovia, J. Ramirez, A. Ortiz, John Suckling, and others, explores the limitations of K-fold cross-validation (CV) in machine learning (ML) and proposes a novel statistical test, K-fold Cross Upper Bounding Validation (CUBV), to address these issues. The authors argue that while CV is widely used for model selection and performance evaluation, it often fails to provide accurate estimates of the true error, especially in small sample sizes and heterogeneous data. They demonstrate that CV can lead to inflated false positive rates and underestimation of the actual error due to the violation of ergodicity assumptions.
To address these problems, the paper introduces CUBV, which pairs K-fold CV with a statistical test that upper-bounds the actual error. The method uses concentration inequalities to bound the deviation between the empirical risk and the actual risk, yielding a more robust estimate of classifier performance. The authors validate CUBV on simulations and real neuroimaging datasets, showing that it controls false positives effectively and offers better detection power than traditional CV methods.
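To make the idea concrete, here is a hedged sketch of a CUBV-style decision rule under simplifying assumptions: the empirical risk comes from ordinary K-fold CV, and the deviation term is a textbook Hoeffding bound rather than the specific concentration inequality derived in the paper. The function `cubv_style_test` and its parameters are illustrative, not the authors' code:

```python
# CUBV-style rule (sketch): reject the null of chance-level performance
# only if empirical risk + concentration bound stays below chance (0.5).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def cubv_style_test(X, y, k=5, delta=0.05):
    """Return the upper-bounded actual error and whether it beats chance."""
    n = len(y)
    acc = cross_val_score(SVC(kernel="linear"), X, y, cv=k).mean()
    emp_risk = 1.0 - acc                        # empirical (K-fold CV) error
    # Hoeffding-style deviation term, valid with probability >= 1 - delta;
    # the paper's actual concentration bound may differ from this one.
    eps = np.sqrt(np.log(2.0 / delta) / (2.0 * n))
    upper_risk = emp_risk + eps                 # upper bound on actual risk
    return upper_risk, upper_risk < 0.5

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 100))              # null data: no class signal
y = rng.permutation(np.repeat([0, 1], 15))
bound, significant = cubv_style_test(X, y)
print(f"upper-bounded error = {bound:.2f}, reject null = {significant}")
```

Note how the bound shrinks as the sample size grows: on small samples the penalty term is large, so a lucky CV score alone is not enough to declare an effect, which is the false-positive control the method is after.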
The paper concludes that CUBV is a more reliable approach for statistical inference in ML, particularly in neuroimaging and other fields where data heterogeneity and small sample sizes are common. The proposed method provides a more accurate assessment of model performance and helps avoid the pitfalls of overfitting and false positives.