23 February 2006 | Sudhir Varma and Richard Simon
This article by Sudhir Varma and Richard Simon evaluates the bias in error estimation when cross-validation (CV) is used both to tune a classifier and to estimate its error. They focus on two classifiers: Shrunken Centroids and Support Vector Machines (SVM). The authors generate random "null" training datasets with no differential expression between the two classes, then tune classifier parameters by minimizing the CV error. They find that the CV error estimate for the tuned classifier is strongly optimistically biased, substantially underestimating the true error on independent test data: the CV error estimate was below 30% on 18.5% of simulated null datasets for Shrunken Centroids and on 38% for SVM, even though the performance of these tuned classifiers on independent test sets was no better than chance.
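The biased procedure can be sketched as follows. This is a minimal illustration, not the authors' code: the nearest-centroid classifier with simple feature selection is a toy stand-in for Shrunken Centroids, and the sample sizes, feature counts, and parameter grid are all illustrative assumptions. The key point is that the same CV errors used to pick the tuning parameter are then reported as the error estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

def null_data(n, p, rng):
    """Pure-noise features with balanced random labels: there is no true
    signal, so any classifier's true error rate is 50%."""
    X = rng.standard_normal((n, p))
    y = rng.permutation(np.repeat([0, 1], n // 2))
    return X, y

def fit(X, y, k):
    """Toy stand-in for Shrunken Centroids: keep the k features with the
    largest class-mean difference and store the per-class centroids."""
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    feats = np.argsort(-np.abs(m0 - m1))[:k]
    return feats, m0[feats], m1[feats]

def predict(model, X):
    feats, c0, c1 = model
    d0 = ((X[:, feats] - c0) ** 2).sum(axis=1)
    d1 = ((X[:, feats] - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

def cv_error(X, y, k, folds=5):
    """Ordinary k-fold CV error for one fixed tuning parameter."""
    errs = []
    for f in np.array_split(np.arange(len(y)), folds):
        model = fit(np.delete(X, f, axis=0), np.delete(y, f), k)
        errs.append((predict(model, X[f]) != y[f]).mean())
    return float(np.mean(errs))

ks = [1, 5, 10, 25, 50]                  # illustrative parameter grid
biased_ests, true_errs = [], []
for _ in range(20):                      # 20 simulated "null" datasets
    X, y = null_data(40, 200, rng)
    errs = {k: cv_error(X, y, k) for k in ks}
    k_best = min(errs, key=errs.get)     # tune k by minimizing CV error...
    biased_ests.append(errs[k_best])     # ...then report that same CV error
    Xt, yt = null_data(1000, 200, rng)   # independent test set, same null model
    true_errs.append((predict(fit(X, y, k_best), Xt) != yt).mean())

print(f"mean CV estimate after tuning:  {np.mean(biased_ests):.3f}")
print(f"mean error on independent data: {np.mean(true_errs):.3f}")
```

Averaged over the simulated null datasets, the tuned CV estimate comes out well below 50%, while the error on independent data stays at chance, reproducing the qualitative effect the paper reports.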
To address this bias, the authors propose a nested CV procedure in which an inner CV loop tunes the parameters while an outer CV loop estimates the error. This nested approach removes most of the bias, yielding an almost unbiased estimate of the true error for both Shrunken Centroids and SVM classifiers, on both "null" and "non-null" data distributions.
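The nested procedure can be sketched as follows, again using a toy nearest-centroid classifier as a hypothetical stand-in for Shrunken Centroids (all names, sizes, and the parameter grid are illustrative assumptions). The essential structure is that the inner CV loop sees only the outer-training samples, so the held-out outer fold never influences parameter selection.

```python
import numpy as np

rng = np.random.default_rng(1)

def null_data(n, p, rng):
    """Pure-noise features with balanced random labels (true error 50%)."""
    X = rng.standard_normal((n, p))
    y = rng.permutation(np.repeat([0, 1], n // 2))
    return X, y

def fit(X, y, k):
    """Toy nearest-centroid classifier on the k most separated features."""
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    feats = np.argsort(-np.abs(m0 - m1))[:k]
    return feats, m0[feats], m1[feats]

def predict(model, X):
    feats, c0, c1 = model
    d0 = ((X[:, feats] - c0) ** 2).sum(axis=1)
    d1 = ((X[:, feats] - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

def cv_error(X, y, k, folds=5):
    """Ordinary k-fold CV error for one fixed tuning parameter."""
    errs = []
    for f in np.array_split(np.arange(len(y)), folds):
        model = fit(np.delete(X, f, axis=0), np.delete(y, f), k)
        errs.append((predict(model, X[f]) != y[f]).mean())
    return float(np.mean(errs))

def nested_cv_error(X, y, ks, folds=5):
    """Outer loop estimates the error; the inner loop, run only on the
    outer-training samples, tunes k. Tuning is thus repeated from scratch
    inside every outer fold, as the paper prescribes."""
    errs = []
    for f in np.array_split(np.arange(len(y)), folds):
        Xtr, ytr = np.delete(X, f, axis=0), np.delete(y, f)
        k_best = min(ks, key=lambda k: cv_error(Xtr, ytr, k, folds))
        errs.append((predict(fit(Xtr, ytr, k_best), X[f]) != y[f]).mean())
    return float(np.mean(errs))

ks = [1, 5, 10, 25, 50]                  # illustrative parameter grid
ests = []
for _ in range(20):                      # 20 simulated "null" datasets
    X, y = null_data(40, 200, rng)
    ests.append(nested_cv_error(X, y, ks))
print(f"mean nested-CV estimate on null data: {np.mean(ests):.3f}")
```

On null data the nested estimate averages close to the 50% chance level, consistent with the near-unbiasedness the authors report, at the cost of one extra CV layer of computation.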
The conclusion emphasizes that using CV to compute error estimates for classifiers optimized using CV can lead to biased estimates of the true error. Proper use of CV requires that all steps of the algorithm, including parameter tuning, be repeated within each CV loop. The nested CV procedure is recommended for obtaining unbiased error estimates.