February 23, 2000 | Pierre Baldi, Søren Brunak, Yves Chauvin, Claus A. F. Andersen and Henrik Nielsen
This paper provides an overview of methods for assessing the accuracy of prediction algorithms in classification tasks. It discusses various performance measures, including percentages, Hamming distance, quadratic distance, correlation coefficient, relative entropy, and mutual information. The paper emphasizes the importance of selecting appropriate measures based on the problem at hand and highlights the advantages and disadvantages of each approach. For classification tasks, the authors derive new learning algorithms that optimize the correlation coefficient, leading to improved prediction performance. They also discuss the relationship between sensitivity and specificity in optimal systems. The paper illustrates these concepts using examples from protein secondary structure prediction and signal peptide prediction. It concludes that while several measures exist, the correlation coefficient and mutual information coefficient provide a more balanced evaluation of prediction accuracy. The paper also addresses the challenges of evaluating prediction performance in multi-class problems and discusses the importance of probabilistic models in learning algorithms. The authors emphasize the need for careful selection of performance measures and the impact of these choices on the learning process.
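To make the contrast between these measures concrete, the following is a minimal Python sketch for the two-class case. It is an illustration under stated assumptions, not the authors' implementation: the function names, the (TP, TN, FP, FN) tuple order, and the convention of returning 0 when the correlation coefficient's denominator vanishes are all choices made here. The correlation coefficient is computed in its standard Matthews form, and mutual information is measured in bits between the actual and predicted labels.

```python
import math


def confusion_counts(actual, predicted):
    """Return (TP, TN, FP, FN) for two equal-length sequences of 0/1 labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def percent_correct(tp, tn, fp, fn):
    """Percentage of positions predicted correctly."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def hamming_distance(tp, tn, fp, fn):
    """Number of positions where prediction and truth disagree."""
    return fp + fn


def correlation_coefficient(tp, tn, fp, fn):
    """Matthews correlation coefficient for a 2x2 confusion table."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumption of this sketch: report 0 when a row or column is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def mutual_information(tp, tn, fp, fn):
    """Mutual information (in bits) between actual and predicted labels."""
    n = tp + tn + fp + fn
    # Keys are (actual, predicted) label pairs.
    joint = {(1, 1): tp / n, (0, 0): tn / n, (0, 1): fp / n, (1, 0): fn / n}
    p_actual = {1: (tp + fn) / n, 0: (tn + fp) / n}
    p_pred = {1: (tp + fp) / n, 0: (tn + fn) / n}
    mi = 0.0
    for (a, p), p_joint in joint.items():
        if p_joint > 0:  # terms with zero joint probability contribute nothing
            mi += p_joint * math.log2(p_joint / (p_actual[a] * p_pred[p]))
    return mi


# Hypothetical imbalanced data: one positive among ten examples,
# scored against a predictor that always answers "negative".
actual = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10
counts = confusion_counts(actual, always_negative)
print(percent_correct(*counts))
print(correlation_coefficient(*counts))
print(mutual_information(*counts))
```

On this made-up data the trivial predictor is right on nine of ten positions, yet its correlation coefficient and mutual information are both zero because its output carries no information about the true class. This is one way to read the claim that these two measures give a more balanced evaluation than raw percentages.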