A Support Vector Method for Multivariate Performance Measures


2005 | Thorsten Joachims
This paper introduces a Support Vector Method for optimizing multivariate nonlinear performance measures such as the F1-score. It addresses the challenge of directly optimizing measures like ROCArea, the Precision/Recall Breakeven Point (PRBEP), and Precision at k (Prec@k), which are non-linear, depend on the entire dataset, and cannot be decomposed into losses over individual examples. Conventional SVMs optimize error rate, which may not align with such application-specific measures.

The proposed method formulates learning as a multivariate prediction task in which the model labels all examples in the dataset simultaneously, and trains it with a sparse approximation algorithm for structured SVMs. The approach is computationally tractable and applies to any performance measure that can be computed from the contingency table; the conventional SVM is recovered as the special case in which the measure is error rate.

Experiments on text classification show that the method outperforms conventional SVMs on tasks with highly imbalanced classes, with significant gains for F1-score and consistent improvements for measures such as ROCArea and PRBEP, while avoiding the need for heuristic adjustments.
The results indicate that the new method is effective in handling complex performance measures that are not easily optimized by traditional learning algorithms.
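To make the contingency-table idea concrete, the following sketch (not the paper's implementation; function names and the toy data are illustrative) shows the loss-augmented argmax that such a training algorithm needs for F1: because the loss depends only on the contingency table, the most violated labeling assigns +1 to the highest-scoring examples within each true class, so it suffices to enumerate the counts (a, b) of positives and negatives predicted positive. Here each score s_i stands in for a linear discriminant value w·x_i.

```python
def f1_loss(tp, fp, fn):
    """Loss Delta = 1 - F1, computed from contingency-table counts."""
    if tp == 0:
        return 1.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 1.0 - 2.0 * precision * recall / (precision + recall)


def most_violated_labeling(scores, y):
    """Maximize Delta(ybar, y) + sum_i ybar_i * scores[i] over labelings ybar.

    Since the loss depends only on the contingency table, the optimal
    labeling gives +1 to the top-scoring examples within each true class,
    so we only enumerate (a, b): positives/negatives predicted positive.
    Returns ((a, b), objective_value).
    """
    pos = sorted((s for s, t in zip(scores, y) if t == +1), reverse=True)
    neg = sorted((s for s, t in zip(scores, y) if t == -1), reverse=True)

    def prefix_sums(xs):
        out = [0.0]
        for s in xs:
            out.append(out[-1] + s)
        return out

    ppos, pneg = prefix_sums(pos), prefix_sums(neg)
    n_pos, n_neg = len(pos), len(neg)
    best, best_val = None, float("-inf")
    for a in range(n_pos + 1):        # positives labeled +1 (true positives)
        for b in range(n_neg + 1):    # negatives labeled +1 (false positives)
            # sum_i ybar_i * s_i: the top-a positives and top-b negatives
            # get +1, everything else gets -1
            disc = (2.0 * ppos[a] - ppos[n_pos]) + (2.0 * pneg[b] - pneg[n_neg])
            val = f1_loss(tp=a, fp=b, fn=n_pos - a) + disc
            if val > best_val:
                best_val, best = val, (a, b)
    return best, best_val
```

On a toy problem where the scores already separate the classes with a wide margin, the most violated labeling coincides with the true one, e.g. `most_violated_labeling([2.0, 1.0, -1.0, -2.0], [1, 1, -1, -1])` selects (a, b) = (2, 0). Swapping `f1_loss` for any other function of (tp, fp, fn) adapts the same search to a different contingency-table measure.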