2002 | OLIVIER CHAPELLE, VLADIMIR VAPNIK, OLIVIER BOUSQUET, SAYAN MUKHERJEE
This paper presents a method for automatically tuning multiple parameters in Support Vector Machines (SVMs) to improve generalization performance. The approach minimizes estimates of the generalization error with a gradient descent algorithm over the set of parameters. Traditional selection methods, such as exhaustive grid search over a discretized parameter space, become intractable when the number of parameters exceeds two. The proposed method is demonstrated with a large number of parameters (more than 100) and yields improved generalization performance.
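The overall procedure can be sketched as follows: choose hyperparameters θ (here log C and log γ of an RBF kernel) by gradient descent on a differentiable estimate of the generalization error. This is a hedged illustration, not the paper's method: the paper derives analytic gradients of error bounds, whereas the sketch below uses central finite differences on a smoothed validation error, with scikit-learn's `SVC` standing in for the SVM solver.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def smoothed_error(log_params, Xtr, ytr, Xva, yva, s=1.0):
    # Smoothed validation error: a sigmoid of the signed margin replaces
    # the non-differentiable 0-1 loss (smoothing of this kind is what
    # makes gradient descent over the estimate possible).
    C, gamma = np.exp(log_params)
    clf = SVC(C=C, gamma=gamma).fit(Xtr, ytr)
    margins = clf.decision_function(Xva) * np.where(yva == 1, 1.0, -1.0)
    return float(np.mean(1.0 / (1.0 + np.exp(margins / s))))

# Synthetic data; in the paper the estimates are computed on real benchmarks.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
Xtr, Xva, ytr, yva = train_test_split(X, y, test_size=0.5, random_state=0)

theta = np.log([1.0, 0.1])              # starting point: log C, log gamma
init_err = smoothed_error(theta, Xtr, ytr, Xva, yva)
best_err, best_theta = init_err, theta.copy()
lr, eps = 0.5, 0.1
for _ in range(10):
    grad = np.zeros_like(theta)
    for i in range(theta.size):         # central finite differences
        d = np.zeros_like(theta)
        d[i] = eps
        grad[i] = (smoothed_error(theta + d, Xtr, ytr, Xva, yva)
                   - smoothed_error(theta - d, Xtr, ytr, Xva, yva)) / (2 * eps)
    theta = theta - lr * grad           # gradient descent step
    err = smoothed_error(theta, Xtr, ytr, Xva, yva)
    if err < best_err:                  # keep the best parameters seen
        best_err, best_theta = err, theta.copy()
```

Working in log-parameter space keeps C and γ positive and makes a single step size reasonable across parameters of very different magnitudes.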
The paper discusses the problem of supervised learning and the role of SVMs in pattern recognition. It explains how SVMs work by mapping input vectors into a high-dimensional feature space and constructing an optimal separating hyperplane. The method of tuning parameters is based on the idea of maximizing the margin between classes and minimizing the generalization error. The paper also discusses various error estimates, including validation error and leave-one-out error, and how they can be smoothed for gradient descent optimization.
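The smoothing idea mentioned above can be made concrete: the 0-1 error counts negative signed margins y·f(x) and is piecewise constant, so a sigmoid surrogate is used instead, recovering the hard error as its sharpness parameter shrinks. A minimal sketch (the margin values and the parameter name `s` are illustrative, not from the paper):

```python
import numpy as np

def hard_error(margins):
    # 0-1 validation error: fraction of misclassified points,
    # i.e. of negative signed margins y_i * f(x_i)
    return float(np.mean(margins < 0))

def smoothed_error(margins, s=0.5):
    # Differentiable surrogate: a sigmoid of the signed margin.
    # As s -> 0 it approaches the step function of the 0-1 loss.
    z = np.clip(margins / s, -500.0, 500.0)   # guard against exp overflow
    return float(np.mean(1.0 / (1.0 + np.exp(z))))

m = np.array([2.0, 0.5, -0.1, -1.5])   # example signed margins
hard_error(m)                          # 0.5: two of four margins are negative
```

The surrogate is what makes gradient descent over the error estimate well defined.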
The paper introduces a framework for minimizing error estimates using gradient descent. It describes how to compute gradients of error estimates with respect to kernel parameters and presents experimental results on various databases. The experiments show that the proposed method is effective in finding optimal parameters for SVMs, particularly for kernel selection and feature selection. The method is demonstrated on a variety of tasks, including handwritten digit recognition, and shows that optimizing scaling factors leads to feature selection. The results indicate that the method is computationally efficient and effective in improving generalization performance.
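The link between scaling factors and feature selection can be illustrated with an anisotropic RBF kernel that gives each input dimension its own scale: driving a scale toward zero removes that feature from the kernel entirely. A hedged sketch (the function name `scaled_rbf` and the toy data are illustrative):

```python
import numpy as np

def scaled_rbf(X1, X2, scales):
    # Anisotropic RBF kernel:
    #   K(x, x') = exp(-sum_i scales_i^2 * (x_i - x'_i)^2)
    # Optimizing the per-feature scales by gradient descent on an error
    # estimate drives irrelevant features' scales toward zero, which is
    # how tuning scaling factors performs feature selection.
    d = (X1[:, None, :] - X2[None, :, :]) * scales   # per-feature scaling
    return np.exp(-np.sum(d ** 2, axis=-1))

X = np.array([[1.0, 5.0],
              [1.0, -3.0]])            # the points differ only in feature 1
scales = np.array([1.0, 0.0])          # feature 1 scaled away entirely
K = scaled_rbf(X, X, scales)           # kernel ignores feature 1
```

With the second scale at zero, the two points become indistinguishable to the kernel, so every entry of `K` equals 1: the feature has effectively been selected out.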