This paper compares two feature selection methods, concave minimization and support vector machines (SVMs), for finding a separating plane that discriminates between two point sets in n-dimensional feature space using as few of the n features as possible. The concave minimization approach minimizes a weighted sum of distances of misclassified points to two parallel bounding planes while also suppressing as many dimensions (features) as possible; it is solved by a successive linearization algorithm that efficiently finds a sparse solution with good generalization properties. The SVM approach, in addition to minimizing the weighted sum of distances of misclassified points, maximizes the distance between the two bounding planes.

For the SVM, the choice of norm matters: the variant based on the 1-norm of the plane's coefficient vector yields a feature selection method and is formulated as a linear program, whereas the 2-norm variant is a quadratic program and neither it nor the ∞-norm variant suppresses features.

Computational results on six public data sets show that the concave minimization approach selects fewer problem features than the SVM while the two methods achieve comparable 10-fold cross-validation correctness, and that the linear programming formulations are significantly faster than the quadratic programming SVM. The paper concludes that further research is needed to explore the benefits of dual norms for feature selection and to determine which norm is best suited to a given data set.
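The role of the 1-norm can be made concrete. A sketch of the 1-norm SVM in its usual linear-programming form, under notation assumed here rather than quoted from the paper ($A$ the $m \times n$ matrix of points, $D$ the diagonal matrix of $\pm 1$ labels, $e$ a vector of ones, $y$ the slack vector, and $\nu > 0$ the misclassification weight):

```latex
\min_{w,\,\gamma,\,y}\; \nu\, e^{\top} y \;+\; \lVert w \rVert_{1}
\qquad \text{s.t.} \qquad D(Aw - e\gamma) + y \ge e, \quad y \ge 0.
```

Splitting $\lVert w \rVert_1$ with auxiliary variables $v \ge |w|$ turns this into a linear program. The dual-norm connection is that the distance between the bounding planes $x^{\top}w = \gamma \pm 1$, measured in the ∞-norm, equals $2/\lVert w \rVert_1$; minimizing the 1-norm of $w$ therefore maximizes the ∞-norm margin, and every component of $w$ driven to zero drops the corresponding feature.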