This paper presents a controlled study to evaluate the robustness and performance of five text categorization methods: Support Vector Machines (SVM), k-Nearest Neighbor (kNN), Neural Network (NNet), Linear Least-squares Fit (LLSF), and Naïve Bayes (NB). The study focuses on how these methods handle skewed category distributions and their performance as a function of training-set category frequency. Key findings include:
1. **Statistical Significance Tests**: The authors propose a suite of statistical significance tests (sign test, t-test, and proportion test, i.e., p-test) to compare the methods, applied at both the micro level (binary decisions pooled over all category-document pairs) and the macro level (scores computed per category, then averaged).
2. **Performance Analysis**:
- **Micro-Level**: SVM and kNN significantly outperform LLSF, NNet, and NB.
- **Macro-Level**: SVM, kNN, and LLSF perform significantly better than NB and NNet.
3. **Category Frequency Impact**:
   - **Rare Categories**: When a category has few positive training instances (fewer than ten), SVM, kNN, and LLSF outperform NNet and NB.
   - **Common Categories**: When a category is sufficiently common (more than 300 positive training instances), all five methods perform comparably.
4. **Conclusion**:
- SVM, kNN, and LLSF are robust and perform well across different category frequencies.
- NB and NNet underperform, especially in the presence of rare categories.
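The micro/macro distinction above explains why the rankings differ: micro-averaging pools every decision, so frequent categories dominate, while macro-averaging weights every category equally, so rare categories matter as much as common ones. A minimal sketch in plain Python (the per-category contingency counts are made up for illustration, not taken from the paper):

```python
def f1(tp, fp, fn):
    # F1 = 2*tp / (2*tp + fp + fn); conventionally 0 when the denominator is 0
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_macro_f1(per_category_counts):
    """per_category_counts: list of (tp, fp, fn) tuples, one per category."""
    # Micro-averaging: pool the contingency counts over all categories first,
    # then compute a single F1. Frequent categories dominate this score.
    tp = sum(c[0] for c in per_category_counts)
    fp = sum(c[1] for c in per_category_counts)
    fn = sum(c[2] for c in per_category_counts)
    micro = f1(tp, fp, fn)
    # Macro-averaging: compute F1 per category, then take the unweighted mean.
    # Rare categories count exactly as much as common ones.
    macro = sum(f1(*c) for c in per_category_counts) / len(per_category_counts)
    return micro, macro

# Hypothetical skewed distribution: one frequent category classified well,
# three rare categories classified poorly.
counts = [(300, 20, 10), (1, 2, 5), (0, 1, 4), (2, 3, 6)]
micro, macro = micro_macro_f1(counts)
# micro ≈ 0.922, macro ≈ 0.371: the same predictions look strong under
# micro-averaging but weak under macro-averaging.
```

A classifier that neglects rare categories can therefore score highly on micro-F1 while scoring poorly on macro-F1, which is exactly the gap the study's macro-level comparison exposes for NB and NNet.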
The study provides a comprehensive comparison and highlights the strengths and weaknesses of each method, offering insights into their effectiveness in handling skewed category distributions.
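The sign test used at the micro level can be sketched as follows. It looks only at pooled decisions where the two classifiers disagree; under the null hypothesis of no real difference, each system "wins" such a disagreement with probability 0.5, so the win count follows a Binomial(n, 0.5) distribution. The correctness lists below are invented for illustration and are not data from the paper:

```python
from math import comb

def sign_test(a_correct, b_correct):
    """Two-sided sign test on paired binary outcomes.

    a_correct, b_correct: equal-length lists of booleans recording whether
    system A / system B got each pooled categorization decision right.
    Ties (both right or both wrong) carry no information and are dropped.
    """
    wins_a = sum(1 for a, b in zip(a_correct, b_correct) if a and not b)
    wins_b = sum(1 for a, b in zip(a_correct, b_correct) if b and not a)
    n = wins_a + wins_b  # number of discordant pairs
    if n == 0:
        return 1.0
    k = max(wins_a, wins_b)
    # P(X >= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Illustrative pooled decisions: A wins 18 disagreements, B wins 2.
a = [True] * 18 + [False] * 2
b = [False] * 18 + [True] * 2
p = sign_test(a, b)  # p ≈ 0.0004, so the difference is significant
```

Because ties are discarded, the test is sensitive only to where the systems actually differ, which makes it a conservative but assumption-light way to compare classifiers on the same test set.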