5 Nov 2019 | Candice Bentéjac, Anna Csörgő, Gonzalo Martínez-Muñoz
This paper presents a comparative analysis of XGBoost, a scalable ensemble technique based on gradient boosting, in terms of training speed, generalization performance, and parameter setup. The study also includes a comprehensive comparison with random forests and gradient boosting, using both tuned and default settings. The results indicate that while XGBoost is not always the best choice, it performs well in many scenarios. The paper further analyzes the parameter tuning process for XGBoost, proposing default parameters that improve performance. The analysis suggests that meticulous parameter tuning is necessary for gradient boosting to achieve high accuracy, whereas random forests generally perform well with default settings. The study concludes that XGBoost's default parameters can be effective, and that further improvements can be achieved through careful tuning, especially of the randomization parameters. XGBoost's computational efficiency is also highlighted: its training is significantly faster than that of random forests and gradient boosting.
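The sketch below illustrates the kind of comparison the abstract describes: the three ensemble methods evaluated with default settings, followed by a small grid search over XGBoost's randomization parameters. It is a minimal example assuming scikit-learn and the xgboost package; the dataset, parameter grid, and cross-validation setup are illustrative choices, not the paper's actual experimental protocol.

```python
# Illustrative comparison of default vs. tuned ensembles (not the paper's protocol).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in dataset for illustration

# Default-parameter baselines for the three ensemble methods.
defaults = {
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "xgboost": XGBClassifier(random_state=0),
}
for name, model in defaults.items():
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name} (default): {score:.4f}")

# Tuning XGBoost's randomization parameters (row/column subsampling),
# which the abstract singles out as particularly rewarding to adjust.
grid = {
    "subsample": [0.5, 0.75, 1.0],         # fraction of rows sampled per tree
    "colsample_bytree": [0.5, 0.75, 1.0],  # fraction of features sampled per tree
    "learning_rate": [0.05, 0.1, 0.3],
}
search = GridSearchCV(XGBClassifier(random_state=0), grid, cv=5)
search.fit(X, y)
print("xgboost (tuned):", round(search.best_score_, 4), search.best_params_)
```

On a small dataset like this, the tuned and default XGBoost scores may differ only slightly; the point of the sketch is the workflow (default baselines, then tuning the randomization parameters), not the specific numbers.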