November 24, 2021 | Ravid Shwartz-Ziv, Amitai Armon
This paper explores the performance of deep learning models for tabular data compared to traditional tree ensemble models, specifically XGBoost. The authors rigorously compare these models on various datasets, evaluating their accuracy, tuning requirements, and computational efficiency. The study finds that XGBoost outperforms deep models across most datasets, including those not used in the papers proposing the deep models. Additionally, XGBoost requires less tuning and hyperparameter optimization. The authors also demonstrate that an ensemble of deep models combined with XGBoost performs better than either model alone. The paper concludes that while deep learning has shown promise, it is not currently a superior choice for tabular data problems, and further research is needed to improve deep models' performance and interpretability.
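The ensemble the summary mentions can be sketched as a simple average of the two models' predicted class probabilities. This is a minimal illustration, not the paper's exact setup: it uses scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost, an `MLPClassifier` as a stand-in for a deep tabular model, and an unweighted average, all of which are assumptions made for brevity.

```python
# Hedged sketch of a deep-model + tree-ensemble blend on tabular data.
# GradientBoostingClassifier stands in for XGBoost, MLPClassifier for a
# deep model; both are assumptions, not the paper's actual components.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# A small synthetic tabular classification task.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fit each model independently on the same training split.
gbt = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
mlp = MLPClassifier(max_iter=500, random_state=0).fit(X_tr, y_tr)

# Blend by averaging predicted class probabilities, then take the argmax.
proba = (gbt.predict_proba(X_te) + mlp.predict_proba(X_te)) / 2
pred = proba.argmax(axis=1)
acc = (pred == y_te).mean()
```

In practice the blend weights would be tuned on a validation set rather than fixed at 0.5, but even this unweighted average conveys the idea: the tree model and the neural model make partly uncorrelated errors, so averaging their probabilities can beat either one alone.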