Ensemble Methodology: Innovations in Credit Default Prediction Using LightGBM, XGBoost, and LocalEnsemble

28 Feb 2024 | Mengran Zhu, Ye Zhang, Yulu Gong, Kaijuan Xing, Xu Yan, Jintong Song
This paper presents an ensemble method for credit default prediction that combines LightGBM, XGBoost, and LocalEnsemble. The study addresses the challenge of accurately predicting credit card defaults in consumer lending, with the aim of improving risk mitigation and lending decisions. The proposed framework integrates three modules, each contributing distinct strengths that increase model diversity and generalization. The LightGBM module performs efficient gradient boosting with the GOSS (Gradient-based One-Side Sampling) algorithm, while the XGBoost module draws on a broader range of engineered features and out-of-fold predictions. The LocalEnsemble module improves accuracy by modeling different feature combinations. The methodology covers data preprocessing, feature engineering, and the development of the ensemble model. Data preprocessing involves noise removal, type conversion, and outlier handling. Feature engineering generates aggregated, lag, and meta features to capture user behavior patterns. The ensemble model combines the predictions of the three modules by weighted averaging. The experiments were conducted on the American Express credit default prediction dataset, which includes 900,000 customers and 11 million records. The proposed ensemble model achieved the highest scores among the compared models on both the public and private datasets. Performance was evaluated using the Normalized Gini Coefficient and the default rate captured at 4%, which together provide a comprehensive assessment of model effectiveness. The study also analyzed feature importance, revealing that the top features contributed significantly to model performance. The ensemble approach not only improves prediction accuracy but also enhances model interpretability and robustness.
The results demonstrate the effectiveness of the proposed framework in credit default prediction, setting a new benchmark for the industry.
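
To make the weighted averaging and the two evaluation measures more concrete, the following is a minimal Python sketch and not the authors' implementation. The function names, the blend weights, the synthetic data, and the equal-weight mean of the two measures are all assumptions made for illustration. The sketch is also unweighted, so any sample weighting used by the official competition metric is omitted.

```python
import numpy as np


def normalized_gini(y_true, y_score):
    """Gini of the score ranking divided by the Gini of a perfect ranking."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)

    def gini(labels, scores):
        # Rank samples from highest to lowest score (stable sort).
        order = np.argsort(-scores, kind="mergesort")
        sorted_labels = labels[order]
        # Cumulative share of defaults captured vs. share of population seen.
        cum_pos = np.cumsum(sorted_labels) / sorted_labels.sum()
        cum_pop = np.arange(1, len(labels) + 1) / len(labels)
        return np.sum(cum_pos - cum_pop)

    return gini(y_true, y_score) / gini(y_true, y_true)


def capture_rate(y_true, y_score, top_fraction=0.04):
    """Share of all defaults found in the top `top_fraction` of scores."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    order = np.argsort(-y_score, kind="mergesort")
    n_top = max(1, int(np.ceil(top_fraction * len(y_true))))
    return y_true[order][:n_top].sum() / y_true.sum()


def weighted_average(predictions, weights):
    """Blend per-model predictions, shape (n_models, n_samples), by weight."""
    preds = np.asarray(predictions, dtype=float)
    w = np.asarray(weights, dtype=float)
    return w @ preds / w.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic labels: roughly one quarter of customers default.
    y = (rng.random(1000) < 0.25).astype(float)
    # Three hypothetical model outputs: the label plus different noise levels.
    preds = [y + rng.normal(0.0, s, size=y.size) for s in (0.8, 0.9, 1.0)]
    # Hypothetical blend weights; in practice these would be tuned on
    # validation data.
    blend = weighted_average(preds, weights=[0.4, 0.4, 0.2])

    g = normalized_gini(y, blend)
    d = capture_rate(y, blend, top_fraction=0.04)
    # Assumed combination: equal-weight mean of the two measures.
    print(f"normalized Gini: {g:.3f}")
    print(f"default rate captured at 4%: {d:.3f}")
    print(f"mean of both: {0.5 * (g + d):.3f}")
```

The blend divides by the sum of the weights, so the weights do not need to sum to one. Both measures depend only on how the scores rank the customers, so the blended scores do not need to be calibrated probabilities for this evaluation.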