2024 | Teuku Rizky Noviandy, Ghalieb Mutig Idroes, Irsan Hardi, Mohd Afjal and Samrat Ray
This study investigates the use of machine learning models and SHAP analysis to predict customer churn in the telecommunications industry. The research applies five models—Naïve Bayes, Random Forest, AdaBoost, XGBoost, and LightGBM—to a dataset of 7,043 customers, aiming to identify key factors influencing churn and enhance model interpretability. LightGBM achieved the highest accuracy (80.70%), precision (84.35%), recall (90.54%), and F1-score (87.34%), outperforming the other models. SHAP analysis revealed that features such as tenure, contract type, and monthly charges are significant predictors of churn. These findings highlight the importance of combining predictive analytics with interpretability methods to develop effective retention strategies for telecom companies.
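The four reported metrics are related, and the relationship can be checked by hand. The following is a minimal sketch, not the study's evaluation code: the function names and the confusion-matrix counts are illustrative assumptions, and only the precision and recall figures come from the text above. It shows how accuracy, precision, recall, and F1 are derived from confusion-matrix counts, and that the reported F1-score matches the harmonic mean of the reported precision and recall.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Compute the four standard metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Precision and recall as reported for LightGBM in the summary above.
reported_f1 = f1_from(0.8435, 0.9054)

# Hypothetical counts, chosen only to exercise the function.
example = classification_metrics(tp=90, fp=10, fn=10, tn=90)

if __name__ == "__main__":
    print(f"F1 implied by reported precision/recall: {reported_f1:.4f}")
    print(example)
```

Because F1 depends only on precision and recall, this check says nothing about accuracy, which also depends on the class balance of the test set.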
The study emphasizes the need for transparent and accurate models to understand customer behavior and improve customer satisfaction and loyalty. While the results are promising, the study has limitations, including the use of a fictional dataset and feature interactions that may not have been accounted for. Future research should validate these findings with real-world data, explore more sophisticated models, and incorporate temporal dynamics to improve churn prediction.
The results demonstrate that LightGBM is an effective model for predicting customer churn, with SHAP analysis providing actionable insights into the factors driving churn. The study underscores the value of model interpretability in building trust and transparency in decision-making processes, enabling telecom companies to implement targeted strategies to reduce churn and improve customer retention.
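SHAP attributions are based on Shapley values, which assign each feature its average marginal contribution to a prediction. The sketch below computes exact Shapley values by brute force for a toy scoring function. Everything in it is a hypothetical illustration: the scoring rule, its coefficients, the feature names, and the customer and baseline values are assumptions, not the study's model or data. SHAP libraries use much faster algorithms for tree models, so this enumeration only conveys the idea.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values by enumerating every feature ordering.

    Features start at their baseline values and are switched to the
    instance's values one at a time; each feature is credited with the
    change in the prediction it causes, averaged over all orderings.
    """
    features = list(instance)
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            score = predict(current)
            totals[name] += score - previous
            previous = score
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_churn_score(customer):
    """Hypothetical risk score (not a fitted model, not a probability)."""
    score = 0.2
    score -= 0.004 * customer["tenure_months"]
    score += 0.002 * customer["monthly_charges"]
    if customer["month_to_month"]:
        score += 0.15
    # Interaction: new customers on month-to-month contracts score higher.
    if customer["month_to_month"] and customer["tenure_months"] < 12:
        score += 0.1
    return score


# Hypothetical customer and reference point.
customer = {"tenure_months": 3, "monthly_charges": 90.0, "month_to_month": True}
baseline = {"tenure_months": 32, "monthly_charges": 65.0, "month_to_month": False}

contributions = shapley_values(toy_churn_score, customer, baseline)

if __name__ == "__main__":
    for name, value in contributions.items():
        print(f"{name}: {value:+.4f}")
```

The contributions sum to the difference between the customer's score and the baseline score. This additivity property is what allows a single prediction to be decomposed feature by feature.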