Understanding A soft voting ensemble learning approach for credit card fraud detection

This paper presents a soft voting ensemble learning approach for detecting credit card fraud, addressing the challenges of class imbalance in the dataset. The study evaluates various machine learning algorithms, including Logistic Regression, Random Forest, XGBoost, Support Vector Machine, Adaptive Boosting, Stochastic Gradient Descent, Multilayer Perceptron, Decision Tree, Gaussian Naive Bayes, and K-Nearest Neighbor, both with and without sampling techniques. The proposed soft voting ensemble combines the predictions of multiple classifiers to achieve higher classification accuracy. The experimental results show that it outperforms individual classifiers in terms of precision, recall, F1-score, and AUROC, achieving a false negative rate (FNR) of 0.036, precision of 0.9870, recall of 0.9694, F1-score of 0.8764, and AUROC of 0.9936. The study also compares the proposed approach with other sampling techniques (oversampling, undersampling, and hybrid sampling) and finds that it performs better in handling imbalanced datasets. The findings highlight the importance of considering multiple performance metrics, such as precision, recall, F1-score, AUROC, and FNR, when evaluating fraud detection models, especially under class imbalance. The study concludes by discussing limitations and future directions, including the need for more recent datasets and the integration of deep neural networks.
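To make the idea of soft voting concrete, the following is a minimal sketch using scikit-learn's `VotingClassifier`. It is not the authors' implementation: the three base learners, their hyperparameters, the synthetic data from `make_classification`, and the 5% positive rate are all assumptions chosen for illustration, since this summary does not state the paper's exact configuration or dataset. The sketch shows that soft voting averages the class probabilities predicted by each base classifier, and how the metrics named above can be computed from the resulting predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (confusion_matrix, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumption: synthetic imbalanced data (about 5% positives) stands in for
# a real fraud dataset; it is NOT the data used in the paper.
X, y = make_classification(n_samples=4000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Assumption: three illustrative base learners with near-default settings.
estimators = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gnb", GaussianNB()),
]

# voting="soft" averages predicted class probabilities across the
# base learners and predicts the class with the highest mean probability.
ensemble = VotingClassifier(estimators=estimators, voting="soft")
ensemble.fit(X_train, y_train)

proba = ensemble.predict_proba(X_test)

# The same average computed by hand from the fitted base learners.
manual = np.mean([clf.predict_proba(X_test) for clf in ensemble.estimators_],
                 axis=0)

y_pred = ensemble.predict(X_test)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

metrics = {
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "auroc": roc_auc_score(y_test, proba[:, 1]),
    "fnr": fn / (fn + tp),  # share of actual frauds that were missed
}

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this sketch the FNR is computed as FN / (FN + TP), which by definition equals 1 minus recall. Sampling steps such as oversampling or undersampling are deliberately left out; they would be applied to the training split only, before fitting the ensemble.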

A soft voting ensemble learning approach for credit card fraud detection

27 January 2024 | Mimusa Azim Mim, Nazia Majadi*, Peal Mazumder