Available online 1 February 2024 | Mimusa Azim Mim, Nazia Majadi*, Peal Mazumder
This research proposes a soft voting ensemble learning approach for detecting credit card fraud in imbalanced datasets. The study evaluates and compares three families of sampling techniques (oversampling, undersampling, and hybrid sampling) for addressing class imbalance, and develops several credit card fraud classifiers, including ensemble classifiers, with and without sampling. The proposed soft voting approach outperforms the individual classifiers, achieving a precision of 0.9870, recall of 0.9694, F1-score of 0.8764, and AUROC of 0.9936. The system architecture comprises data preprocessing, feature scaling, and sampling. The soft voting ensemble combines multiple classifiers (e.g., XGBoost, MLP, KNN) to enhance performance. The study also discusses the machine learning algorithms involved, including logistic regression, random forest, XGBoost, SVM, AdaBoost, and KNN. Experimental results show that the proposed approach detects fraudulent transactions with high accuracy and performs best on the undersampled dataset. The study emphasizes recall in fraud detection because it measures the model's ability to identify actual fraud cases. The soft voting ensemble significantly improves performance over the individual classifiers, and statistical tests confirm the significance of these improvements. The study contributes to the field of credit card fraud detection by proposing an effective ensemble learning approach that addresses class imbalance and enhances detection accuracy.
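To make the soft voting idea concrete, the following is a minimal sketch in plain Python, not the study's implementation. It assumes each base classifier has already produced class probabilities per transaction, ordered as [legitimate, fraud]. The function name soft_vote, the equal default weights, and the toy probabilities are all illustrative assumptions; in the study the probabilities would come from trained models such as XGBoost, MLP, and KNN.

```python
from typing import List, Optional, Sequence, Tuple


def soft_vote(
    probas_per_model: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[List[List[float]], List[int]]:
    """Average class probabilities across models, then pick the top class.

    probas_per_model[m][i][c] is model m's probability that sample i
    belongs to class c. Weights are hypothetical and default to equal.
    """
    n_models = len(probas_per_model)
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)
    n_samples = len(probas_per_model[0])
    n_classes = len(probas_per_model[0][0])

    averaged = []
    for i in range(n_samples):
        # Weighted mean of the per-model probabilities for each class.
        row = [
            sum(w * model[i][c] for w, model in zip(weights, probas_per_model)) / total
            for c in range(n_classes)
        ]
        averaged.append(row)

    # Predicted label is the class with the highest averaged probability.
    labels = [max(range(n_classes), key=row.__getitem__) for row in averaged]
    return averaged, labels


# Toy outputs from three hypothetical classifiers for three transactions.
model_a = [[0.90, 0.10], [0.40, 0.60], [0.55, 0.45]]
model_b = [[0.80, 0.20], [0.30, 0.70], [0.55, 0.45]]
model_c = [[0.70, 0.30], [0.60, 0.40], [0.05, 0.95]]

averaged, labels = soft_vote([model_a, model_b, model_c])
print(labels)  # → [0, 1, 1]
```

In the third transaction, two of the three toy models lean slightly toward "legitimate", so a majority (hard) vote would miss the fraud, while the averaged probability still flags it because the third model is highly confident. This is one reason soft voting can help recall, although whether it does so on real data depends on how well calibrated the base classifiers are.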