Class overlap handling methods in imbalanced domain: A comprehensive survey

11 January 2024 | Anil Kumar, Dinesh Singh, Rama Shankar Yadav
Class overlap in imbalanced datasets is a significant challenge for researchers in deep learning (DL), machine learning (ML), and big data (BD) applications. This paper provides a comprehensive survey of methods for handling class overlap, which degrades the performance of classification models because of the intrinsic characteristics of imbalanced and overlapping data. The survey categorizes existing solutions into data-level, algorithm-level, ensemble, and hybrid methods. Data-level methods alter the distribution of class instances, which often leads to information loss and overfitting. Algorithm-level methods modify the model structure to give more weight to misclassified minority class instances, although such changes can be less user-friendly. The paper discusses the advantages, disadvantages, limitations, and key performance metrics of the various methods, highlighting recent advancements and research gaps. It also addresses the importance of handling class overlap in real-world applications such as medical science, anomaly detection, and financial sector analysis. The survey is structured to provide a detailed comparative analysis of recent methods and offers insights into future research directions in ML, DL, and BD.
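
To make the distinction between data-level and algorithm-level handling concrete, the following is a minimal Python sketch, not a method taken from the survey. It assumes random undersampling as the data-level example and inverse-frequency class weights as the algorithm-level example. The function names (`random_undersample`, `balanced_class_weights`, `overlap_ratio`), the nearest-neighbour overlap estimate, and the toy data are all hypothetical illustrations.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Data-level sketch: randomly drop instances until every class
    has as many instances as the smallest class (discards information)."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    n_min = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        for features in rng.sample(rows, n_min):
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def balanced_class_weights(y):
    """Algorithm-level sketch: inverse-frequency weights,
    weight = n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


def overlap_ratio(X, y, minority):
    """Rough overlap estimate (an assumption of this sketch): the fraction of
    minority points whose nearest neighbour belongs to another class."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    minority_idx = [i for i, label in enumerate(y) if label == minority]
    if not minority_idx:
        raise ValueError("no instances of the minority class")
    hits = 0
    for i in minority_idx:
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: sq_dist(X[i], X[j]))
        if y[nearest] != minority:
            hits += 1
    return hits / len(minority_idx)


# Toy 1-D data: six majority points (label 0) and three minority points
# (label 1), one of which sits inside the majority region.
X = [(0.0,), (0.1,), (0.2,), (0.3,), (0.4,), (0.5,), (0.45,), (0.9,), (1.0,)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = random_undersample(X, y)
print(Counter(y_bal))              # three instances per class
print(balanced_class_weights(y))   # {0: 0.75, 1: 1.5}
print(overlap_ratio(X, y, minority=1))
```

In this toy setting, undersampling discards half of the majority instances, which illustrates the information loss mentioned above, whereas class weighting keeps all data and instead penalizes minority-class errors twice as heavily as majority-class errors.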