April 4, 2018 | Ryan J. Urbanowicz, Melissa Meeker, William LaCava, Randal S. Olson, Jason H. Moore
Relief-based algorithms (RBAs) are filter-style feature selection methods that balance computational efficiency with sensitivity to complex feature associations, such as interactions. This review introduces RBAs, focusing on their ability to detect feature interactions without explicitly evaluating feature subsets. RBAs are particularly useful in biomedical data mining due to their efficiency and their adaptability to various data types and problem settings, including classification and regression.
The review covers the original Relief algorithm, derivatives such as ReliefF, and the major branches of RBA research. It discusses the strengths and limitations of RBAs, including their ability to detect interactions, handle missing data, and manage multi-class endpoints, and it addresses the importance of feature selection in reducing dimensionality and improving model performance. Key concepts include feature weights, which reflect the estimated relevance of each feature, and the use of nearest neighbors to estimate feature importance. The review highlights the computational efficiency of RBAs, their ability to handle different data types, and their compatibility with a variety of downstream modeling algorithms. It also discusses challenges of feature selection, such as redundancy and the need for careful weight thresholding to avoid discarding relevant features, and it concludes with a summary of key RBA algorithms, their time complexities, and their applications across domains.
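The core idea described above, feature weights updated from each instance's nearest same-class neighbor ("hit") and nearest other-class neighbor ("miss"), can be sketched as follows. This is a minimal illustrative implementation of the original Relief for binary classification with numeric features scaled to [0, 1]; the function name and parameters are chosen for this example and are not the authors' reference code.

```python
import numpy as np

def relief(X, y, m=None, rng=None):
    """Sketch of the original Relief: score each feature by how well it
    separates a sampled instance from its nearest miss versus its nearest
    hit (binary classes, numeric features scaled to [0, 1])."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    m = n if m is None else m               # number of instances to sample (m <= n)
    w = np.zeros(p)                         # feature weights, updated incrementally
    for idx in rng.choice(n, size=m, replace=False):
        xi, yi = X[idx], y[idx]
        dists = np.abs(X - xi).sum(axis=1)  # Manhattan distance to every instance
        dists[idx] = np.inf                 # never pick the instance itself
        same = y == yi
        hit = np.argmin(np.where(same, dists, np.inf))    # nearest same-class neighbor
        miss = np.argmin(np.where(~same, dists, np.inf))  # nearest other-class neighbor
        # Penalize features that differ from the nearest hit; reward
        # features that differ from the nearest miss.
        w += (np.abs(xi - X[miss]) - np.abs(xi - X[hit])) / m
    return w
```

A feature whose values track the class label accumulates a positive weight, while an irrelevant feature drifts toward zero or below; applying a threshold to the weights then performs the actual selection.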