The impact of Bayesian optimization on feature selection

2024 | Kaixin Yang, Long Liu & Yalu Wen
Bayesian optimization enhances feature selection methods, particularly when their hyperparameters need tuning. This study compares feature selection methods on simulated data and on real-world data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Bayesian optimization improves recall in feature selection, especially for high-dimensional data, and improves the accuracy of disease risk prediction models by optimizing the hyperparameters of methods such as Lasso, elastic net (Enet), and XGBoost.

Feature selection is crucial for high-dimensional molecular data in order to avoid overfitting and reduce computational cost. Existing approaches include filter-based (e.g., SIS, MRMR), wrapper-based (e.g., sPLSda), and embedded (e.g., Lasso, Enet, XGBoost) methods. Bayesian optimization is effective at tuning the hyperparameters of these methods, yielding better model performance in both simulated and real-world scenarios. The study demonstrates that Bayesian optimization improves the accuracy of predictive models for AD-related phenotypes, such as subcortical volumes and gray matter, and highlights both the importance of hyperparameter tuning in feature selection and the potential of Bayesian optimization to enhance downstream tasks. The authors conclude that Bayesian optimization is a valuable tool for improving feature selection and prediction accuracy in high-dimensional data analysis.
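To make the idea concrete, here is a minimal sketch of Bayesian optimization applied to one of the embedded methods mentioned above: tuning Lasso's regularization strength on synthetic high-dimensional data. A Gaussian-process surrogate models the cross-validated score, and an expected-improvement acquisition function picks the next hyperparameter to try. This is an illustrative toy, not the paper's actual pipeline; the search range, kernel, and data dimensions are assumptions.

```python
# Hedged sketch: Bayesian optimization of Lasso's alpha on synthetic data.
# All settings (search range, kernel, CV folds) are illustrative assumptions.
import numpy as np
from scipy.stats import norm
from sklearn.datasets import make_regression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

# Synthetic high-dimensional data: 500 features, only 10 informative.
X, y = make_regression(n_samples=100, n_features=500, n_informative=10,
                       noise=1.0, random_state=0)

def objective(log_alpha):
    """Mean CV score (negative MSE) of Lasso at alpha = 10**log_alpha; maximized."""
    model = Lasso(alpha=10.0 ** log_alpha, max_iter=10000)
    return cross_val_score(model, X, y, cv=3,
                           scoring="neg_mean_squared_error").mean()

rng = np.random.default_rng(0)
# Seed the surrogate with a few random evaluations over log10(alpha) in [-3, 1].
samples = list(rng.uniform(-3, 1, size=3))
values = [objective(a) for a in samples]

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(10):
    gp.fit(np.array(samples).reshape(-1, 1), values)
    cand = np.linspace(-3, 1, 200).reshape(-1, 1)
    mu, sigma = gp.predict(cand, return_std=True)
    # Expected-improvement acquisition: prefer points likely to beat the best so far.
    best = max(values)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best) / sigma
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
    nxt = float(cand[np.argmax(ei), 0])
    samples.append(nxt)
    values.append(objective(nxt))

best_log_alpha = samples[int(np.argmax(values))]
print(f"best alpha found: {10.0 ** best_log_alpha:.4g}")
```

The same loop applies to Enet (tuning both alpha and the l1 ratio) or XGBoost (tuning depth, learning rate, etc.) by swapping the objective; the surrogate-plus-acquisition structure is what distinguishes Bayesian optimization from grid or random search.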