Improving Solar Energetic Particle Event Prediction through Multivariate Time Series Data Augmentation

2024 February | Pouya Hosseinzadeh, Soukaina Filali Boubrahimi, and Shah Muhammad Hamdi
This paper addresses the challenge of predicting solar energetic particle (SEP) events, which are associated with extreme solar events and pose significant risks to astronauts and Earth's infrastructure. The study focuses on improving the prediction accuracy of ~30 MeV, ~60 MeV, and ~100 MeV SEP events by synthetically increasing the number of SEP samples using data augmentation techniques. The authors explore the use of univariate and multivariate time series data of proton flux as input to machine-learning-based prediction methods, such as time series forest (TSF).
The research covers solar cycles 22, 23, and 24, and the findings show that data augmentation methods, particularly the synthetic minority oversampling technique (SMOTE), significantly enhance the accuracy and F1-score of the classifiers, with TSF achieving an average accuracy of 90% in the ~100 MeV SEP prediction task. The study also demonstrates higher prediction accuracy when using multivariate time series data of proton flux. Finally, a comprehensive hierarchical classification framework is developed for the best-performing model, and a detailed analysis of the impact of different observation window sizes on classification accuracy is provided. The results highlight the effectiveness of data augmentation and multivariate time series data in improving SEP event prediction.
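To make the oversampling idea concrete, here is a minimal sketch of SMOTE-style interpolation, written in plain Python. It is an illustration only and not the authors' pipeline: the function name `smote_oversample`, its parameters, and the toy flux values are all hypothetical, and it assumes each (possibly multivariate) proton flux series has already been flattened into one fixed-length numeric vector.

```python
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        neighbor = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


# Hypothetical toy data: three SEP-class flux profiles, four time steps each.
sep_events = [
    [1.0, 2.0, 4.0, 8.0],
    [1.5, 2.5, 5.0, 9.0],
    [0.8, 1.9, 3.5, 7.0],
]
augmented = sep_events + smote_oversample(sep_events, n_new=5)
```

Each synthetic sample lies on the line segment between two real minority samples, so the augmented set stays within the range of the observed SEP profiles. In practice the augmentation would be applied to the training split only, so that no synthetic sample is derived from test data.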