07 May 2024 | Syed Awais W. Shah, Daniel P. Palomar, Ian Barr, Leo L. M. Poon, Ahmed Abdul Quadeer & Matthew R. McKay
This study develops a machine learning model that predicts the antigenic properties of influenza A H3N2 viruses from their HA1 sequences and associated metadata. Trained on data from past seasons, the model predicts hemagglutination inhibition (HI) titers for virus-antiserum pairs and distinguishes antigenic variants from non-variants. It captures the nonlinear relationship between genetic and antigenic changes in H3N2 viruses, providing insights into antigenic drift and seasonal dynamics. Its performance is robust, with an average mean absolute error (MAE) of 0.702 antigenic units per season, and it predicts antigenic variants with an average area under the receiver operating characteristic curve (AUROC) of 92% across seasons. The model outperforms alternative models, including linear and neural network approaches, and remains effective with limited training data. It also identifies the key HA1 sites that influence antigenic change and reveals how they vary by season. Performance is validated across 14 test seasons, and an adaptation to H1N1 influenza achieves an average MAE of 0.747 over 18 seasons. The model is implemented as a web application that predicts normalized HI titers (NHTs) for user-specified H3N2 virus-antiserum pairs. By offering a data-driven method for seasonal antigenic characterization, the approach complements experimental data, enhances influenza surveillance, and supports vaccine strain selection and public health management.
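The per-season MAE and AUROC figures quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the season labels, the toy titer values, and the variant threshold of 2 antigenic units are all assumptions chosen for illustration, and AUROC is computed here with a simple pairwise (Mann-Whitney style) count rather than any particular library.

```python
from collections import defaultdict


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted titers."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def auroc(labels, scores):
    """Fraction of (variant, non-variant) pairs ranked correctly; ties count 0.5."""
    pos = [s for is_variant, s in zip(labels, scores) if is_variant]
    neg = [s for is_variant, s in zip(labels, scores) if not is_variant]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one variant and one non-variant")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_season_metrics(records, variant_threshold=2.0):
    """Group (season, observed, predicted) records and score each season.

    variant_threshold is an assumed cutoff: a pair is labeled an antigenic
    variant when its observed titer value exceeds it.
    """
    by_season = defaultdict(list)
    for season, observed, predicted in records:
        by_season[season].append((observed, predicted))

    metrics = {}
    for season, pairs in by_season.items():
        observed = [o for o, _ in pairs]
        predicted = [p for _, p in pairs]
        labels = [o > variant_threshold for o in observed]
        metrics[season] = {
            "mae": mean_absolute_error(observed, predicted),
            "auroc": auroc(labels, predicted),
        }
    return metrics


# Toy data (hypothetical): (season, observed titer value, predicted titer value)
records = [
    ("2012NH", 0.0, 0.5), ("2012NH", 3.0, 2.5),
    ("2012NH", 1.0, 1.5), ("2012NH", 4.0, 3.0),
    ("2012SH", 2.5, 1.0), ("2012SH", 0.5, 1.5),
    ("2012SH", 3.5, 3.0), ("2012SH", 1.0, 0.5),
]

metrics = per_season_metrics(records)
average_mae = sum(m["mae"] for m in metrics.values()) / len(metrics)
average_auroc = sum(m["auroc"] for m in metrics.values()) / len(metrics)
print(metrics)
print(average_mae, average_auroc)
```

Averaging the per-season values, rather than pooling all pairs, mirrors the "average across seasons" phrasing in the abstract and keeps seasons with many virus-antiserum pairs from dominating the summary.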