28 May 2024 | Cristian Bodnar, Wessel P. Bruinsma, Ana Lucic, Megan Stanley, Johannes Brandstetter, Patrick Garvan, Maik Reichert, Jonathan Weyn, Haiyu Dong, Anna Vaughan, Jayesh K. Gupta, Kit Tambiratnam, Alex Archibald, Elizabeth Heider, Max Welling, Richard E. Turner, and Paris Perdikaris
Aurora is a large-scale foundation model of the atmosphere trained on over a million hours of weather and climate data. It leverages the strengths of foundation modeling to produce operational forecasts for a wide variety of atmospheric prediction problems, including those with limited training data, heterogeneous variables, and extreme events. Aurora produces 5-day global air pollution predictions and 10-day high-resolution weather forecasts that outperform state-of-the-art classical simulation tools and the best specialized deep learning models. These results indicate that foundation models can transform environmental forecasting.
Aurora is a flexible 3D foundation model for high-resolution forecasting of weather and atmospheric processes. It consists of an encoder, a processor, and a decoder. The model is pretrained on a mixture of six weather and climate datasets: ERA5, CMCC, IFS-HR, HRES Forecasts, GFS Analysis, and GFS Forecasts. It is then fine-tuned in two stages: short-lead-time fine-tuning of the pretrained weights, followed by long-lead-time (rollout) fine-tuning using Low-Rank Adaptation (LoRA). By learning a general-purpose representation of atmospheric dynamics, Aurora can adapt efficiently to new atmospheric prediction tasks.
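The idea behind low-rank adaptation can be illustrated with a minimal sketch. This is not Aurora's implementation: the layer shapes, the rank, the scaling factor `alpha`, and the helper names `matmul` and `lora_forward` are all assumptions chosen for illustration. The pretrained weight `w` stays frozen, and only the two small factors `a` and `b` would be trained, giving an effective weight of W + (alpha / r) · B · A.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def lora_forward(x, w, a, b, alpha):
    """Linear layer with a low-rank update (hypothetical helper, not Aurora's API).

    x: batch of inputs, shape (n, d_in)
    w: frozen pretrained weight, shape (d_out, d_in)
    a: trainable down-projection, shape (r, d_in)
    b: trainable up-projection, shape (d_out, r)
    """
    scale = alpha / len(a)  # len(a) is the rank r
    base = matmul(x, transpose(w))                          # x W^T
    update = matmul(matmul(x, transpose(a)), transpose(b))  # x A^T B^T
    return [[p + scale * q for p, q in zip(base_row, upd_row)]
            for base_row, upd_row in zip(base, update)]


# Toy example with d_in = d_out = 2 and rank r = 1.
x = [[1.0, 2.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 1.0]]

# With b set to zero, the adapted layer reproduces the pretrained layer exactly.
y_initial = lora_forward(x, w, a, [[0.0], [0.0]], alpha=2.0)

# A non-zero b shifts the output through the low-rank path only.
y_adapted = lora_forward(x, w, a, [[1.0], [2.0]], alpha=2.0)
```

For a square layer of width d, the full weight has d × d entries while the two factors have 2 × r × d, which is why a small rank keeps the number of trainable parameters low during long rollouts.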
In forecasting atmospheric chemistry and air pollution, Aurora outperforms operational CAMS across many targets. Its forecasts match or outperform CAMS forecasts in terms of RMSE on 74% of all targets, at orders of magnitude lower computational cost. Aurora models the five chemical species (CO, NO, NO2, SO2, and O3) both as 3D atmospheric variables and, through their total column values, as 2D surface-level variables; it models particulate matter (the PMs) as surface-level variables.
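A statistic of the form "matches or outperforms on a given share of targets" can be computed as in the sketch below. It is a simplified illustration, not the paper's evaluation code: it uses a plain unweighted RMSE (global evaluations commonly weight grid cells by latitude), and the target names and toy values are hypothetical.

```python
import math


def rmse(pred, truth):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def fraction_matched_or_better(model, baseline, truth):
    """Share of targets on which the model's RMSE is no worse than the baseline's.

    Each argument maps a target name to a flat list of values.
    """
    wins = sum(
        1 for target, obs in truth.items()
        if rmse(model[target], obs) <= rmse(baseline[target], obs)
    )
    return wins / len(truth)


# Hypothetical targets (variable and lead time) with toy values.
truth = {"no2_day1": [0.0, 0.0], "o3_day1": [1.0, 1.0]}
model = {"no2_day1": [1.0, 1.0], "o3_day1": [1.0, 1.0]}
baseline = {"no2_day1": [0.0, 0.0], "o3_day1": [2.0, 2.0]}

share = fraction_matched_or_better(model, baseline, truth)
```

In this toy case the model wins on one of the two targets, so `share` is 0.5.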
Aurora also outperforms operational HRES at 0.1° resolution in weather forecasting. It produces forecasts that outperform HRES across the vast majority of targets and is the only AI model to accurately estimate the maximum 10 m wind speed in storm Ciarán. Aurora's forecasts are visually similar to IFS analysis, with the primary difference being the level of detail.
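For reference, the maximum 10 m wind speed over a region can be derived from gridded wind components as sketched below. The sketch assumes the forecast provides eastward (`u10`) and northward (`v10`) 10 m wind components in m/s on the same grid; the function name and the toy values are hypothetical.

```python
import math


def max_wind_speed(u10, v10):
    """Return the largest wind speed (m/s) over a 2D grid of (u, v) components."""
    return max(
        math.hypot(u, v)
        for u_row, v_row in zip(u10, v10)
        for u, v in zip(u_row, v_row)
    )


# Toy 2 x 2 grid of wind components in m/s.
u10 = [[3.0, -6.0], [0.0, 5.0]]
v10 = [[4.0, 8.0], [1.0, 12.0]]

peak = max_wind_speed(u10, v10)
```

Because the speed is the magnitude of the wind vector, the sign of each component does not affect the result.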
Data diversity and model scaling improve atmospheric forecasting: pretraining on more diverse datasets and increasing model size both improve performance. Aurora outperforms GraphCast in forecasting extreme values and in surface wind speed prediction. In terms of RMSE for wind speed and surface temperature, Aurora's performance is consistent with IFS-HRES, and it outperforms GraphCast in some cases.
Aurora represents a significant step forward in environmental prediction, leveraging the scaling properties of AI foundation models to extract valuable insights from vast amounts of Earth system data. It demonstrates the potential for AI to advance operational weather forecasting and related fields. The implications of Aurora extend far beyond atmospheric forecasting, with potential applications in weather and climate sciences. Foundation models could have profound implications for environmental forecasting in data-sparse regions, such as the developing world and the polar regions. By leveraging the knowledge learned from