Deep Generative Data Assimilation in Multimodal Setting


2024 | Yongquan Qu*, Juan Nathaniel*, Shuolin Li, Pierre Gentine
This paper introduces SLAMS, a score-based latent data assimilation framework for multimodal data. SLAMS calibrates model states using multimodal observations, including in-situ weather station data and ex-situ satellite imagery. It operates in a unified latent space, projecting heterogeneous, multimodal datasets into a common latent subspace alongside the target states; this eliminates the need for the complex observation operator commonly used in traditional data assimilation methods. Because the framework is probabilistic, it can generate an ensemble of analysis states, enabling uncertainty quantification.

SLAMS is trained as a score-based latent diffusion model adapted to calibrate real-world weather states with multimodal observations. Evaluated on real-world datasets, it outperforms pixel-based approaches, particularly with low-resolution, noisy, and sparse inputs, and produces analyses that are more physically consistent and stable, especially in high-latitude and tropical regions. The study also highlights the value of multimodal observations for improving the accuracy of analysis states, particularly for top-of-atmosphere variables. The framework can be extended to data modalities that are traditionally hard to represent as image frames, such as point clouds and textual or tabular data. The authors conclude that SLAMS offers a robust and efficient approach to multimodal data assimilation, with potential applications in Earth system modeling and other computational science domains.
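To make the core idea concrete, the sketch below illustrates latent-space assimilation in miniature: heterogeneous inputs are projected into a shared latent space by (here, fixed random linear) encoders, and an analysis latent is sampled with an annealed Langevin-style update that combines a prior score around the encoded background state with a guidance pull toward the observation latent. All names, dimensions, and the linear encoders are hypothetical simplifications, not the SLAMS implementation, which uses learned autoencoders and a trained score network.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_encoder(in_dim, latent_dim=4, seed=0):
    # Hypothetical stand-in for a learned encoder: a fixed random linear
    # projection into the shared latent space.
    W = np.random.default_rng(seed).normal(size=(latent_dim, in_dim)) / np.sqrt(in_dim)
    return lambda x: W @ x

encode_state = make_encoder(16, seed=1)  # model state (e.g., a gridded field)
encode_obs = make_encoder(8, seed=2)     # multimodal observation (e.g., station data)

true_state = rng.normal(size=16)
z_obs = encode_obs(rng.normal(size=8))                      # observation latent
z_bg = encode_state(true_state) + 0.1 * rng.normal(size=4)  # noisy background latent
z = rng.normal(size=4)                                      # start from pure noise

# Annealed Langevin-style sampling: the "score" here is that of a unit
# Gaussian centered on the background latent, plus observation guidance
# computed directly in the shared latent space (no observation operator).
for step in np.linspace(0.5, 0.01, 50):
    score = -(z - z_bg)        # prior score toward the background
    guidance = -(z - z_obs)    # pull toward the observation latent
    z = z + step * (score + 0.5 * guidance) + np.sqrt(2 * step) * 0.05 * rng.normal(size=4)

# z is one analysis sample; repeating with different noise draws would
# yield an ensemble for uncertainty quantification.
```

The deterministic part of this update contracts toward a weighted blend of the background and observation latents, so the sampled analysis ends up between the two, which is the qualitative behavior one expects of an assimilation step.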