2024 | Zhongkai Hao, Chang Su, Songming Liu, Julius Berner, Chengyang Ying, Hang Su, Anima Anandkumar, Jian Song, Jun Zhu
This paper introduces DPOT, an auto-regressive denoising operator transformer for large-scale PDE pre-training. The authors propose a pre-training strategy that makes training on PDE data more stable and efficient and that generalizes to a range of downstream tasks. They also design a flexible, scalable architecture built on Fourier attention, which efficiently learns the kernel integral transforms underlying PDE solution maps and scales readily for large-scale pre-training. Models with up to 1B parameters are trained on data from more than 10 PDE datasets comprising over 100k trajectories. Extensive experiments show that DPOT achieves state-of-the-art performance on benchmarks and generalizes well to diverse downstream PDE tasks, and ablation and scaling studies show that it scales favorably and outperforms other methods. The results suggest that DPOT is a promising approach to large-scale PDE pre-training, with potential applications across scientific domains.
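To make the Fourier-attention idea concrete, here is a minimal sketch of a spectral mixing block of the kind the summary describes: the spatial field is transformed to the frequency domain, a learned per-mode channel mixing is applied, and the result is transformed back. This is an illustrative assumption in PyTorch, not the authors' exact DPOT layer; the class name `FourierAttentionBlock` and its parameters are hypothetical.

```python
import torch
import torch.nn as nn


class FourierAttentionBlock(nn.Module):
    """Illustrative Fourier-attention-style mixer: FFT over the spatial axis,
    learned channel mixing on retained low-frequency modes, inverse FFT."""

    def __init__(self, channels: int, modes: int = 16):
        super().__init__()
        self.modes = modes  # number of low-frequency modes kept
        scale = 1.0 / channels
        # complex per-mode channel-mixing weights, stored as (real, imag)
        self.w = nn.Parameter(scale * torch.randn(2, modes, channels, channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, grid points, channels) -- a 1D field for simplicity
        b, n, c = x.shape
        x_ft = torch.fft.rfft(x, dim=1)                 # (b, n//2 + 1, c), complex
        w = torch.complex(self.w[0], self.w[1])         # (modes, c, c)
        out_ft = torch.zeros_like(x_ft)
        m = min(self.modes, x_ft.shape[1])
        # mix channels mode-by-mode for the retained low frequencies
        out_ft[:, :m] = torch.einsum("bmc,mcd->bmd", x_ft[:, :m], w[:m])
        return torch.fft.irfft(out_ft, n=n, dim=1)      # back to physical space


# usage sketch: one auto-regressive step maps the current field to the next
x = torch.randn(4, 64, 32)           # (batch, grid points, channels)
block = FourierAttentionBlock(channels=32)
x_next = block(x)                    # same shape as the input field
print(x_next.shape)                  # torch.Size([4, 64, 32])
```

In an auto-regressive setup like the one described above, blocks of this kind would be stacked and applied repeatedly to roll the PDE solution forward in time, with the denoising objective applied to noised input frames during pre-training.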