2024 | Sifan Wang, Bowen Li, Yuhan Chen, Paris Perdikaris
PirateNets: Physics-Informed Deep Learning with Residual Adaptive Networks
Sifan Wang, Bowen Li, Yuhan Chen, Paris Perdikaris
Abstract: Physics-informed neural networks (PINNs) have become a popular deep learning framework for solving forward and inverse problems governed by partial differential equations (PDEs). However, their performance degrades when using larger and deeper neural networks. This study identifies that the root cause is the use of multi-layer perceptron (MLP) architectures with unsuitable initialization schemes, leading to poor trainability of network derivatives and unstable minimization of PDE residual loss. To address this, we introduce PirateNets, a novel architecture that facilitates stable and efficient training of deep PINN models. PirateNets leverage a novel adaptive residual connection, allowing networks to be initialized as shallow networks that progressively deepen during training. We also show that the proposed initialization scheme encodes appropriate inductive biases corresponding to a given PDE system into the network architecture. Comprehensive empirical evidence shows that PirateNets are easier to optimize and can gain accuracy from increased depth, achieving state-of-the-art results across various benchmarks.
Introduction: Machine learning (ML) is making a significant impact on science and engineering, providing advanced tools for analyzing complex data, uncovering nonlinear relationships, and developing predictive models. Physics-informed machine learning (PIML) integrates physical laws and constraints into ML models, opening new frontiers for traditional scientific research and addressing persistent challenges in ML, such as robustness, interpretability, and generalization. The fundamental question that PIML aims to address is how to incorporate physical prior knowledge into ML models. This can be achieved by modifying key components of the ML pipeline, including data processing, model architecture, loss functions, and optimization algorithms, as well as fine-tuning and inference.
Physics-informed neural networks (PINNs) are a popular method for embedding physical principles in ML. They have been extensively used to solve forward and inverse problems involving PDEs by seamlessly integrating noisy experimental data and physical laws into the learning process. Despite significant progress, most existing works on PINNs tend to employ small, shallow network architectures, leaving the vast potential of deep networks largely untapped. To bridge this gap, we propose a novel class of architectures called Physics-Informed Residual Adaptive Networks (PirateNets). Our main contributions are: (1) we argue that the capacity of PINNs to minimize PDE residuals is determined by the capacity of the network's derivatives; (2) we support this argument by proving that, for second-order linear elliptic and parabolic PDEs, convergence of the training error implies convergence of the solution and its derivatives; (3) we reveal, both empirically and theoretically, that conventional initialization schemes yield problematic initialization of MLP derivatives and thus poor trainability; (4) we introduce PirateNets to address this pathological initialization, enabling stable and efficient scaling of PINNs to deep networks.
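To make the two core ideas above concrete, the following is a minimal JAX sketch of (i) an adaptive residual block whose trainable gate is initialized at zero, so the network starts out effectively shallow and deepens only as the gates move away from zero during training, and (ii) a PDE residual loss for a toy 1D Poisson problem, which depends on derivatives of the network and therefore on how trainable those derivatives are at initialization. All names here (init_params, pirate_block-style forward pass, poisson_residual) are hypothetical illustrations under these assumptions, not the authors' reference implementation.

```python
import jax
import jax.numpy as jnp

def init_params(key, width=64, depth=4):
    """Embedding layer, `depth` gated residual blocks, and a linear output head."""
    keys = jax.random.split(key, depth + 2)
    params = [{"W": jax.random.normal(keys[0], (1, width)),
               "b": jnp.zeros(width)}]                      # input embedding
    for k in keys[1:-1]:
        params.append({"W": jax.random.normal(k, (width, width)) / jnp.sqrt(width),
                       "b": jnp.zeros(width),
                       "alpha": jnp.zeros(())})             # gate starts at 0 -> identity block
    params.append({"W": jax.random.normal(keys[-1], (width, 1)) / jnp.sqrt(width),
                   "b": jnp.zeros(1)})                      # linear output head
    return params

def forward(params, x):
    h = jnp.tanh(jnp.atleast_1d(x) @ params[0]["W"] + params[0]["b"])
    for layer in params[1:-1]:
        f = jnp.tanh(h @ layer["W"] + layer["b"])           # candidate nonlinear update
        h = layer["alpha"] * f + (1.0 - layer["alpha"]) * h # adaptive residual mix
    return (h @ params[-1]["W"] + params[-1]["b"])[0]

def poisson_residual(params, x, f):
    """Residual of u''(x) = f(x); the loss acts on network derivatives via autodiff."""
    u = lambda s: forward(params, s)
    return jax.grad(jax.grad(u))(x) - f(x)

def residual_loss(params, xs, f):
    r = jax.vmap(lambda x: poisson_residual(params, x, f))(xs)
    return jnp.mean(r ** 2)

params = init_params(jax.random.PRNGKey(0))
xs = jnp.linspace(0.0, 1.0, 32)
print(float(residual_loss(params, xs, lambda x: jnp.sin(jnp.pi * x))))
```

Because every gate `alpha` is zero at initialization, each block reduces to the identity map on its hidden state, so the initial model (and its derivatives) behaves like a shallow network; depth is recovered gradually as the gates are learned.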