This paper examines the challenges of training Physics-Informed Neural Networks (PINNs), focusing on the role of the loss landscape in the training process. The central difficulty is that the PINN loss is ill-conditioned, largely because the differential operators in the residual term are themselves ill-conditioned; the theoretical analysis makes this connection precise, showing that ill-conditioned differential operators lead to correspondingly hard optimization problems.

Empirically, the authors compare the gradient-based optimizers Adam, L-BFGS, and their combination Adam+L-BFGS, and find that Adam+L-BFGS consistently achieves a smaller loss and relative L2 error (L2RE) than either method alone, with quasi-Newton methods such as L-BFGS improving the conditioning of the problem. They further observe that the loss is often under-optimized at the end of training, and that a damped version of Newton's method can further reduce both the loss and the gradient norm. Motivated by this, they introduce a novel second-order optimizer, NysNewton-CG (NNCG), which significantly improves PINN performance.

Overall, the paper concludes that combining first- and second-order methods is crucial for effective PINN training, and it offers valuable insights and more powerful optimization strategies that could improve the utility of PINNs for solving difficult partial differential equations.
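To make the two-stage Adam+L-BFGS recipe concrete, the following is a minimal sketch of how such training is commonly set up in PyTorch. The specific PDE (u''(x) = -sin(x) with zero boundary conditions), the network size, learning rate, and iteration counts are illustrative assumptions, not the paper's experimental configuration.

```python
# Minimal sketch (assumed toy setup, not the paper's experiments): train a small PINN
# on u''(x) = -sin(x), u(0) = u(pi) = 0, first with Adam, then with L-BFGS.
import torch

torch.manual_seed(0)

# Small fully connected network u_theta(x)
model = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

x_res = torch.linspace(0.0, torch.pi, 128).reshape(-1, 1).requires_grad_(True)  # collocation points
x_bc = torch.tensor([[0.0], [torch.pi]])                                        # boundary points

def pinn_loss():
    u = model(x_res)
    # First and second derivatives of u with respect to x via autograd
    du = torch.autograd.grad(u, x_res, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x_res, torch.ones_like(du), create_graph=True)[0]
    residual = d2u + torch.sin(x_res)        # residual of u'' = -sin(x)
    loss_res = (residual ** 2).mean()        # PDE residual term
    loss_bc = (model(x_bc) ** 2).mean()      # boundary term (u = 0 at both ends)
    return loss_res + loss_bc

# Stage 1: Adam
adam = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(2000):
    adam.zero_grad()
    loss = pinn_loss()
    loss.backward()
    adam.step()

# Stage 2: L-BFGS (full-batch, closure-based)
lbfgs = torch.optim.LBFGS(model.parameters(), max_iter=500, line_search_fn="strong_wolfe")

def closure():
    lbfgs.zero_grad()
    loss = pinn_loss()
    loss.backward()
    return loss

lbfgs.step(closure)
print("final loss:", pinn_loss().item())
```

The design reflects the paper's finding: Adam makes fast early progress on the ill-conditioned loss, while the quasi-Newton L-BFGS phase exploits curvature information to drive the loss lower than either optimizer manages alone.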
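The damped Newton idea can also be sketched in code. The snippet below is a simplified illustration, not the authors' NNCG implementation: it omits the Nyström preconditioner and uses plain conjugate gradient with Hessian-vector products and a fixed damping and step size, purely to show how an approximate second-order step can be applied after the Adam+L-BFGS phase. The helper names and hyperparameters are hypothetical.

```python
# Sketch (assumptions: no Nyström preconditioner, fixed damping and step size) of a
# damped Newton-CG fine-tuning step for further reducing the loss and gradient norm.
import torch

def flat_grad(loss, params, create_graph=False):
    grads = torch.autograd.grad(loss, params, create_graph=create_graph)
    return torch.cat([g.reshape(-1) for g in grads])

def hvp(loss_fn, params, v):
    # Hessian-vector product via double backward
    loss = loss_fn()
    g = flat_grad(loss, params, create_graph=True)
    return flat_grad(g @ v, params)

def cg(matvec, b, iters=50, tol=1e-10):
    # Conjugate gradient for matvec(x) = b, starting from x = 0
    x = torch.zeros_like(b)
    r = b.clone()
    p = r.clone()
    rs = r @ r
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs / (p @ Ap + 1e-20)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if rs_new.sqrt() < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def damped_newton_step(loss_fn, params, damping=1e-3, step_size=0.5):
    loss = loss_fn()
    g = flat_grad(loss, params)
    # Approximately solve (H + damping * I) p = -g with CG
    matvec = lambda v: hvp(loss_fn, params, v) + damping * v
    p = cg(matvec, -g)
    # Apply the damped update to the parameters in place
    offset = 0
    with torch.no_grad():
        for param in params:
            n = param.numel()
            param.add_(step_size * p[offset:offset + n].reshape(param.shape))
            offset += n

# Example usage (assumes `model` and `pinn_loss` from the previous sketch):
# params = [p for p in model.parameters() if p.requires_grad]
# for _ in range(20):
#     damped_newton_step(pinn_loss, params)
```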