Machine Learning of Accurate Energy-Conserving Molecular Force Fields

May 9, 2017 | Stefan Chmiela, Alexandre Tkatchenko, Huziel E. Sauceda, Igor Poltavsky, Kristof T. Schütt, Klaus-Robert Müller
This paper presents gradient-domain machine learning (GDML), an approach for constructing accurate molecular force fields that conserve energy. The model is trained on atomic forces (gradients of the energy) rather than on energies, and the force field is obtained as the gradient of a learned scalar potential, so energy is conserved by construction. GDML reproduces global potential energy surfaces (PESs) of intermediate-sized molecules with errors of 0.3 kcal mol⁻¹ for energies and 1 kcal mol⁻¹ Å⁻¹ for atomic forces using only 1000 conformational geometries for training. This accuracy is demonstrated for benzene, toluene, naphthalene, ethanol, uracil, and aspirin.

Enforcing energy conservation avoids overfitting and artifacts, which makes it possible to run efficient and precise molecular dynamics (MD) simulations on PESs derived from high-level quantum-chemical methods. The method is validated using path-integral MD (PIMD) for eight organic molecules with up to 21 atoms and four chemical elements. GDML outperforms energy-based models in accuracy and data efficiency, requiring significantly fewer samples to reach similar performance. The model also captures long-range interactions and chemical dynamics, making it suitable for predicting statistical averages and fluctuations in MD simulations.

By combining rigorous physical laws with data-driven machine learning, GDML enables the construction of complex, multi-dimensional PESs with high accuracy and transferability. The method is scalable and can be extended to larger systems and more complex phenomena, including reaction pathways and intermolecular interactions. The data used in this study are available online, and the work is supported by various funding sources.
The results demonstrate the potential of GDML for accurate and efficient molecular simulations, with applications in chemistry, materials science, and beyond.
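
To make "energy conservation by construction" concrete, the following is a minimal sketch of learning in the gradient domain, not the authors' implementation. It fits a kernel model to force labels only, using the Hessian of a kernel as the force-force covariance, and then checks numerically that the predicted force equals the negative gradient of the associated scalar energy model. The Gaussian kernel, the raw 2D coordinates, the toy potential, and the values of SIGMA and LAM are all illustrative assumptions rather than choices taken from the paper.

```python
import numpy as np

SIGMA = 1.0   # kernel length scale (assumed for illustration)
LAM = 1e-4    # ridge regularization strength (assumed for illustration)

def toy_energy(x):
    """Hypothetical 2D potential standing in for a reference PES."""
    return 0.5 * x[0] ** 2 + x[1] ** 2 + 0.3 * x[0] * x[1]

def toy_force(x):
    """Analytic force F = -grad E of the toy potential."""
    return -np.array([x[0] + 0.3 * x[1], 2.0 * x[1] + 0.3 * x[0]])

def hessian_kernel(x, xp):
    """D x D block d/dx d/dx'^T k(x, x') of a Gaussian kernel k."""
    d = x - xp
    k = np.exp(-d @ d / (2 * SIGMA ** 2))
    return (np.eye(len(x)) / SIGMA ** 2 - np.outer(d, d) / SIGMA ** 4) * k

def fit(X, F):
    """Solve (K + LAM * I) alpha = F for the stacked force labels."""
    n, dim = X.shape
    K = np.zeros((n * dim, n * dim))
    for i in range(n):
        for j in range(n):
            K[i * dim:(i + 1) * dim, j * dim:(j + 1) * dim] = hessian_kernel(X[i], X[j])
    alpha = np.linalg.solve(K + LAM * np.eye(n * dim), F.ravel())
    return alpha.reshape(n, dim)

def predict_force(x, X, alpha):
    """Force model: sum of Hessian-kernel blocks applied to the coefficients."""
    return sum(hessian_kernel(x, xi) @ a for xi, a in zip(X, alpha))

def predict_energy(x, X, alpha, c=0.0):
    """Scalar potential whose negative gradient is exactly predict_force."""
    e = 0.0
    for xi, a in zip(X, alpha):
        d = x - xi
        e -= (a @ d) / SIGMA ** 2 * np.exp(-d @ d / (2 * SIGMA ** 2))
    return e + c

def numerical_gradient(f, x, h=1e-5):
    """Central finite-difference gradient, used only as an independent check."""
    g = np.zeros_like(x)
    for a in range(len(x)):
        step = np.zeros_like(x)
        step[a] = h
        g[a] = (f(x + step) - f(x - step)) / (2 * h)
    return g

rng = np.random.default_rng(0)
X_train = rng.uniform(-1.5, 1.5, size=(40, 2))
F_train = np.array([toy_force(x) for x in X_train])
alpha = fit(X_train, F_train)  # trained on forces only, no energy labels

# The integration constant is the one quantity that needs energy labels.
c = np.mean([toy_energy(x) - predict_energy(x, X_train, alpha) for x in X_train])

X_test = rng.uniform(-1.0, 1.0, size=(20, 2))
force_rmse = np.sqrt(np.mean(
    [(predict_force(x, X_train, alpha) - toy_force(x)) ** 2 for x in X_test]))

# Largest mismatch between the force model and -grad of the energy model.
conservation_gap = max(
    np.max(np.abs(predict_force(x, X_train, alpha)
                  + numerical_gradient(lambda y: predict_energy(y, X_train, alpha, c), x)))
    for x in X_test)

print("force RMSE on test points:", force_rmse)
print("max |F_pred + grad E_pred|:", conservation_gap)
```

Because the force model is the analytic derivative of a single scalar function, the second printed value should be limited only by finite-difference error, regardless of how well the model fits. A model that predicts each force component independently would offer no such guarantee.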