(2024) 16:31 | Alex K. Chew, Matthew Sender, Zachary Kaplan, Anand Chandrasekaran, Jackson Chief Elk, Andrea R. Browning, H. Shaun Kwak, Mathew D. Halls and Mohammad Atif Faiz Afzal
This study addresses the challenge of accurately predicting material properties, particularly viscosity, using physics-informed machine learning models. The authors curated a dataset of viscosities for over 4000 small organic molecules and developed quantitative structure-property relationship (QSPR) models to predict temperature-dependent viscosity. Both descriptor-based and graph neural network (GNN) models were evaluated; the light gradient-boosting machine (LGBM) performed best among the descriptor-based models and EdgePool performed best among the GNNs. Including molecular dynamics (MD) descriptors, which capture intermolecular interactions, slightly improved model accuracy, especially for small datasets. Feature importance analysis showed that the MD-derived heat of vaporization was the most significant descriptor for viscosity prediction. The models accurately captured the inverse relationship between viscosity and temperature for six battery-relevant solvents, demonstrating their utility in high-throughput screening for material design. The study highlights that incorporating MD descriptors into QSPR models improves prediction accuracy for properties that are difficult to predict with physics-based models alone or with limited data.
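The inverse relationship between viscosity and temperature noted above is often summarised with an Andrade (Arrhenius-type) form, ln η = A + B/T. The following is a minimal sketch of fitting that form by least squares; it is not the study's QSPR method. The functional form, the function names `fit_andrade` and `predict_viscosity`, the coefficients, and the data points are all illustrative assumptions and are not taken from the paper.

```python
import math


def fit_andrade(temps_k, viscosities):
    """Least-squares fit of ln(eta) = a + b / T. Returns (a, b).

    temps_k: temperatures in kelvin.
    viscosities: positive viscosity values in any consistent unit.
    """
    xs = [1.0 / t for t in temps_k]           # regress against inverse temperature
    ys = [math.log(v) for v in viscosities]   # the model is linear in ln(eta)
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b


def predict_viscosity(a, b, temp_k):
    """Evaluate the fitted Andrade form at one temperature (kelvin)."""
    return math.exp(a + b / temp_k)


# Synthetic data generated from assumed coefficients (NOT measured values).
TRUE_A, TRUE_B = -4.0, 1200.0
temps = [273.15, 298.15, 323.15, 348.15]
etas = [math.exp(TRUE_A + TRUE_B / t) for t in temps]

a, b = fit_andrade(temps, etas)
```

With a positive fitted `b`, `predict_viscosity` decreases as temperature rises, which is the trend the abstract describes for the battery-relevant solvents. A data-driven model like those in the study would instead learn this dependence from molecular descriptors together with temperature.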