2024 | Alex K. Chew¹, Matthew Sender², Zachary Kaplan¹, Anand Chandrasekaran¹, Jackson Chief Elk², Andrea R. Browning², H. Shaun Kwak², Mathew D. Halls³ and Mohammad Atif Faiz Afzal²*
This study presents a novel approach for predicting the temperature-dependent viscosity of small organic molecules with physics-informed machine learning (ML) models, using molecular dynamics (MD) simulations to improve model accuracy and interpretability. A dataset of over 4000 experimental viscosities of small organic molecules was curated from the scientific literature and online databases and used to develop quantitative structure–property relationship (QSPR) models, both descriptor-based and graph neural network (GNN) models, that predict temperature-dependent viscosities. Descriptor-based models such as LGBM and graph-based models such as EdgePool both predict these viscosities accurately. Including MD descriptors improves the prediction of experimental viscosities, particularly for small datasets: the descriptors are most useful at low data scales, and their usefulness diminishes at high data scales. Compared with two-dimensional descriptors alone, the overall accuracy gain is slight. Feature importance analysis showed that the intermolecular interactions captured by MD descriptors are the most important features for viscosity prediction. The QSPR models also captured the inverse relationship between viscosity and temperature for six battery-relevant solvents. Overall, the study shows that incorporating MD descriptors into QSPR models improves accuracy for properties that are difficult to predict with physics-based models alone or when limited data is available.
The study contributes to the field of materials science by providing a robust framework for predicting material properties using physics-informed ML models.
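
The inverse relationship between viscosity and temperature mentioned in the summary can be illustrated with a small, self-contained sketch. It does not reproduce the study's QSPR models, which learn the temperature dependence from descriptors and data. Instead it assumes a classical Arrhenius-type form, ln(η) = a + b/T, as a stand-in, and fits it by ordinary least squares. The function names (`fit_arrhenius`, `predict_viscosity`) and all numeric values are hypothetical and synthetic; none come from the study's dataset.

```python
import math


def fit_arrhenius(temps_k, viscosities):
    """Least-squares fit of ln(eta) = a + b / T. Returns (a, b).

    Assumption: an Arrhenius-type form, used here only for illustration.
    """
    xs = [1.0 / t for t in temps_k]          # regress on inverse temperature
    ys = [math.log(v) for v in viscosities]  # and on log viscosity
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b


def predict_viscosity(a, b, temp_k):
    """Evaluate the fitted form at a temperature given in kelvin."""
    return math.exp(a + b / temp_k)


# Synthetic, illustrative inputs generated from known coefficients
# (not experimental data and not values from the study).
A_TRUE, B_TRUE = -4.0, 1500.0
temps = [280.0, 300.0, 320.0, 340.0]
etas = [math.exp(A_TRUE + B_TRUE / t) for t in temps]

a, b = fit_arrhenius(temps, etas)
cold = predict_viscosity(a, b, 290.0)
warm = predict_viscosity(a, b, 350.0)
```

With a positive slope `b`, the fitted curve gives a lower viscosity at the higher temperature (`warm < cold`), which is the qualitative trend the summary reports for the battery-relevant solvents. A data-driven model would typically take temperature as one input feature alongside molecular descriptors rather than fixing a functional form in advance.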