SolPredictor: Predicting Solubility with Residual Gated Graph Neural Network

2024 | Waqar Ahmad, Hilal Tayara, HyunJoo Shim, Kil To Chong
SolPredictor is a computational model that predicts molecular solubility with a residual gated graph neural network (RGNN). It takes simplified molecular-input line-entry system (SMILES) representations as input and leverages graph-structured data to capture long-range dependencies while preserving essential features. The model was evaluated with ten-fold cross-validation and on five independent datasets. In cross-validation, it achieved a Pearson correlation coefficient (R²) of 0.79 ± 0.02 and a root mean square error (RMSE) of 1.03 ± 0.04.
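The two reported metrics can be illustrated with a minimal sketch. It assumes the R² value is the squared Pearson correlation between measured and predicted solubility, which the summary suggests but does not confirm. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

# Hypothetical log-solubility values, for illustration only.
measured = [-1.0, -2.0, -3.0, -4.0]
predicted = [-1.5, -1.5, -3.5, -3.5]

error = rmse(measured, predicted)               # 0.5
r_squared = pearson_r(measured, predicted) ** 2  # approximately 0.8
```

In a ten-fold setting, these functions would be applied to each held-out fold, and the mean and standard deviation across folds would give figures in the "0.79 ± 0.02" style.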
The RGNN's residual connections let information flow across layers, so the model captures both local and global dependencies in molecular structures. On the five test datasets, it achieved R² values of 0.547, 0.814, 0.805, 0.373, and 0.677, with RMSE values of 0.597, 0.743, 0.783, 0.991, and 1.142, respectively. Error analysis, hyperparameter optimization, and model explainability were used to identify the molecular features most important for prediction. The model was further validated through a web server that accepts SMILES input and returns solubility predictions. The model demonstrates high accuracy in predicting solubility, a property that is crucial for drug discovery because it influences bioavailability, formulation design, and pharmacokinetic parameters. The study highlights the effectiveness of RGNNs for solubility prediction and their potential to improve drug development processes.
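To make the role of gating and residual connections concrete, here is a minimal pure-Python sketch of one residual gated message-passing step. It follows the general residual gated graph convolution idea, in which a sigmoid gate scales each neighbor message and the layer input is added back to the output. It is not the SolPredictor architecture itself, whose exact layer definition is not given in this summary. The function names, weight matrices, and toy graph are all hypothetical.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def residual_gated_layer(h, neighbors, W_self, W_nbr, W_gate_i, W_gate_j):
    """One residual gated message-passing step (illustrative only).

    h         : list of node feature vectors
    neighbors : dict mapping node index -> list of neighbor indices
    """
    out = []
    for i, h_i in enumerate(h):
        update = matvec(W_self, h_i)      # transform of the node itself
        gate_i = matvec(W_gate_i, h_i)
        for j in neighbors[i]:
            gate_j = matvec(W_gate_j, h[j])
            msg = matvec(W_nbr, h[j])     # message from neighbor j
            for k in range(len(update)):
                eta = sigmoid(gate_i[k] + gate_j[k])  # per-feature gate in (0, 1)
                update[k] += eta * msg[k]
        # ReLU on the update, then the residual (skip) connection adds h_i back.
        out.append([x + max(0.0, u) for x, u in zip(h_i, update)])
    return out

# Toy three-atom chain (0-1-2) with 2-dimensional features; all values are made up.
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]    # zero gate weights give eta = 0.5 everywhere
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
chain = {0: [1], 1: [0, 2], 2: [1]}

updated = residual_gated_layer(features, chain, identity, identity, zeros, zeros)
# updated == [[2.0, 0.5], [1.0, 2.5], [2.0, 2.5]]
```

Because the input features are added back after each step, stacking several such layers lets information from distant atoms accumulate without overwriting each atom's own features, which is the behavior the summary attributes to the residual connections.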