Deep Neural Nets as a Method for Quantitative Structure–Activity Relationships

December 17, 2014 | Junshui Ma, Robert P. Sheridan, Andy Liaw, George E. Dahl, and Vladimir Svetnik
Deep neural networks (DNNs) have shown superior performance in quantitative structure–activity relationship (QSAR) predictions compared to random forests (RF). This study demonstrates that DNNs can consistently outperform RF on a diverse set of large QSAR data sets from Merck's drug discovery efforts. Although DNNs have a large number of adjustable parameters, the results indicate that a single set of parameter values can achieve better performance than RF for most data sets. The study also shows that, despite their computational intensity, DNNs can be trained efficiently using graphics processing units (GPUs).

The paper evaluates DNNs against RF on 15 QSAR data sets; DNNs outperform RF on 11 of the 15. It further investigates the impact of various DNN parameters, including network architecture, activation function, and dropout rate, finding that ReLU activation functions generally outperform sigmoid functions and that DNNs with four hidden layers perform best. The study also explores joint DNNs trained on multiple data sets simultaneously, which improve predictive performance for some tasks, though not all.

The paper recommends a specific set of DNN parameters for use in industrial drug discovery environments, including logarithmic data preprocessing, four hidden layers with specific neuron counts, and specific dropout rates. These parameters were tested on both the Kaggle data sets and additional in-house data sets, showing consistent improvements over RF. The study concludes that DNNs are a practical and effective method for QSAR predictions, particularly when combined with appropriate parameter settings and GPU computing resources.
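To make the recommended architecture concrete, the sketch below implements the forward pass of a fully connected regression network with four ReLU hidden layers and inverted dropout, in plain NumPy. The layer widths and the 25% dropout rate are illustrative placeholders, not the paper's reported values (the summary above does not give the exact neuron counts or rates), and the descriptor dimensionality and batch are randomly generated.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # ReLU activation: the study finds it generally outperforms sigmoid
    return np.maximum(0.0, x)

def dropout(x, rate, training):
    # Inverted dropout: zero random units at train time, rescale survivors
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def forward(x, weights, biases, dropout_rates, training=False):
    # Pass through the hidden layers, then a linear regression output
    h = x
    for W, b, rate in zip(weights[:-1], biases[:-1], dropout_rates):
        h = dropout(relu(h @ W + b), rate, training)
    return h @ weights[-1] + biases[-1]  # predicted QSAR activity

# Hypothetical layer sizes: input descriptors, four hidden layers, one output
sizes = [1000, 512, 512, 256, 256, 1]
weights = [rng.normal(0.0, 0.01, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

x = rng.random((8, sizes[0]))  # a batch of 8 compound descriptor vectors
y = forward(x, weights, biases, dropout_rates=[0.25] * 4)
print(y.shape)  # (8, 1)
```

At prediction time (`training=False`) dropout is a no-op, so the same function serves both phases; training such a network would additionally require backpropagation or an autodiff framework, which this sketch omits.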