Received: March 4, 2024. Revised: May 20, 2024. Accepted: June 27, 2024 | Bowen Li1,1, Zhen Wang1,2,1, Ziqi Liu1,3, Yanxin Tao1, Chulin Sha1, Min He1,2, Xiaolin Li1,4,*
The paper introduces DrugMetric, an unsupervised learning framework that combines variational autoencoders (VAEs) and Gaussian mixture models (GMMs) to assess drug-likeness based on distance in chemical space. The framework addresses limitations of traditional methods such as the Quantitative Estimate of Drug-likeness (QED), with the aim of evaluating molecular properties more accurately and comprehensively. DrugMetric uses a dataset of potential drug candidates together with non-drug datasets to delineate chemical space distributions and assign drug-likeness scores. It also incorporates ensemble learning techniques to improve predictive accuracy.
Experimental results show that DrugMetric outperforms QED in several tasks, including drug-likeness scoring and classification, with AUC values of 0.83, 0.94, and 0.99 in three classification tasks. In addition, DrugMetric's scores correlate well with hepatic microsomal stability data, suggesting potential for predicting drug metabolism and pharmacokinetic properties. The authors have made the code publicly available to facilitate further research and application in drug discovery.
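To make the scoring and evaluation ideas concrete, the following is a minimal sketch, not the authors' implementation. It assumes molecules have already been mapped to latent vectors (for example by a VAE encoder), scores each vector by its log-likelihood under a diagonal-covariance Gaussian mixture, and computes an AUC from the resulting scores. The mixture parameters, the toy 2-D latent vectors, and the function names are all hypothetical and chosen only for illustration; the actual DrugMetric scoring rule may differ.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of x under a mixture, using log-sum-exp for stability."""
    terms = [
        math.log(w) + diag_gaussian_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical mixture standing in for a GMM fitted on drug latent vectors.
WEIGHTS = [0.5, 0.5]
MEANS = [(0.0, 0.0), (1.0, 1.0)]
VARIANCES = [(1.0, 1.0), (1.0, 1.0)]

# Toy latent vectors (assumed, not real data).
drug_latents = [(0.1, -0.2), (0.9, 1.2), (0.5, 0.5)]
non_drug_latents = [(4.0, 4.0), (-3.0, 5.0), (5.0, -2.0)]

drug_scores = [gmm_log_likelihood(z, WEIGHTS, MEANS, VARIANCES) for z in drug_latents]
non_drug_scores = [gmm_log_likelihood(z, WEIGHTS, MEANS, VARIANCES) for z in non_drug_latents]

auc = roc_auc(drug_scores, non_drug_scores)
print(f"AUC on toy data: {auc:.2f}")
```

In this toy setup the drug-like vectors lie near the mixture components and the non-drug vectors lie far away, so the scores separate the two groups cleanly. On real data the separation would depend on the learned latent space and the fitted mixture.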