2003 | Eriksson, L, Jaworska, J, Worth, AP, Cronin, MTD, McDowell, RM and Gramatica, P
This article discusses methods for assessing the reliability, uncertainty, and applicability of classification- and regression-based quantitative structure-activity relationship (QSAR) models. It emphasizes the importance of defining the applicability domain of QSAR models and estimating parameter and prediction uncertainty. The article also discusses QSAR acceptability criteria and highlights the need for rigorous and independent validation of QSAR models for regulatory acceptance.
QSAR models are mathematical models that approximate the complex relationships between chemical properties and biological activities. They are used to predict the biological activity of untested compounds and to identify chemical properties that influence biological activity. QSAR models can be classified into regression-based models, such as multiple linear regression (MLR) and partial least squares (PLS), and classification-based models, such as discriminant analysis and decision trees.
The article discusses the importance of data preprocessing techniques, such as scaling and centering, to ensure that all variables have equal influence on the model. It also highlights the use of diagnostic tools, such as normal probability plots, to identify outliers and assess model performance. The article emphasizes the need for a representative training set that spans the chemical domain of interest and includes a broad and stable set of descriptors.
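As a minimal sketch of the centering and scaling step, the following Python snippet mean-centers each descriptor column and scales it to unit variance (often called autoscaling). The descriptor table and the function name `autoscale` are hypothetical illustrations, not taken from the article.

```python
from statistics import mean, stdev


def autoscale(rows):
    """Mean-center each descriptor column and scale it to unit variance.

    Assumes every column varies; a constant column has zero standard
    deviation and should be dropped before calling this function.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]  # sample standard deviation (n - 1)
    scaled = [
        [(value - m) / s for value, m, s in zip(row, means, sds)]
        for row in rows
    ]
    return scaled, means, sds


# Hypothetical descriptor table: rows are compounds,
# columns are (logP, molecular weight).
descriptors = [
    [1.2, 150.0],
    [2.5, 310.0],
    [0.7, 98.0],
    [3.1, 420.0],
]

scaled, means, sds = autoscale(descriptors)
```

After scaling, molecular weight no longer dominates logP merely because it is measured on a larger numerical scale. The stored `means` and `sds` would be reused to transform any new compound before prediction.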
The article stresses that both the biological response variable and the chemical descriptors need careful consideration in QSAR modeling. It highlights the need for reliable, high-quality biological data and for understanding the range of validity of the QSAR model. It also covers multivariate projection methods, such as principal component analysis (PCA), alongside other modeling approaches.
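One simple way to make a model's range of validity concrete is a descriptor-range check: a query compound counts as inside the domain only if each of its descriptors lies within the minimum and maximum seen in the training set. The Python sketch below illustrates that idea under stated assumptions. It is not necessarily the approach the article recommends, and the data and function names are hypothetical.

```python
def descriptor_ranges(training_rows):
    """Return the (min, max) range of each descriptor column in the training set."""
    columns = list(zip(*training_rows))
    return [(min(col), max(col)) for col in columns]


def in_domain(query, ranges):
    """True if every descriptor of the query lies within the training range."""
    return all(low <= value <= high for value, (low, high) in zip(query, ranges))


# Hypothetical training descriptors: (logP, molecular weight) per compound.
training = [
    [1.2, 150.0],
    [2.5, 310.0],
    [0.7, 98.0],
    [3.1, 420.0],
]

ranges = descriptor_ranges(training)
inside = in_domain([2.0, 200.0], ranges)   # both descriptors within range
outside = in_domain([5.4, 200.0], ranges)  # logP exceeds the training maximum
```

A range check of this kind ignores correlations between descriptors, so a compound can pass it while still lying in an empty region of descriptor space. Distance-based measures in a projected space such as PCA scores are one way to address that limitation.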
The article concludes with a discussion on the importance of predictive validation, including external validation and cross-validation, to assess the predictive power of QSAR models. It emphasizes the need for rigorous validation methods to ensure the reliability and accuracy of QSAR models for regulatory acceptance. The article also highlights the importance of using appropriate statistical methods and diagnostic tools to assess the reliability and uncertainty of QSAR models.
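To illustrate cross-validation, the sketch below computes a leave-one-out cross-validated Q² for a one-descriptor linear regression in plain Python. It assumes the common definition Q² = 1 - PRESS / SS_tot, with SS_tot taken around the mean of all responses; other variants exist. The data and function names are hypothetical and serve only as an example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def loo_q2(xs, ys):
    """Leave-one-out cross-validated Q2 = 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    y_bar = mean(ys)
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical data: one descriptor (x) and a measured activity (y).
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]

q2 = loo_q2(x, y)
```

External validation follows the same pattern, except that the held-out compounds form a separate test set that plays no part in model fitting or descriptor selection.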