Chapter 16: Partial Least Squares
1 Summary
1.1 Background: Traditional statistical tests struggle with large numbers of variables. Add-up scores are a simple method to reduce variables, but they do not account for variable importance, interactions, or unit differences. Principal components analysis (PCA) and partial least squares (PLS) account for these, but are rarely used in clinical trials.
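The unit-difference problem with add-up scores can be illustrated with a minimal sketch. The two predictors below are hypothetical and are not taken from the chapter's data; the sketch only assumes that one variable is measured on a much larger scale than the other.

```python
import random
import statistics

random.seed(0)

# Hypothetical predictors on very different scales (not the chapter's data).
n = 200
small = [random.gauss(0.0, 1.0) for _ in range(n)]    # standard deviation about 1
large = [random.gauss(0.0, 100.0) for _ in range(n)]  # standard deviation about 100


def zscores(values):
    """Standardize a list to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def pearson(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    za, zb = zscores(a), zscores(b)
    return sum(x * y for x, y in zip(za, zb)) / (len(a) - 1)


# Plain add-up score: the raw values are simply summed.
raw_sum = [s + l for s, l in zip(small, large)]

# Add-up score after standardization: each variable contributes equally.
std_sum = [s + l for s, l in zip(zscores(small), zscores(large))]

r_raw_small = pearson(raw_sum, small)
r_std_small = pearson(std_sum, small)

print(f"correlation of raw add-up score with small-scale variable: {r_raw_small:.2f}")
print(f"correlation of standardized score with small-scale variable: {r_std_small:.2f}")
```

In the raw sum the small-scale variable contributes almost nothing, whereas after standardization both variables carry comparable weight. PCA and PLS go further by estimating a separate weight for each variable.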
1.2 Objective: To evaluate the performance of PCA and PLS.
1.3 Methods: A simulated example with 250 patients' gene expression data as predictors and drug efficacy scores as outcomes was used. PCA was performed using SPSS's Data Dimension Reduction module, while PLS was performed using R's Partial Least Squares package.
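As a rough illustration of how the two methods construct a new predictor, the following sketch computes a first PCA component and a first PLS component on freshly simulated data. It does not reproduce the SPSS or R analyses described above: the six predictors, the two hidden factors, and all effect sizes are assumptions chosen so that the difference is visible, and the predictors are centered but not standardized.

```python
import math
import random

random.seed(1)

n, p = 250, 6  # 250 simulated patients, 6 hypothetical predictors

# Two hidden factors: 'signal' drives the outcome, 'nuisance' does not.
rows, y = [], []
for _ in range(n):
    signal = random.gauss(0, 1)
    nuisance = random.gauss(0, 2)  # larger variance, unrelated to the outcome
    x = ([signal + random.gauss(0, 0.5) for _ in range(3)]
         + [nuisance + random.gauss(0, 0.5) for _ in range(3)])
    rows.append(x)
    y.append(2.0 * signal + random.gauss(0, 0.5))


def center(values):
    """Subtract the mean from every element."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


cols = [center([r[j] for r in rows]) for j in range(p)]  # centered predictors
yc = center(y)                                           # centered outcome

# PLS (single outcome), first component: weights proportional to X^T y.
w_pls = normalize([sum(a * b for a, b in zip(col, yc)) for col in cols])

# PCA, first component: leading eigenvector of X^T X via power iteration.
xtx = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(p)]
       for i in range(p)]
w_pca = normalize([1.0] * p)
for _ in range(200):
    w_pca = normalize([sum(xtx[i][j] * w_pca[j] for j in range(p))
                       for i in range(p)])


def scores(w):
    """Project every patient onto the weight vector w."""
    return [sum(w[j] * cols[j][i] for j in range(p)) for i in range(n)]


def corr(a, b):
    """Pearson correlation between two lists."""
    a, b = center(a), center(b)
    return sum(u * v for u, v in zip(a, b)) / math.sqrt(
        sum(u * u for u in a) * sum(v * v for v in b))


r_pls = corr(scores(w_pls), yc)
r_pca = corr(scores(w_pca), yc)  # sign of a PCA component is arbitrary

print(f"first PLS component vs outcome: r = {r_pls:.2f}")
print(f"first PCA component vs outcome: r = {r_pca:.2f}")
```

The first PCA component follows the direction of largest predictor variance whether or not it relates to the outcome, while the PLS weights are built from the covariance between each predictor and the outcome. This is a deliberately contrived case and says nothing about which method performed better in the chapter's example.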
1.4 Results: From 27 variables, three novel predictors were constructed. With PCA, all three were significant predictors of the outcome, with t-values of 10.2, 21.6, and 6.7 (all p < 0.001). PLS also predicted the outcome, with lower significance (t-values of 6.8, 16.2, and 3.5; all p < 0.001). Traditional multiple linear regression with add-up scores as independent variables showed further reduced significance (t-values of 3.4, 11.2, and 2.4; p < 0.002, p < 0.001, and p < 0.02, respectively).
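For readers unfamiliar with how such t-values arise, the sketch below computes the t-value of a regression slope as the estimate divided by its standard error. It uses a single hypothetical predictor and simulated data, which is a simplification: the analyses reported above used three predictors in one model, and none of the numbers below correspond to them.

```python
import math
import random

random.seed(2)

# Hypothetical data: one constructed predictor and one outcome.
n = 250
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.5 * xi + random.gauss(0, 1) for xi in x]

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Ordinary least squares estimates for y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual variance uses n - 2 degrees of freedom (two estimated parameters).
rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
se_slope = math.sqrt(rss / (n - 2) / sxx)

# t-value: how many standard errors the slope lies from zero.
t_value = slope / se_slope

print(f"slope = {slope:.3f}, standard error = {se_slope:.3f}, t = {t_value:.1f}")
```

A larger absolute t-value means the slope is estimated more precisely relative to its size, which is the sense in which the t-values of the three methods are compared above.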
1.5 Conclusions: PCA and PLS can handle more variables than standard methods like MANOVA and MANCOVA, and are more sensitive than add-up scores. They account for variable importance, interactions, and unit differences. They are flexible, allowing manifest variables to be used twice. PLS is more parsimonious than PCA as it can include outcome variables in the model.