August 29, 2000 | Orly Alter*, Patrick O. Brown*, and David Botstein*
Singular value decomposition (SVD) is used to transform genome-wide expression data from genes × arrays space to reduced diagonalized "eigengenes" × "eigenarrays" space, where eigengenes and eigenarrays are unique orthonormal superpositions of genes and arrays. Normalizing the data by filtering out eigengenes and eigenarrays inferred to represent noise or experimental artifacts enables meaningful comparison of gene expression across different arrays and experiments. Sorting data according to eigengenes and eigenarrays provides a global view of gene expression dynamics, classifying genes and arrays into groups of similar regulation, function, or cellular state. Significant eigengenes and eigenarrays can be associated with regulators' effects or measured samples. SVD is a linear transformation that diagonalizes data, decoupling eigengenes and eigenarrays, and allows data normalization and sorting. The fraction of eigenexpression indicates the significance of each eigengene or eigenarray. The Shannon entropy measures data complexity. SVD provides a mathematical framework for processing and modeling genome-wide expression data, assigning biological meaning to mathematical variables and operations. In the elutriation-synchronized cell cycle study, SVD identified eigengenes representing cell cycle expression oscillations. After normalization, eigengenes and eigenarrays were associated with cell cycle stages and regulatory processes. In the α factor-synchronized cell cycle and CLB2/CLN3 overactivation studies, SVD identified eigengenes and eigenarrays associated with specific cell cycle transitions and regulatory effects. 
SVD provides a useful framework for analyzing genome-wide expression data, enabling the identification of regulatory programs and cellular states.
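The quantities described above (the decomposition into eigengenes and eigenarrays, the fraction of eigenexpression, the Shannon entropy, and normalization by filtering) can be illustrated with a minimal NumPy sketch. It is not the authors' code. The function names `svd_summary` and `filter_eigengenes` are hypothetical, the input is synthetic random data, and two details are assumptions of this sketch rather than statements in the summary: the fraction of eigenexpression is taken as each squared singular value divided by the sum of squared singular values, and the entropy is divided by log(L) so that it falls between 0 and 1.

```python
import numpy as np


def svd_summary(expression):
    """Decompose a genes x arrays matrix (rows = genes, columns = arrays).

    Returns the eigenarrays (columns of u), the eigenexpression levels s,
    the eigengenes (rows of vt), the fraction of eigenexpression for each
    eigengene/eigenarray pair, and a normalized Shannon entropy.
    Requires at least two arrays, so that log(L) is nonzero.
    """
    u, s, vt = np.linalg.svd(expression, full_matrices=False)

    # Assumed definition: share of the overall expression captured by each pair.
    fractions = s**2 / np.sum(s**2)

    # Treat 0 * log(0) as 0 by skipping zero fractions.
    nonzero = fractions[fractions > 0]
    # Assumed normalization by log(L), which bounds the entropy to [0, 1].
    entropy = -np.sum(nonzero * np.log(nonzero)) / np.log(len(s))
    return u, s, vt, fractions, entropy


def filter_eigengenes(u, s, vt, drop):
    """Rebuild the data with the listed eigengene/eigenarray pairs removed."""
    s_filtered = s.copy()
    s_filtered[list(drop)] = 0.0
    # (u * s_filtered) scales each eigenarray by its eigenexpression level.
    return (u * s_filtered) @ vt


# Synthetic stand-in for expression data: 200 genes measured on 8 arrays.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 8))

u, s, vt, fractions, entropy = svd_summary(data)
# Example: drop the first pair, as if it were inferred to be an artifact.
normalized = filter_eigengenes(u, s, vt, drop=[0])
```

An entropy near 0 means a single eigengene captures almost all of the expression, while an entropy near 1 means the expression is spread evenly across all eigengenes. Which pairs to drop is a judgment about the data; the sketch only shows the mechanics of removing them.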