Exploring Expression Data: Identification and Analysis of Coexpressed Genes

1999 | Laurie J. Heyer, Semyon Kruglyak, and Shibu Yooseph
The article presents a systematic approach to analyzing gene expression data, focusing on yeast cell cycle data. The authors describe a set of analytical tools and their application to identify and analyze coexpressed genes. The key components of their approach are:

1. **Similarity Measure**: A new measure called jackknife correlation is introduced to reduce false positives by robustly handling outliers.
2. **Clustering Algorithm**: A specialized clustering algorithm, QT_Clust, is developed to find large clusters with a quality guarantee, ensuring that all members within a cluster are coexpressed.
3. **Interactive Graphical Analysis**: An interactive tool allows users to validate and refine clusters, providing a more detailed analysis of specific gene patterns.

The study uses the 17-time-point mitotic cell cycle data from yeast to demonstrate these methods. The authors filter the data to remove unreliable entries and scale the expression levels to have mean zero and variance one. They then apply the jackknife correlation measure to identify coexpressed genes and use the QT_Clust algorithm to form clusters. The clusters are further analyzed using an interactive approach, which allows users to build and refine clusters based on specific gene patterns or interests.

The article concludes by highlighting the advantages of the methods, including the ability to handle outliers, the robustness of the clustering algorithm, and the interactive nature of the analysis. These tools can be applied to other gene expression datasets to uncover more information about gene function and regulation.
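The scaling, similarity, and clustering steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that jackknife correlation is the minimum of the Pearson correlation over the full profiles and over every version with one time point left out, and that a cluster's quality is measured by its diameter, taken here as one minus the lowest pairwise jackknife correlation among its members. The function names (`standardize`, `jackknife_correlation`, `qt_clust`), the greedy growth strategy, the threshold of 0.3, and the toy profiles are all illustrative choices, not details taken from the summary.

```python
import numpy as np


def standardize(x):
    """Scale each gene (row) to mean zero and variance one."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)


def jackknife_correlation(a, b):
    """Smallest Pearson correlation over the full profiles and over
    every leave-one-time-point-out version (assumed definition)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    values = [np.corrcoef(a, b)[0, 1]]
    for k in range(len(a)):
        keep = np.arange(len(a)) != k  # drop time point k
        values.append(np.corrcoef(a[keep], b[keep])[0, 1])
    return min(values)


def qt_clust(data, max_diameter):
    """Greedy quality-threshold clustering sketch.

    Grows one candidate cluster per seed gene, always adding the gene
    that keeps the diameter smallest, and stops when the diameter would
    exceed max_diameter. The largest candidate is kept, its genes are
    removed, and the process repeats on the remaining genes.
    """
    n = len(data)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - jackknife_correlation(data[i], data[j])
            dist[i, j] = dist[j, i] = d

    unassigned = set(range(n))
    clusters = []
    while unassigned:
        best = []
        for seed in sorted(unassigned):
            members = [seed]
            candidates = unassigned - {seed}
            while candidates:
                # The new diameter is driven by the farthest current member.
                nxt = min(sorted(candidates),
                          key=lambda g: dist[g, members].max())
                if dist[nxt, members].max() > max_diameter:
                    break
                members.append(nxt)
                candidates.remove(nxt)
            if len(members) > len(best):
                best = members
        clusters.append(sorted(best))
        unassigned -= set(best)
    return clusters


# Two profiles that agree only because of a shared spike at the last point.
a = np.array([1, 2, 1, 2, 1, 10.0])
b = np.array([2, 1, 2, 1, 2, 10.0])
print(np.corrcoef(a, b)[0, 1], jackknife_correlation(a, b))

# Synthetic 17-point profiles: two coexpressed pairs and one unrelated gene.
t = np.linspace(0, 2 * np.pi, 17, endpoint=False)
raw = np.array([np.sin(t), 2 * np.sin(t) + 1,
                np.cos(t), 3 * np.cos(t) - 2,
                np.sin(2 * t)])
print(qt_clust(standardize(raw), max_diameter=0.3))
```

In the first toy example the ordinary correlation is roughly 0.96, but removing the shared spike reveals that the two profiles are anticorrelated, so the jackknife value falls to about -1 and the pair would not be reported as coexpressed. In the second, the two sine-based and the two cosine-based profiles form separate clusters and the remaining profile is left on its own. Because every pair inside a cluster must satisfy the diameter bound, this sketch favors tight clusters over complete coverage; the pairwise distance matrix makes it quadratic in the number of genes, so it is suited to illustration rather than to a full genome-scale dataset.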