Coupled two-way clustering analysis of gene microarray data.

This paper presents Coupled Two-Way Clustering (CTWC), a novel approach to analyzing gene microarray data. CTWC identifies subsets of genes and samples such that clustering one subset using the other produces stable and significant partitions. The approach is particularly suitable for gene microarray data, where multiple biological mechanisms influence gene expression levels.

The CTWC algorithm uses an iterative clustering process to find pairs of gene and sample subsets that produce stable clusters. It can be used with any clustering algorithm that is able to identify stable clusters; the paper focuses on the super-paramagnetic clustering (SPC) algorithm, which is robust against noise and effective for gene microarray data.

The method was applied to two gene microarray datasets, one from a colon cancer experiment and one from a leukemia experiment. By identifying relevant subsets of the data and focusing on them, it revealed partitions and correlations that were hidden when the full dataset was used. Some of these partitions have clear biological interpretations, while others may suggest new research directions. In the leukemia dataset, CTWC identified gene clusters that distinguish acute lymphoblastic leukemia (ALL) patients from acute myeloid leukemia (AML) patients. It also identified sub-partitions of the AML patients based on treatment outcomes, and found gene clusters that separate T-ALL from B-ALL patients. In the colon cancer dataset, CTWC identified gene clusters that distinguish normal from tumor samples and revealed sub-partitions based on experimental protocols.

The output of CTWC is a broad list of stable gene and sample clusters, along with the connections between them. This information can be used to identify cellular processes, to establish connections between gene groups and biological processes, and to find partitions of known classes of samples into sub-groups. The results demonstrate the effectiveness of CTWC in analyzing gene microarray data and suggest that it may be useful for other types of data analysis as well.
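
The iterative process described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. In place of SPC it uses a simple stand-in: single-linkage grouping at several distance thresholds, where a group counts as "stable" if it persists unchanged over consecutive thresholds. The names `stable_clusters` and `ctwc`, the parameters `thresholds`, `min_size`, `min_persist` and `max_rounds`, and the toy matrix are all assumptions made for illustration.

```python
from itertools import combinations, product
from math import dist, sqrt


def stable_clusters(vectors, thresholds, min_size=2, min_persist=2):
    """Stand-in for SPC (an assumption, not the paper's algorithm).

    Returns groups of keys that stay intact as single-linkage components
    over at least `min_persist` consecutive thresholds.
    """
    keys = list(vectors)
    streak, found = {}, set()
    for t in thresholds:
        parent = {k: k for k in keys}

        def find(k):
            # Union-find root lookup with path halving.
            while parent[k] != k:
                parent[k] = parent[parent[k]]
                k = parent[k]
            return k

        for a, b in combinations(keys, 2):
            # Root-mean-square difference, so the scale does not depend
            # on how many features the current subset provides.
            if dist(vectors[a], vectors[b]) / sqrt(len(vectors[a])) <= t:
                parent[find(a)] = find(b)

        groups = {}
        for k in keys:
            groups.setdefault(find(k), set()).add(k)
        current = {frozenset(g) for g in groups.values()}
        # Count how many consecutive thresholds each group has survived.
        streak = {g: streak.get(g, 0) + 1 for g in current}
        found |= {g for g, n in streak.items()
                  if n >= min_persist and min_size <= len(g) < len(keys)}
    return found


def ctwc(matrix, thresholds, max_rounds=5):
    """Coupled two-way loop over a genes-by-samples matrix (list of rows)."""
    gene_sets = {frozenset(range(len(matrix)))}
    sample_sets = {frozenset(range(len(matrix[0])))}
    done = set()
    for _ in range(max_rounds):
        pairs = [p for p in product(gene_sets, sample_sets) if p not in done]
        if not pairs:
            break  # no unexplored (gene subset, sample subset) pair remains
        for genes, samples in pairs:
            done.add((genes, samples))
            rows, cols = sorted(genes), sorted(samples)
            # Cluster the genes using only the chosen samples as features...
            gene_vecs = {g: [matrix[g][s] for s in cols] for g in rows}
            # ...and the samples using only the chosen genes as features.
            sample_vecs = {s: [matrix[g][s] for g in rows] for s in cols}
            gene_sets |= stable_clusters(gene_vecs, thresholds)
            sample_sets |= stable_clusters(sample_vecs, thresholds)
    return gene_sets, sample_sets


# Toy data (hypothetical): genes 0-1 split the samples one way,
# genes 2-3 split them another way.
toy = [
    [0, 0, 5, 5],
    [0, 0, 5, 5],
    [0, 5, 0, 5],
    [0, 5, 0, 5],
]
gene_sets, sample_sets = ctwc(toy, thresholds=[0.5, 1.0, 1.5])
```

In this toy case, clustering the samples over all four genes yields no stable sample cluster, because the two gene groups pull the samples in different directions. Once the gene clusters {0, 1} and {2, 3} have been found, each is used on its own to cluster the samples and yields a different partition, which mirrors the summary's point that focusing on relevant subsets reveals structure hidden in the full dataset.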

Coupled Two-Way Clustering Analysis of Gene Microarray Data

August 2, 2018 | G. Getz, E. Levine and E. Domany