The paper introduces Coupled Two-Way Clustering (CTWC), a method for analyzing gene microarray data. CTWC searches for subsets of genes and subsets of samples that, when used together, produce stable and significant partitions. This matters for microarray data because many biological mechanisms contribute to gene expression levels at once, so a pattern carried by a small group of genes or samples is hard to extract when everything is analyzed together. The authors present an iterative clustering algorithm that searches for such subsets and apply it to two gene microarray datasets, one on colon cancer and one on leukemia.
The CTWC algorithm starts by clustering the full dataset, both genes and samples, and keeps the stable clusters it finds. Each stable cluster is then used in turn: a stable gene cluster serves as the feature set for clustering samples, and a stable sample cluster serves as the set of conditions over which genes are clustered. Any new stable clusters are added to the pool, and the process repeats until an iteration produces no new stable clusters. The output of CTWC is a list of stable gene and sample clusters, along with the relationships between them, that is, which gene and sample subsets produced each cluster.
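The iteration described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the clustering step here is a simple single-linkage grouping, and "stable" is approximated as "the same cluster appears at several distance thresholds". The names `ctwc`, `stable_clusters`, `linkage_clusters`, `min_persist`, and the toy matrix are all hypothetical and chosen for this example only.

```python
from itertools import combinations

def distance(a, b):
    """Root-mean-square difference, so thresholds are comparable across feature counts."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

def linkage_clusters(vectors, threshold):
    """Single-linkage grouping (a stand-in clustering): link items closer than threshold."""
    parent = list(range(len(vectors)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i, j in combinations(range(len(vectors)), 2):
        if distance(vectors[i], vectors[j]) < threshold:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return [frozenset(g) for g in groups.values()]

def stable_clusters(vectors, labels, thresholds, min_size=2, min_persist=2):
    """Assumed stability proxy: a proper subset that recurs at >= min_persist thresholds."""
    counts = {}
    for t in thresholds:
        for c in linkage_clusters(vectors, t):
            counts[c] = counts.get(c, 0) + 1
    return {frozenset(labels[i] for i in c) for c, n in counts.items()
            if n >= min_persist and min_size <= len(c) < len(vectors)}

def ctwc(matrix, thresholds, max_rounds=5):
    """matrix[g][s] is the expression of gene g in sample s."""
    genes = {frozenset(range(len(matrix)))}        # start from the full gene set
    samples = {frozenset(range(len(matrix[0])))}   # and the full sample set
    done, origin = set(), {}                       # origin records which pair produced a cluster
    for _ in range(max_rounds):
        new_genes, new_samples = set(), set()
        for g in genes:
            for s in samples:
                if (g, s) in done:
                    continue
                done.add((g, s))
                gl, sl = sorted(g), sorted(s)
                # Cluster the genes in g, using the samples in s as features.
                found = stable_clusters([[matrix[i][j] for j in sl] for i in gl], gl, thresholds)
                for c in found:
                    origin.setdefault(("gene", c), (g, s))
                new_genes |= found
                # Cluster the samples in s, using the genes in g as features.
                found = stable_clusters([[matrix[i][j] for i in gl] for j in sl], sl, thresholds)
                for c in found:
                    origin.setdefault(("sample", c), (g, s))
                new_samples |= found
        if new_genes <= genes and new_samples <= samples:
            break                                  # nothing new: stop iterating
        genes |= new_genes
        samples |= new_samples
    return genes, samples, origin

# Toy data: genes 0-1 split the samples one way, genes 2-3 split them another way.
toy = [[0, 0, 10, 10],
       [0, 0, 10, 10],
       [0, 10, 0, 10],
       [0, 10, 0, 10]]
genes, samples, origin = ctwc(toy, thresholds=[1, 3, 5])
```

In this toy matrix, clustering the samples over all four genes yields no sample cluster under these thresholds, because the two competing signals keep every pair of samples apart. The sample partitions only appear once each stable gene cluster is used on its own as the feature set, which is the effect the coupled iteration is designed to expose.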
The authors demonstrate CTWC on both datasets. For the leukemia dataset, CTWC identifies gene clusters that distinguish acute lymphoblastic leukemia (ALL) patients from acute myeloid leukemia (AML) patients, as well as gene clusters that sub-classify ALL patients based on treatment outcomes. For the colon cancer dataset, CTWC reveals a stable partition of the samples into two clusters that reflects different experimental protocols, and it identifies conditionally correlated genes related to cell growth and epithelial cells.
The paper highlights the strength of CTWC in uncovering partitions and correlations that are not apparent when the full dataset is clustered at once. It also discusses biological interpretations of some of the identified partitions, such as the connection between T-cell-related genes and the sub-classification of ALL patients, and the role of cell growth genes in colon cancer. The authors conclude that CTWC is a powerful tool for gene microarray data analysis, capable of revealing new insights into cellular processes and biological mechanisms.