March 2012 | Volume 7 | Issue 3 | e31929 | Won-Min Song, T. Di Matteo, Tomaso Aste
The paper introduces a novel graph-theoretic approach, the DBHT technique, to extract clusters and hierarchies in complex datasets without the need for prior information or expert supervision. The method involves building topologically embedded networks from a similarity measure, which are then analyzed to identify both intra-cluster and inter-cluster hierarchies. The technique is demonstrated to outperform other established methods, such as k-means++, spectral clustering, and Q-cut, in various artificial and real datasets. Specifically, the DBHT technique is applied to gene expression data from lymphoma samples, revealing biologically significant groups of genes that play key roles in diagnosis, prognosis, and treatment. The results show that the DBHT technique can accurately differentiate between different cancer subtypes and identify genes with significant regulatory functions, providing a new perspective on the classification and understanding of lymphoid malignancies.
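
This summary does not say how the embedded network is built from the similarity measure. As a rough illustration of the general idea of filtering a dense similarity matrix into a sparse planar graph, the sketch below seeds a tetrahedron and then inserts each remaining item into the triangular face where it adds the most similarity. It is a simplified stand-in, chosen because it needs no planarity test, and should not be read as the paper's procedure or as the DBHT clustering step. The function name `planar_filtered_graph`, the seed rule, and the toy matrix are all assumptions made for the example.

```python
from itertools import combinations


def planar_filtered_graph(sim):
    """Greedily build a triangulated planar graph from a symmetric similarity matrix.

    Hypothetical helper for illustration only, not the paper's algorithm.
    Returns a set of undirected edges (i, j) with i < j.
    """
    n = len(sim)
    if n < 4:
        raise ValueError("need at least 4 items to seed a tetrahedron")

    # Seed rule (an assumption): the 4 items with the largest total similarity.
    strength = [sum(row) - row[i] for i, row in enumerate(sim)]
    seed = sorted(range(n), key=lambda i: strength[i], reverse=True)[:4]

    # A tetrahedron is planar: 6 edges and 4 triangular faces.
    edges = {tuple(sorted(pair)) for pair in combinations(seed, 2)}
    faces = [tuple(face) for face in combinations(seed, 3)]
    remaining = set(range(n)) - set(seed)

    while remaining:
        # Choose the (item, face) pair with the largest summed similarity.
        best_v, best_f = max(
            ((v, f) for v in remaining for f in faces),
            key=lambda vf: sum(sim[vf[0]][u] for u in vf[1]),
        )
        a, b, c = best_f
        # Placing the item inside the face keeps the graph planar:
        # it adds 3 edges and splits one face into three.
        edges.update(tuple(sorted((best_v, u))) for u in best_f)
        faces.remove(best_f)
        faces.extend([(a, b, best_v), (a, c, best_v), (b, c, best_v)])
        remaining.remove(best_v)

    return edges


# Toy similarity matrix (made up): items 0-2 and items 3-5 form two tight groups.
sim = [
    [1.0, 0.9, 0.8, 0.1, 0.2, 0.1],
    [0.9, 1.0, 0.7, 0.2, 0.1, 0.1],
    [0.8, 0.7, 1.0, 0.1, 0.1, 0.2],
    [0.1, 0.2, 0.1, 1.0, 0.9, 0.8],
    [0.2, 0.1, 0.1, 0.9, 1.0, 0.7],
    [0.1, 0.1, 0.2, 0.8, 0.7, 1.0],
]
edges = planar_filtered_graph(sim)
print(len(edges))  # 3 * 6 - 6 = 12 edges
```

The filtered graph keeps 3n - 6 of the n(n - 1)/2 possible edges, so the strongest similarities survive while most weak ones are dropped. Extracting clusters and hierarchies from such a graph is a separate step that this sketch does not attempt.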