Advance Access publication April 28, 2010 | Matthew D. Wilkerson, D. Neil Hayes
ConsensusClusterPlus is an open-source software tool implemented in R, designed for unsupervised class discovery in cancer research. It extends the consensus clustering (CC) method by adding new functionalities and visualizations, such as item tracking, item-consensus, and cluster-consensus plots. These features provide detailed information to help users make more informed decisions about the number of classes and their memberships. The software takes a data matrix as input and outputs stability evidence for a given number of groups ($k$) and cluster assignments. The algorithm subsamples items and features, clusters them using specified methods, and calculates consensus values to identify robust clusters. The output includes graphical plots that help visualize cluster boundaries, assess cluster stability, and identify promiscuous items.
An example application using lung cancer gene expression microarrays demonstrates the tool's effectiveness in rediscovering known classes and selecting representative samples for further analysis.
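The subsample, cluster, and consensus steps described above can be illustrated with a minimal sketch. This is not the ConsensusClusterPlus implementation, which is written in R; it is a plain-Python illustration under stated assumptions. The function names (`kmeans_labels`, `consensus_matrix`), the parameter names (`reps`, `p_item`, `p_feature`, `seed`), and the choice of a small k-means as the inner clustering method are all hypothetical, chosen only to keep the example self-contained. The consensus value for a pair of items is taken here as the fraction of subsamples containing both items in which they were assigned to the same cluster.

```python
import random
from itertools import combinations


def kmeans_labels(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)  # start from k distinct items
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean).
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus_matrix(data, k, reps=50, p_item=0.8, p_feature=1.0, seed=0):
    """Return an n x n matrix of pairwise consensus values in [0, 1].

    data is a list of items, each a list of feature values (assumed layout).
    """
    rng = random.Random(seed)
    n, n_features = len(data), len(data[0])
    n_items = min(n, max(k, int(round(p_item * n))))
    n_cols = max(1, int(round(p_feature * n_features)))
    together = [[0] * n for _ in range(n)]  # times a pair shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times a pair was co-sampled

    for _ in range(reps):
        # Subsample items and features, then cluster the reduced matrix.
        idx = rng.sample(range(n), n_items)
        cols = rng.sample(range(n_features), n_cols)
        sub = [[data[i][c] for c in cols] for i in idx]
        labels = kmeans_labels(sub, k, rng)
        for (a, lab_a), (b, lab_b) in combinations(zip(idx, labels), 2):
            sampled[a][b] += 1
            sampled[b][a] += 1
            if lab_a == lab_b:
                together[a][b] += 1
                together[b][a] += 1

    # Pairs that were never co-sampled get 0.0 in this sketch.
    return [
        [1.0 if i == j
         else (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
```

Calling `consensus_matrix(data, k=2)` on a list of items returns a symmetric matrix. Values near 1 mark pairs that cluster together consistently, values near 0 mark pairs that rarely do, and items with many intermediate values are candidates for the "promiscuous" items mentioned above. Repeating the call over a range of $k$ and comparing the matrices is one way to gather the stability evidence the abstract refers to.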