Detecting overlapping protein complexes in protein-protein interaction networks

2012 | Tamás Nepusz¹, Haiyuan Yu², and Alberto Paccanaro¹
ClusterONE is a method for detecting overlapping protein complexes from protein-protein interaction (PPI) data. On yeast data sets it outperforms seven popular methods at identifying protein complexes: its predictions correspond more closely to reference complexes and are more functionally homogeneous.

PPI data can be represented as an undirected graph in which nodes are proteins and edges are interactions. Edge weights indicate the reliability of each interaction. Identifying protein complexes then amounts to detecting dense regions of the graph, that is, groups of proteins joined by many edges or by high-weight edges. Standard clustering methods are not ideal for PPI networks because many of them place each protein in exactly one cluster, whereas a protein may belong to multiple complexes.

ClusterONE addresses this with a cohesiveness measure that considers both the internal weight of a group (edges within the group) and its boundary weight (edges connecting the group to the rest of the network). Cohesiveness is defined as the ratio of the internal weight to the total weight, where the total also includes a penalty term for uncertainty. The algorithm grows groups from seed proteins, iteratively adding or removing vertices to maximize cohesiveness. The resulting groups are then merged based on their overlap scores and filtered by size and density.

ClusterONE was tested on five yeast PPI data sets, covering both weighted and unweighted networks, against a gold standard of reference complexes from the MIPS catalog and SGD. Performance was measured with the maximum matching ratio (MMR), where higher values indicate better accuracy, and biological relevance was assessed with co-localization scores and overrepresentation analysis. ClusterONE matched the reference complexes better than the other methods and produced better one-to-one mappings. MCL came closest in performance but cannot handle overlaps. ClusterONE also achieved higher co-localization and overrepresentation scores than MCL, indicating better biological relevance.

ClusterONE is implemented as a freely available Java application. It can be run in standalone mode or as a plugin for Cytoscape and ProCope. It also lets users detect protein complexes starting from a chosen set of seed proteins and provides feedback on complex quality during refinement. The implementation is efficient and user-friendly, which makes it accessible to the scientific community.
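
The cohesiveness measure and the greedy growth step described in the summary can be illustrated with a short Python sketch. This is not the ClusterONE implementation (which is a Java application). It is a minimal reading of the summary under several assumptions: the graph is stored as a dict of neighbor-to-weight dicts, the penalty is a per-vertex constant named `penalty`, each step applies the single best addition or removal, and the overlap score is taken to be the squared size of the intersection divided by the product of the two group sizes. All function and variable names are hypothetical, and the merging and filtering steps are omitted.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected weighted graph as {node: {neighbor: weight}}."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def cohesiveness(graph, group, penalty=0.0):
    """Internal weight divided by internal + boundary weight + penalty * |group|.

    The per-vertex form of the penalty is an assumption of this sketch.
    """
    w_in = 0.0
    w_bound = 0.0
    for u in group:
        for v, w in graph[u].items():
            if v in group:
                w_in += w / 2.0  # each internal edge is visited from both ends
            else:
                w_bound += w
    total = w_in + w_bound + penalty * len(group)
    return w_in / total if total > 0 else 0.0


def grow(graph, seed, penalty=0.0):
    """Greedily add or remove one vertex at a time while cohesiveness improves."""
    group = {seed}
    best = cohesiveness(graph, group, penalty)
    while True:
        boundary = {v for u in group for v in graph[u]} - group
        candidates = [group | {v} for v in sorted(boundary)]
        if len(group) > 1:
            candidates += [group - {u} for u in sorted(group)]
        if not candidates:
            return group
        challenger = max(candidates, key=lambda g: cohesiveness(graph, g, penalty))
        score = cohesiveness(graph, challenger, penalty)
        if score <= best:  # no single change improves the group: stop
            return group
        group, best = challenger, score


def overlap_score(a, b):
    """Assumed overlap score: |a & b|^2 / (|a| * |b|)."""
    return len(a & b) ** 2 / (len(a) * len(b))


# Toy network: two 4-cliques that share the protein "s", all weights 1.0.
edges = [(u, v, 1.0)
         for clique in ("abcs", "sdef")
         for u, v in combinations(clique, 2)]
graph = build_graph(edges)

left = grow(graph, "a")    # {'a', 'b', 'c', 's'}
right = grow(graph, "f")   # {'d', 'e', 'f', 's'}
shared = left & right      # {'s'}: the protein belongs to both groups
score = overlap_score(left, right)  # 1 / 16 = 0.0625
```

On this toy network the two grown groups share the protein `s`, which a partition-based clustering could assign to only one of them. A full pipeline would next compare overlap scores against a merging threshold and filter groups by size and density. Those parameters are not given in the summary, so they are left out here.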