Understanding "A human phenome-interactome network of protein complexes implicated in genetic disorders"

The human phenome-interactome network of protein complexes implicated in genetic disorders was built by integrating quality-controlled protein interactions with a computationally derived phenotype similarity score. This integration makes it possible to identify protein complexes that are likely to be associated with disease. The network includes 506 protein complexes linked to various diseases and reveals functional relationships between disease-promoting genes. A Bayesian predictor built on the network prioritizes candidates in linkage intervals; it correctly identified 298 known disease-causing proteins and provided novel candidates for 870 intervals.

The method first constructs a human protein interaction network by combining data from multiple databases and model organisms. Phenotypic similarity is then measured with a scoring system based on text mining and the Unified Medical Language System (UMLS). The scoring system was validated against manually curated OMIM records, which showed a strong correlation between scores and phenotypic overlap. Finally, the predictor ranks protein complexes by their association with disease-related phenotypes.

The predictor was tested on 1,404 linkage intervals, achieving high precision and recall. In addition to the 298 known disease genes it identified correctly, it provided 113 novel candidates for 91 intervals. Validation on unbiased large-scale data showed comparable performance, indicating that the predictor can accurately identify disease genes even when data is limited.

Case studies illustrate how the method identifies novel disease genes. In retinitis pigmentosa, the predictor identified LOC130951 as a candidate; it interacts with CRX, a gene known to be involved in the disease. In epithelial ovarian cancer, it identified FANCD2, which is part of a complex with BRCA2 and BRCA1. In inflammatory bowel disease, it identified RIPK1, which is involved in inflammatory responses. In amyotrophic lateral sclerosis, it identified IARS as a candidate associated with familial ALS.

The method's success is attributed to the integration of experimental protein interaction data with a phenotype similarity scheme, which allows high-confidence protein complexes to be identified. The results highlight the value of data mining and of integrating interaction data across multiple organisms for positional candidate prioritization. The network is a resource for further research into disease mechanisms and gene function.
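
The idea of ranking positional candidates by the phenotypes of their interaction partners can be illustrated with a small sketch. This is not the paper's Bayesian predictor: the cosine similarity over concept counts, the "best partner" scoring rule, and all names (`phenotype_similarity`, `rank_candidates`, the gene labels `G1`, `P1` and concept IDs `C1`, `C2`) are assumptions chosen for illustration only.

```python
from collections import Counter
from math import sqrt


def phenotype_similarity(concepts_a, concepts_b):
    """Cosine similarity between two bags of phenotype concept IDs.

    Assumption: each disease description has already been reduced to a
    list of concept identifiers (for example, by a text-mining step).
    """
    a, b = Counter(concepts_a), Counter(concepts_b)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_candidates(query_concepts, candidates, interactions, disease_concepts):
    """Rank candidate genes in a linkage interval.

    A candidate's score is the highest phenotype similarity between the
    query disease and any disease already linked to one of the candidate's
    interaction partners (a simplified stand-in for a probabilistic score).

    candidates:        gene names located in the linkage interval
    interactions:      dict mapping a gene to the set of genes it interacts with
    disease_concepts:  dict mapping a known disease gene to its concept IDs
    """
    scores = {}
    for gene in candidates:
        partner_scores = [
            phenotype_similarity(query_concepts, disease_concepts[partner])
            for partner in interactions.get(gene, ())
            if partner in disease_concepts
        ]
        scores[gene] = max(partner_scores, default=0.0)
    # Highest score first; ties keep the input order of the candidates.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data: G1 interacts with P1, whose known disease
    # shares all phenotype concepts with the query disease.
    query = ["C1", "C2", "C3"]
    known = {"P1": ["C1", "C2", "C3"], "P2": ["C9"]}
    network = {"G1": {"P1"}, "G2": {"P2"}, "G3": set()}
    for gene, score in rank_candidates(query, ["G2", "G1", "G3"], network, known):
        print(f"{gene}\t{score:.2f}")
```

In this toy example the candidate whose partner is linked to a phenotypically similar disease ranks first, which mirrors the reasoning behind the case studies: LOC130951 was proposed for retinitis pigmentosa because it interacts with CRX, a gene already implicated in that disease.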

A human phenome-interactome network of protein complexes implicated in genetic disorders

MARCH 2007 | Kasper Lage, E Olof Karlberg, Zenia M Størling, Páll Í Ólason, Anders G Pedersen, Olga Rigina, Anders M Hinsby, Zeynep Tümer, Flemming Pociot, Niels Tommerup, Yves Moreau & Søren Brunak