Robust rank aggregation for gene list integration and meta-analysis

January 12, 2012 | Raivo Kolde, Sven Laur, Priit Adler, Jaak Vilo
This paper introduces robust rank aggregation (RRA), a method for integrating gene lists and performing meta-analysis. The method is designed for noisy biological data: it is robust to outliers and errors, and it assigns a significance score to each gene. The RRA algorithm is based on order statistics and computes a P-value for each gene to determine its significance in the aggregated list. It is computationally efficient and can handle partial or incomplete rankings, which are common in genomic data.

The RRA method was tested on simulated data and on biological datasets. In simulations, it outperformed alternative methods such as average rank and the Stuart method in terms of significance scoring and noise resistance. In the biological studies, it was used to predict pathway members from knock-out data and to identify transcription factor targets from co-expression data, where it consistently outperformed the other methods in identifying significant genes and remained robust to noise. The RRA algorithm is implemented as a GNU R package called RobustRankAggreg, and it is suited to a wide range of practical situations for which no good alternatives exist. The study also highlights the importance of using statistical models to assess the relevance of results in gene expression meta-analysis, and it presents RRA as a reliable and efficient way to combine results from multiple studies.
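To make the order-statistics idea concrete, the following is a minimal Python sketch of how a per-gene score of this kind can be computed. It is an illustration based on a general reading of the method, not the code of the R package. The function names (`beta_score`, `rho_score`, `aggregate`) and the `universe_size` parameter are hypothetical, and several details are assumptions made for the sketch: ranks are normalized by dividing by the total number of rankable genes, a gene missing from a list is given normalized rank 1.0, the null model treats normalized ranks as independent uniform draws, and the minimum over order statistics is Bonferroni-corrected by the number of lists.

```python
from math import comb


def beta_score(x: float, k: int, n: int) -> float:
    """Probability that the k-th smallest of n independent uniform(0, 1)
    draws is <= x, written as a binomial tail sum."""
    return sum(comb(n, l) * x**l * (1 - x) ** (n - l) for l in range(k, n + 1))


def rho_score(norm_ranks: list[float]) -> float:
    """Smallest order-statistic probability for one gene, multiplied by the
    number of lists (Bonferroni correction) and capped at 1."""
    n = len(norm_ranks)
    ordered = sorted(norm_ranks)
    best = min(beta_score(x, k, n) for k, x in enumerate(ordered, start=1))
    return min(1.0, best * n)


def aggregate(ranked_lists: list[list[str]], universe_size: int) -> list[tuple[str, float]]:
    """Score every gene seen in any list; smaller scores are more significant.

    Assumption for this sketch: a gene absent from a (partial) list is
    treated as if it had the worst possible normalized rank, 1.0.
    """
    positions = [{g: i + 1 for i, g in enumerate(lst)} for lst in ranked_lists]
    genes = {g for lst in ranked_lists for g in lst}
    scores = {}
    for g in genes:
        norm = [pos[g] / universe_size if g in pos else 1.0 for pos in positions]
        scores[g] = rho_score(norm)
    return sorted(scores.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    # Three partial rankings over a hypothetical universe of 100 genes.
    lists = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "D"]]
    for gene, score in aggregate(lists, universe_size=100):
        print(gene, score)
```

In this toy input, gene A sits near the top of all three lists and so receives the smallest score, while gene D appears in only one list and scores far worse. Taking the minimum over order statistics is what gives the approach its tolerance to outliers: a gene ranked highly in most lists is not penalized heavily for one list in which it ranks poorly or is missing.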