MEASURING REPRODUCIBILITY OF HIGH-THROUGHPUT EXPERIMENTS

2011, Vol. 5, No. 3, 1752-1779 | BY QUNHUA LI, JAMES B. BROWN, HAIYAN HUANG AND PETER J. BICKEL
The paper introduces a method for measuring the reproducibility of findings from high-throughput experiments. Rather than summarizing agreement between replicates with a single scalar, the method produces a curve that shows quantitatively where the findings stop being consistent across replicates. The curve is fitted using a copula mixture model, which classifies signals into a reproducible group and an irreproducible group and yields a reproducibility score called the "irreproducible discovery rate" (IDR), analogous to the false discovery rate (FDR). The IDR is defined as the probability that a signal is irreproducible. It provides a principled, reproducibility-based criterion for setting selection thresholds and for combining replicates, and the paper develops a selection procedure that ranks and selects signals on this basis.

The approach is general: it applies to any ranking system that produces scores without ties, whether those scores are probabilistic or heuristic. It is demonstrated on a ChIP-seq experiment and compared with existing methods in simulations, where it is shown to be effective at identifying reproducible signals and robust to violations of its model assumptions. The paper also discusses the method's theoretical properties and gives a heuristic justification for its optimality.
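The two ideas in the summary, a consistency curve and an IDR-based selection, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names are hypothetical, the curve is just the fraction of signals in the top k of both replicates, and the copula mixture fit is not shown. The per-signal irreproducibility probabilities (`local_idr`) are assumed to be given. The selection rule is also an assumption, modeled on FDR-style procedures: rank signals by that probability and keep the largest top set whose average stays at or below a target level `alpha`.

```python
from typing import List, Sequence, Tuple


def ranks_desc(scores: Sequence[float]) -> List[int]:
    """Rank scores so that rank 1 is the largest score (assumes no ties)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks


def top_overlap_curve(
    x: Sequence[float], y: Sequence[float], ks: Sequence[int]
) -> List[Tuple[float, float]]:
    """For each list size k, return (k/n, fraction of all n signals that are
    in the top k of BOTH replicates). Perfect agreement gives points on the
    diagonal; points below it indicate loss of consistency."""
    if len(x) != len(y):
        raise ValueError("replicates must score the same signals")
    n = len(x)
    rank_x, rank_y = ranks_desc(x), ranks_desc(y)
    curve = []
    for k in ks:
        both = sum(1 for rx, ry in zip(rank_x, rank_y) if rx <= k and ry <= k)
        curve.append((k / n, both / n))
    return curve


def select_by_idr(local_idr: Sequence[float], alpha: float) -> List[int]:
    """Return indices of the selected signals, most reproducible first.

    Assumed rule: sort by per-signal irreproducibility probability and keep
    the largest prefix whose mean probability is <= alpha."""
    order = sorted(range(len(local_idr)), key=lambda i: local_idr[i])
    running_sum = 0.0
    n_selected = 0
    for count, i in enumerate(order, start=1):
        running_sum += local_idr[i]
        if running_sum / count <= alpha:
            n_selected = count
    return order[:n_selected]


if __name__ == "__main__":
    # Two replicates that agree on the top two signals but swap the last two.
    rep1 = [4.0, 3.0, 2.0, 1.0]
    rep2 = [4.0, 3.0, 1.0, 2.0]
    print(top_overlap_curve(rep1, rep2, [2, 3, 4]))

    # Hypothetical per-signal probabilities of being irreproducible.
    print(select_by_idr([0.01, 0.9, 0.02, 0.5, 0.03], alpha=0.05))
```

In this toy input the curve sits on the diagonal for the top two signals and then drops below it, which is the kind of loss of consistency the graphical tool is meant to expose. Averaging the per-signal probabilities over the selected set is what makes the threshold read like an FDR-style rate instead of a per-signal cutoff.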