Received on August 14, 2001; revised on June 20, 2002; September 5, 2002; accepted on September 9, 2002 | Anat Reiner*, Daniel Yekutieli and Yoav Benjamini
The paper addresses the challenge of identifying differentially expressed genes in DNA microarray data, where the number of genes tested can be very large and the risk of false positives rises accordingly. The authors propose and evaluate four false discovery rate (FDR) controlling procedures to address this issue. These procedures are designed to account for the dependency structure among test statistics, which is common in microarray data because of gene co-regulation and dependent measurement errors. The first procedure is the Benjamini-Hochberg (BH) procedure, which controls the FDR under the assumption of independent test statistics.
The second and third procedures modify the BH procedure to handle positively dependent test statistics by estimating the number of true null hypotheses. The fourth procedure uses resampling to estimate the joint distribution of the test statistics and control the FDR. The performance of these procedures is compared using simulated microarray data, and the results show that all four procedures control the FDR at the desired level while retaining higher power than family-wise error rate (FWER) controlling procedures. The paper also discusses the practical implementation of these procedures and provides an R program for adjusting p-values using FDR controlling procedures.
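
To make the contrast between FDR and FWER control concrete, the following is a minimal sketch of BH adjusted p-values next to Bonferroni adjusted p-values (a standard FWER-controlling adjustment). It is not the authors' R program: it is written in Python for illustration only, the function names `bh_adjust`, `bonferroni_adjust` and `rejected` are hypothetical, and the four p-values in the usage example are made up rather than taken from the paper. It covers only the plain BH step-up adjustment, not the modified or resampling-based procedures.

```python
from typing import List


def bh_adjust(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg (step-up) adjusted p-values, returned in input order."""
    m = len(p_values)
    # Indices of the hypotheses, sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, keeping a running
    # minimum of p * m / rank so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def bonferroni_adjust(p_values: List[float]) -> List[float]:
    """Bonferroni adjusted p-values (an FWER-controlling adjustment), capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def rejected(adjusted: List[float], level: float) -> List[int]:
    """Indices of hypotheses whose adjusted p-value is at or below the level."""
    return [i for i, p in enumerate(adjusted) if p <= level]


if __name__ == "__main__":
    # Hypothetical p-values for four genes (illustrative, not from the paper).
    p = [0.01, 0.04, 0.03, 0.005]
    print("BH rejections:        ", rejected(bh_adjust(p), 0.05))
    print("Bonferroni rejections:", rejected(bonferroni_adjust(p), 0.05))
```

On this toy input, BH at level 0.05 rejects all four hypotheses while Bonferroni rejects only two, which illustrates the kind of power difference the summary refers to.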