The Significance of Digital Gene Expression Profiles

1997 | Stéphane Audic and Jean-Michel Claverie
This study presents a statistical framework for analyzing digital gene expression profiles, focusing on the reliability of data obtained by random sampling of cDNA clones. The authors show how random fluctuations and sample size limit the detection of differentially expressed genes. They establish a rigorous significance test for deciding whether an observed difference in sequence tag counts reflects more than sampling noise, and apply it to publicly available transcript profiles. The test ties the threshold used to select putatively regulated genes to the accepted risk of false positives. The results show that digital Northern data can be used within certain limits, and that the number of tags required for reliable inference depends on the sample size and on the abundance of the gene.

The study contrasts two main approaches to gene expression analysis: analog methods based on hybridization to cDNA arrays or oligonucleotide chips, and digital methods based on sequence tags generated from cDNA libraries. The digital approach is shown to be effective for large-scale analysis. The authors derive a probability distribution for the occurrence of rare events in duplicate experiments and use it to compute confidence intervals for tag counts, which in turn indicate whether an observed difference is statistically significant. The study also compares the proposed method with Fisher's exact test, reporting that the new method is more sensitive and less conservative. The authors emphasize that, although the method is effective, the risk of false positives must be accounted for when selecting candidate genes.
The study concludes that the proposed statistical framework provides a reliable way to analyze digital gene expression profiles and assess the significance of differential expression.
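
To make the rare-event distribution concrete, the sketch below evaluates it in the form in which the Audic-Claverie test is usually cited: given x tags for a gene in a first library of N1 tags, the probability of seeing y tags in a second library of N2 tags is p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)]. This is a minimal illustration, not the authors' implementation: the function names `ac_prob` and `ac_tail_pvalue` are hypothetical, and the one-sided tail probability is an assumed, simplified way of turning the distribution into a significance call.

```python
from math import exp, lgamma, log


def ac_prob(y: int, x: int, n1: int, n2: int) -> float:
    """p(y | x): probability of y tags in library 2 (n2 tags in total),
    given x tags for the same gene in library 1 (n1 tags in total)."""
    r = n2 / n1
    # Work in log space so the factorials do not overflow for large counts.
    log_p = (
        y * log(r)
        + lgamma(x + y + 1)           # log (x + y)!
        - lgamma(x + 1)               # log x!
        - lgamma(y + 1)               # log y!
        - (x + y + 1) * log(1.0 + r)
    )
    return exp(log_p)


def ac_tail_pvalue(x: int, y: int, n1: int, n2: int) -> float:
    """Smaller of P(Y <= y | x) and P(Y >= y | x).

    Assumption: a one-sided tail is used here for illustration only.
    """
    lower = sum(ac_prob(k, x, n1, n2) for k in range(y + 1))  # P(Y <= y | x)
    upper = 1.0 - lower + ac_prob(y, x, n1, n2)               # P(Y >= y | x)
    return min(lower, upper, 1.0)


if __name__ == "__main__":
    # Hypothetical counts: 2 tags in one library, 12 in another of equal size.
    print(ac_tail_pvalue(x=2, y=12, n1=10_000, n2=10_000))
```

When many genes are screened at once, a rough expectation for the number of false positives is the chosen p-value threshold multiplied by the number of genes compared, which is one way to read the link between selection threshold and false-positive risk described above.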