NCBI GEO: archive for high-throughput functional genomic data

2009 | Tanya Barrett*, Dennis B. Troup, Stephen E. Wilhite, Pierre Ledoux, Dmitry Rudnev, Carlos Evangelista, Irene F. Kim, Alexandra Soboleva, Maxim Tomashevsky, Kimberly A. Marshall, Katherine H. Phillippy, Patti M. Sherman, Rolf N. Muertter and Ron Edgar
The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest public repository for high-throughput gene expression data. It also hosts other high-throughput functional genomic data, including studies of genome copy number variation, chromatin structure, methylation status, and transcription factor binding. The research community generates these data with technologies such as microarrays and next-generation sequencing. GEO's flexible infrastructure captures fully annotated raw and processed data, enabling compliance with reporting standards such as MIAME (Minimum Information About a Microarray Experiment).
As of this 2009 report, the database held over 10,000 experiments comprising 300,000 samples and 16 billion abundance measurements from over 500 organisms. It received over 60,000 query hits and 10,000 bulk FTP downloads daily and had been cited in over 5,000 manuscripts.
Because GEO now hosts non-expression data, the name 'Gene Expression Omnibus' has become somewhat misleading. Non-expression data are therefore managed under the 'Omix' division, a name that denotes a mixture of 'omic data. Submission and download procedures for Omix data are similar to those for GEO. The database also processes high-throughput sequence data from studies of gene expression, gene regulation, and epigenetics.
Data are stored in a relational MSSQL database partitioned into three entity types: Platform, Sample, and Series records. Data are validated upon upload, and submissions are typically approved within 2–5 days. Researchers can update their records and keep them private until the associated manuscript is published.
GEO provides tools for retrieving, exploring, analyzing, and visualizing expression data from both gene-centric and experiment-centric perspectives, including cluster heat maps, expression profile charts, and value and probability distribution charts. It also offers programmatic access via E-Utils and bulk download via FTP.
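As a rough illustration of the programmatic access via E-Utils mentioned above, the following sketch builds an ESearch request URL for GEO records. It is not taken from the paper: the helper name `build_esearch_url` is hypothetical, and the sketch assumes the public NCBI E-utilities endpoint and the Entrez database name `gds` (GEO DataSets). It only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: the public NCBI E-utilities base URL.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term: str, db: str = "gds", retmax: int = 20) -> str:
    """Return an ESearch URL for a GEO-related Entrez database.

    `build_esearch_url` is a hypothetical helper for illustration.
    `db="gds"` is assumed to be the Entrez name for GEO DataSets.
    """
    params = {
        "db": db,           # Entrez database to search
        "term": term,       # free-text or fielded Entrez query
        "retmax": retmax,   # maximum number of record IDs to return
        "retmode": "json",  # ask for a JSON response
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL; fetching it is left to the caller.
    print(build_esearch_url("methylation AND Homo sapiens[Organism]"))
```

The returned URL can be passed to any HTTP client. Keeping URL construction separate from the network call makes the query easy to inspect and log before it is sent.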
GEO continues to develop to enhance data submission and retrieval experiences. It supports diverse data formats and provides standards for data submission. The database and tools are continuously updated to improve data accessibility and usability. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.
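To make the three record types described above (Platform, Sample, Series) more concrete, here is a minimal in-memory sketch of how they relate. It is an illustration only, not GEO's actual MSSQL schema. The class fields and the `record_type` helper are hypothetical, and the sketch assumes the GEO accession prefixes GPL, GSM, and GSE, that each Sample references one Platform, and that a Series groups related Samples. The accession numbers and titles are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Platform:
    accession: str  # assumed prefix "GPL"
    title: str


@dataclass
class Sample:
    accession: str      # assumed prefix "GSM"
    title: str
    platform: Platform  # assumption: each Sample references one Platform


@dataclass
class Series:
    accession: str  # assumed prefix "GSE"
    title: str
    samples: List[Sample] = field(default_factory=list)

    def platforms(self) -> List[str]:
        """Return the distinct Platform accessions used in this Series."""
        return sorted({s.platform.accession for s in self.samples})


def record_type(accession: str) -> str:
    """Classify an accession by its prefix (hypothetical helper)."""
    prefixes = {"GPL": "Platform", "GSM": "Sample", "GSE": "Series"}
    kind = prefixes.get(accession[:3].upper())
    if kind is None or not accession[3:].isdigit():
        raise ValueError(f"not a recognized accession: {accession!r}")
    return kind


if __name__ == "__main__":
    # Placeholder records, not real GEO entries.
    array = Platform("GPL1", "example array")
    study = Series("GSE1", "example study", [
        Sample("GSM1", "control", array),
        Sample("GSM2", "treated", array),
    ])
    print(study.platforms(), record_type(study.accession))
```

Modelling the Platform as a separate object referenced by each Sample mirrors the idea that one array or sequencer description can be shared across many Samples and Series.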