Ensembl BioMarts: a hub for data retrieval across taxonomic space

2011 | Rhoda J. Kinsella, Andreas Kähäri, Syed Haider, Jorge Zamora, Glenn Proctor, Giulietta Spudich, Jeff Almeida-King, Daniel Staines, Paul Derwent, Arnaud Kerhornou, Paul Kersey and Paul Flicek
Ensembl BioMarts provide a hub for retrieving genomic data across taxonomic groups. They offer a centralized access point for high-quality gene annotations, variation data, functional and regulatory annotations, and evolutionary relationships from a wide range of species.

The Ensembl project, launched in 2000, focuses on chordate species, particularly human and model organisms such as mouse, rat, and zebrafish. At the time of publication it supported 56 species, 52 of them with comprehensive gene annotation. The Ensembl Genomes project, launched in 2009, extends this coverage to five further domains: bacteria, fungi, protists, plants, and invertebrate metazoa. At the time of publication it supported 313 non-vertebrate species.

Ensembl BioMarts are built from the Ensembl database schemas and comprise seven databases, four visible and three hidden. The visible databases are Ensembl Genes, Ensembl Variation, Ensembl Regulation, and Vega. The hidden databases supply supporting information such as sequence data, ontology data, and genomic features. Additional databases from projects such as PRIDE and Reactome are integrated using BioMart federation technology.

Ensembl BioMarts can be queried through web interfaces, APIs, and software packages. Users can export query results in multiple formats, including sequence data. BioMart technology supports complex cross-database queries and eases data integration with other bioinformatics resources.

The article gives several example queries that can be run against Ensembl BioMarts, such as finding genes associated with specific protein domains, identifying structural variations, and retrieving somatic mutations linked to tumors. It also discusses future directions for the Ensembl and Ensembl Genomes projects, including the integration of new data types and the move to a new BioMart version.
The BioMart interface is a key tool for accessing and querying genomic data, providing a flexible and powerful platform for researchers.
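
As an illustration of the programmatic access mentioned above, the following is a minimal sketch of how a "genes associated with a specific domain" query might be assembled in the XML form that BioMart web services accept. It only builds the query and the request URL; it does not contact any server. The endpoint URL, the dataset name (hsapiens_gene_ensembl), the filter name (interpro), the attribute names, and the InterPro accession used here are assumptions chosen for illustration and are not taken from the article; the names available on a given mart should be checked before use.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET

# Assumed service endpoint (not stated in the summary above).
MARTSERVICE_URL = "https://www.ensembl.org/biomart/martservice"


def build_biomart_query(dataset, filters, attributes):
    """Return a BioMart XML query string for a single dataset.

    dataset    -- dataset name, e.g. "hsapiens_gene_ensembl" (assumed)
    filters    -- dict mapping filter name to filter value
    attributes -- list of attribute (output column) names
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",      # tab-separated output
        "header": "0",           # no header row
        "uniqueRows": "1",       # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


def build_request_url(xml_query, base_url=MARTSERVICE_URL):
    """URL-encode the XML query as the 'query' parameter of a GET request."""
    return base_url + "?" + urlencode({"query": xml_query})


# Hypothetical example: human genes annotated with one InterPro domain.
example_query = build_biomart_query(
    dataset="hsapiens_gene_ensembl",                        # assumed name
    filters={"interpro": "IPR000719"},                      # assumed filter
    attributes=["ensembl_gene_id", "external_gene_name"],   # assumed names
)
example_url = build_request_url(example_query)
```

Fetching example_url with any HTTP client would, under these assumptions, return one tab-separated row per matching gene. Building the XML with an element tree rather than string concatenation keeps attribute values correctly escaped.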