miRBase: from microRNA sequences to function

2019 | Ana Kozomara, Maria Birgaoanu and Sam Griffiths-Jones
miRBase is a public database that catalogs and distributes microRNA (miRNA) gene sequences. It serves as a primary resource for miRNA sequences and annotations, providing information on sequences, biogenesis, genome coordinates, literature references, and deep sequencing data, and it links to other resources for miRNA targets. The latest release (v22) contains 38,589 hairpin precursor sequences and 48,860 mature miRNAs from 271 organisms, an increase over the previous release. miRBase is publicly available at http://mirbase.org/.
The database has been updated to provide more information on the biological functions of miRNAs and on the quality of their annotations. To make functional information more accessible, miRBase now provides Gene Ontology (GO) terms annotated against miRNA sequences. A text-mining approach searches open-access articles for miRNA gene names and has identified over 500,000 sentences containing them. These sentences are scored for functional information and linked to 12,519 miRNA entries. The sentences, and the word clouds built from them, provide summaries of miRNA functions.
To assess annotation quality, 1,493 small RNA deep sequencing datasets have been analyzed, mapping 5.5 billion reads to miRNA sequences. These data support the validity of 20-65% of miRNA annotations in well-studied animal genomes and have led to the removal of over 200 sequences. The read mapping data are also used to classify miRNAs as 'high confidence' or 'low confidence'. The 'high confidence' classification is based on criteria such as the presence of mature microRNA sequences from both arms of the hairpin precursor, a 3' overhang of 0-4 nt, and a minimum of 20 overlapping reads per mature microRNA. The 'low confidence' classification is used for sequences with inconsistent read patterns.
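The 'high confidence' criteria listed above can be read as a simple rule check. The following is a minimal sketch under stated assumptions, not miRBase's actual pipeline: the `HairpinEvidence` record, its field names, and the `is_high_confidence` function are hypothetical, and only the three criteria named in this summary are encoded (the summary says "criteria such as", so the real rule set is likely longer).

```python
from dataclasses import dataclass

# Thresholds taken from the summary text above.
MIN_READS_PER_MATURE = 20        # minimum overlapping reads per mature microRNA
ALLOWED_OVERHANG_NT = range(0, 5)  # 3' overhang of 0-4 nt, inclusive


@dataclass
class HairpinEvidence:
    """Read-mapping summary for one hairpin precursor (hypothetical structure)."""
    reads_5p: int        # reads overlapping the mature sequence on the 5' arm
    reads_3p: int        # reads overlapping the mature sequence on the 3' arm
    overhang_3p_nt: int  # 3' overhang of the mature duplex, in nucleotides


def is_high_confidence(ev: HairpinEvidence) -> bool:
    """Return True if the evidence meets the three criteria named in the text."""
    # Mature sequences must be observed from both arms of the hairpin.
    both_arms = ev.reads_5p > 0 and ev.reads_3p > 0
    # Each mature sequence needs at least the minimum number of reads.
    enough_reads = min(ev.reads_5p, ev.reads_3p) >= MIN_READS_PER_MATURE
    # The duplex overhang must fall in the allowed range.
    overhang_ok = ev.overhang_3p_nt in ALLOWED_OVERHANG_NT
    return both_arms and enough_reads and overhang_ok
```

Note that failing this check does not by itself make an entry 'low confidence': according to the text, that label is reserved for sequences with inconsistent read patterns, which this sketch does not model.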
Users can vote on the validity of miRNA annotations, and these votes have led to the removal of several sequences.
miRBase also provides functional information about miRNAs mined from the scientific literature. In total, 554,287 sentences covering 12,519 miRNAs have been associated with functional information. Each sentence is scored by the functional terms it contains and linked to the relevant paper. Word clouds built from these sentences give visual summaries of miRNA function. For example, the word cloud for the Drosophila melanogaster bantam microRNA highlights its role in regulating neural stem cell proliferation and its links with the Hippo pathway, while the word cloud for hsa-mir-133a-2 highlights its roles in cardiac and skeletal muscle development.
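Scoring sentences by the functional terms they contain can be illustrated with a few lines of code. This is a minimal sketch, assuming the score is simply the number of distinct functional terms in a sentence; the term list, `score_sentence`, and `rank_sentences` are hypothetical, and the summary does not specify the actual vocabulary or weighting used by miRBase.

```python
import re

# Hypothetical vocabulary of functional terms; the real list is not given in the text.
FUNCTIONAL_TERMS = frozenset({
    "regulates", "targets", "represses", "promotes",
    "proliferation", "differentiation", "apoptosis", "development",
})


def score_sentence(sentence: str, terms: frozenset = FUNCTIONAL_TERMS) -> int:
    """Count the distinct functional terms that appear in the sentence."""
    # Lowercase and split into alphabetic tokens so punctuation is ignored.
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return len(tokens & terms)


def rank_sentences(sentences: list) -> list:
    """Order sentences from most to least functionally informative."""
    return sorted(sentences, key=score_sentence, reverse=True)
```

Under this assumed scheme, a sentence such as "bantam regulates neural stem cell proliferation." scores higher than one that only mentions a miRNA name, so it would rank first among the sentences shown for that entry.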