2014, Vol. 42, Database issue | Ana Kozomara and Sam Griffiths-Jones
The article describes an update to the miRBase database, which is the primary repository for microRNA (miRNA) sequences and annotations. The latest release (v20, June 2013) includes 24521 miRNA loci from 206 species, producing 30424 mature miRNA products. With the increasing number of novel miRNAs discovered through small RNA deep sequencing, maintaining the quality of miRNA data has become a significant challenge. To address this, miRBase has implemented new methods to assign confidence levels to miRNA entries based on deep sequencing data patterns. A high confidence subset of miRNA entries is now available alongside the complete collection, allowing users to distinguish between high and low confidence annotations.
The article also discusses the use of Wikipedia pages embedded on the miRBase website to encourage community contributions of textual and functional information about miRNAs. This system allows users to 'like' or 'dislike' specific miRNAs and provide additional information, which helps refine the annotations.
The development of the high confidence set involved analyzing the patterns of reads mapping to miRNA hairpin loci. Criteria were established to determine high confidence annotations, including the presence of sufficient reads mapping to both mature miRNA strands, consistent processing patterns, and appropriate folding energy. These criteria were applied to miRNAs from 38 species, resulting in 1761 high confidence miRNA loci. This is 22% of the loci assessed, not of the 24521 loci in the full database.
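The three criteria named above can be read as a simple filter over per-locus read summaries. The following is a minimal sketch of that idea, not the miRBase implementation: the `HairpinReads` and `Thresholds` structures, the field names, and every cutoff value are hypothetical placeholders chosen for illustration, since this summary does not state the actual thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HairpinReads:
    """Read-mapping summary for one hairpin locus (hypothetical structure)."""
    reads_5p: int               # reads mapping to the mature sequence on the 5' arm
    reads_3p: int               # reads mapping to the mature sequence on the 3' arm
    frac_same_start_5p: float   # share of 5' arm reads with the most common 5' end
    frac_same_start_3p: float   # share of 3' arm reads with the most common 5' end
    energy_per_nt: float        # predicted folding free energy, kcal/mol per nucleotide


@dataclass(frozen=True)
class Thresholds:
    """Illustrative cutoffs only; assumed values, not taken from the article."""
    min_reads_per_arm: int = 10
    min_consistent_fraction: float = 0.5
    max_energy_per_nt: float = -0.2


def is_high_confidence(locus: HairpinReads, t: Thresholds = Thresholds()) -> bool:
    """Return True only if the locus passes all three criteria."""
    # 1. Sufficient reads on both mature strands.
    enough_reads = min(locus.reads_5p, locus.reads_3p) >= t.min_reads_per_arm
    # 2. Consistent processing: most reads on each arm share one 5' end.
    consistent = (
        min(locus.frac_same_start_5p, locus.frac_same_start_3p)
        >= t.min_consistent_fraction
    )
    # 3. Appropriate folding energy: more negative means a more stable hairpin.
    stable = locus.energy_per_nt <= t.max_energy_per_nt
    return enough_reads and consistent and stable


# A locus with reads on both arms and a stable fold passes;
# a locus with reads on only one arm does not.
well_supported = HairpinReads(120, 35, 0.9, 0.7, -0.45)
single_arm = HairpinReads(500, 0, 0.95, 0.0, -0.45)
print(is_high_confidence(well_supported), is_high_confidence(single_arm))
```

Keeping the cutoffs in a separate `Thresholds` object makes it easy to see how the size of the high confidence set changes as the criteria are tightened or relaxed.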
The article highlights the importance of community input in improving miRNA annotations and the use of existing databases and tools to enhance the quality of miRNA data. Future developments include using multiple confidence levels and integrating text-mining methods to extract biological information from literature. The miRBase database remains freely available, with a focus on improving data quality and user engagement through community contributions.