Rfam: updates to the RNA families database

Published online 25 October 2008 | Paul P. Gardner, Jennifer Daub, John G. Tate, Eric P. Nawrocki, Diana L. Kolbe, Stinus Lindgreen, Adam C. Wilkinson, Robert D. Finn, Sam Griffiths-Jones, Sean R. Eddy and Alex Bateman
Rfam is a comprehensive database of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary goal of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly in complete genomes, using sensitive BLAST filters combined with CMs. The database has recently been updated to include more sequences and to improve the sensitivity and specificity of its search methods.
Key improvements include expanding the underlying nucleotide sequence database (RFAMSEQ) to include whole genome shotgun and environmental sequences, replacing NCBI-BLAST with WU-BLAST for higher sensitivity, and applying sequence masks to reduce false positives. Additionally, over 370 families have been expanded through an 'iteration' process, enhancing the species and sequence depth of individual families. The Rfam website has been redesigned to provide better data presentation and more tools for searching and analyzing the data, including interactive and batch search options, taxonomic search tools, and detailed overviews of families. New graphical representations of secondary structures and integration with Wikipedia for community-contributed annotations have also been introduced. Future challenges include keeping up with the rapid discovery of new RNA families and improving the quality and speed of the Infernal software used for CM preparation.
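The general idea of pairing a fast sequence filter with a slower, more sensitive model can be illustrated with a toy sketch. The code below is not Rfam's pipeline and uses neither BLAST nor Infernal: a shared k-mer test stands in for the BLAST filter, and an ungapped identity score stands in for the covariance model. All function names, the example sequences, and the threshold are hypothetical and chosen only for illustration.

```python
# Toy filter-then-score search. Assumptions (not from the Rfam paper):
# - passes_filter() is a stand-in for a fast BLAST-style filter
# - expensive_score() is a stand-in for a slow, sensitive CM search
# - the k-mer size and score threshold are arbitrary illustrative values


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def passes_filter(query, target, k=4):
    """Cheap first stage: keep targets sharing at least one k-mer with the query."""
    return bool(kmers(query, k) & kmers(target, k))


def expensive_score(query, target):
    """Stand-in for the slow second stage.

    Slides the query along the target and returns the best fraction of
    matching positions over all ungapped placements.
    """
    best = 0.0
    for start in range(len(target) - len(query) + 1):
        window = target[start:start + len(query)]
        matches = sum(a == b for a, b in zip(query, window))
        best = max(best, matches / len(query))
    return best


def search(query, database, threshold=0.8, k=4):
    """Run the cheap filter first, then score only the survivors."""
    hits = []
    for name, seq in database.items():
        if not passes_filter(query, seq, k):
            continue  # rejected by the filter; never reaches the slow stage
        score = expensive_score(query, seq)
        if score >= threshold:
            hits.append((name, score))
    # Highest-scoring hits first
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    toy_database = {
        "seq_a": "GGGAUCGGCUAGCUAGGCCC",
        "seq_b": "AAAAAAAAAAAAAAAAAAAA",
        "seq_c": "UUGCUAGCUUGGAUCGAACC",
    }
    for name, score in search("GCUAGCUAGG", toy_database):
        print(name, score)
```

In this sketch the filter discards `seq_b` before any scoring is done, which mirrors the motivation for a filter stage: the expensive model is run only on candidates that a cheap test cannot rule out. The trade-off is that a filter that is too strict can discard true members, which is why filter sensitivity matters.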