25 July 2024 | Lovro Trgovc-Greif, Hans-Jörg Hellinger, Jean Mainguy, Alexander Pfundner, Dmitriy Frishman, Michael Kiening, Nicole Suzanne Webster, Patrick William Laffy, Michael Feichtinger, Thomas Rattei
VOGDB is a database of virus orthologous groups (VOGs), virus protein families (VFAMs), and virus protein structural folds (VFOLDs). It is designed to address the difficulty of grouping viral genes, which stems from their high diversity and rapid evolution. VOGDB uses a multi-layer clustering approach that identifies increasingly remote similarities: it starts with pairwise sequence comparisons, continues with sequence profile alignments, and ends with predicted protein structures. This allows for more sensitive homology searches and improves the prediction of annotations and phylogeny. VOGDB includes both prokaryotic and eukaryotic viruses in the same clustering process, enabling the exploration of evolutionary relationships between these groups. It is freely available at vogdb.org under the CC BY 4.0 license.
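The nesting of the three layers can be illustrated with a toy sketch. It is not the VOGDB pipeline: it assumes that each layer contributes a list of pairwise links between proteins and that groups are simply the connected components of all links seen so far (single linkage). The function names and protein identifiers are hypothetical.

```python
from collections import defaultdict


def connected_groups(items, links):
    """Return the connected components of `links` over `items` (union-find)."""
    parent = {item: item for item in items}

    def find(x):
        # Walk to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for item in items:
        groups[find(item)].add(item)
    return [frozenset(g) for g in groups.values()]


def layered_groups(proteins, sequence_links, profile_links, structure_links):
    """Toy three-layer grouping: each layer adds more remote links (assumption)."""
    seq = list(sequence_links)
    prof = seq + list(profile_links)
    struct = prof + list(structure_links)
    vogs = connected_groups(proteins, seq)        # sequence similarity only
    vfams = connected_groups(proteins, prof)      # plus profile-level links
    vfolds = connected_groups(proteins, struct)   # plus structure-level links
    return vogs, vfams, vfolds


vogs, vfams, vfolds = layered_groups(
    ["p1", "p2", "p3", "p4", "p5"],
    sequence_links=[("p1", "p2")],
    profile_links=[("p2", "p3")],
    structure_links=[("p3", "p4")],
)
print(len(vogs), len(vfams), len(vfolds))  # 4 3 2
```

Because every layer reuses the links of the previous one, each VOG in this sketch is contained in exactly one VFAM, and each VFAM in exactly one VFOLD.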
VOGDB is updated with every RefSeq release and includes all virus genomes from RefSeq, partially reannotating them to ensure higher-quality clusters. The database provides three layers of homologous groups: VOGs (based on sequence similarity), VFAMs (based on HMM alignments), and VFOLDs (based on structural features). Functional annotations are derived from the SwissProt and RefSeq databases, and structural classifications are based on SCOPe superfamilies. The homogeneity of annotations and structural classifications within each cluster is assessed to ensure the quality of clusters.
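One simple way to quantify the homogeneity of a cluster is the fraction of its annotated members that share the most common label. The sketch below assumes this definition for illustration only; it is not necessarily the measure used by VOGDB, and the function name and labels are hypothetical.

```python
from collections import Counter


def annotation_homogeneity(labels):
    """Fraction of annotated members carrying the most common label.

    `labels` holds one entry per cluster member; None marks a member
    without annotation. Returns None if no member is annotated.
    """
    annotated = [label for label in labels if label is not None]
    if not annotated:
        return None
    top_count = Counter(annotated).most_common(1)[0][1]
    return top_count / len(annotated)


# Two of the three annotated members agree, so the score is 2/3.
score = annotation_homogeneity(["capsid", "capsid", "portal", None])
print(round(score, 3))  # 0.667
```

The same function could be applied to SCOPe superfamily labels instead of functional annotations, since it only compares labels for equality.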
VOGDB has been compared to similar databases such as COG, pVOG, and PHROG, and shows similar levels of homogeneity. It is particularly useful for metagenomic analysis, where it can help identify viral sequences and estimate their origin. VOGDB also supports bioinformatic workflows by providing HMMs and functional annotations for viral proteins, which can be used for various analyses, including virus genome annotation and metagenomic contig classification. The database is continuously updated and improved, incorporating new computational methods and user feedback. VOGDB is a valuable resource for studying viral genomes and will continue to evolve to meet the needs of the research community.
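As an illustration of how the HMMs might be used for contig classification, the sketch below parses the per-target table that HMMER's hmmsearch writes with `--tblout` and keeps the best-scoring profile for each protein. Several details are assumptions rather than part of VOGDB: the E-value cutoff of 1e-5, the profile identifiers in the sample, and the protein naming scheme `<contig>_<gene index>` (as produced by gene callers such as Prodigal).

```python
from collections import Counter


def best_hits(tblout_lines, max_evalue=1e-5):
    """Map each protein to its best-scoring profile hit.

    Expects hmmsearch --tblout lines: target name, accession, query name,
    accession, full-sequence E-value, score, bias, ... (whitespace separated).
    """
    best = {}
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split(None, 18)
        protein, profile = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue > max_evalue:
            continue  # illustrative cutoff, an assumption
        if protein not in best or score > best[protein][2]:
            best[protein] = (profile, evalue, score)
    return best


def hits_per_contig(best):
    """Count proteins with a hit per contig, assuming '<contig>_<index>' IDs."""
    return Counter(protein.rsplit("_", 1)[0] for protein in best)


# Made-up sample rows in tblout layout; identifiers are placeholders.
sample = [
    "# target name accession query name accession E-value score bias ...",
    "contigA_1 - VOG00001 - 1.2e-30 105.3 0.1 1.5e-30 105.0 0.1 1.0 1 0 0 1 1 1 1 -",
    "contigA_1 - VOG00002 - 3.0e-10 40.2 0.0 4.0e-10 39.8 0.0 1.0 1 0 0 1 1 1 1 -",
    "contigA_2 - VOG00003 - 0.5 5.1 0.0 0.6 4.9 0.0 1.0 1 0 0 1 1 1 1 -",
]
hits = best_hits(sample)
print(hits["contigA_1"][0])         # VOG00001
print(dict(hits_per_contig(hits)))  # {'contigA': 1}
```

A contig could then be flagged as putatively viral when a sufficient share of its proteins has a hit; the appropriate threshold depends on the dataset and is left open here.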