2017 | Baofeng Jia, Amogelang R. Raphenya, Brian Alcock, Nicholas Waglechner, Peiyao Guo, Kara K. Tsang, Briony A. Lago, Biren M. Dave, Sheldon Pereira, Arjun N. Sharma, Sachin Doshi, Mélanie Courtot, Raymond Lo, Laura E. Williams, Jonathan G. Frye, Tariq Elsayegh, Daim Sardar, Erin L. Westman, Andrew C. Pawlowski, Timothy A. Johnson, Fiona S.L. Brinkman, Gerard D. Wright, Andrew G. McArthur
The Comprehensive Antibiotic Resistance Database (CARD) is a manually curated resource that provides high-quality reference data on the molecular basis of antimicrobial resistance (AMR). It is ontologically structured and model-centric, covering a wide range of AMR drug classes and resistance mechanisms, including intrinsic, mutation-driven, and acquired resistance. CARD is built upon the Antibiotic Resistance Ontology (ARO), a custom-built, interconnected, and hierarchical controlled vocabulary that allows advanced data sharing and organization. Its design enables the development of genome analysis tools, such as the Resistance Gene Identifier (RGI), which predicts resistance from raw genome sequences. Recent improvements include extensive curation of additional reference sequences and mutations, the development of a unique Model Ontology and accompanying AMR detection models, new visualization tools, and expansion of the RGI for detecting emerging AMR threats. CARD curation is updated monthly based on manual literature curation, computational text mining, and genome analysis.
AMR research and surveillance require gathering data across biological scales, from molecules to populations, with parallel development of new data analysis and sharing paradigms. Construction of this 'Big Data' infrastructure will lead to data-driven predictions that complement traditional research methodologies. However, multidisciplinary understanding is difficult due to the increasing volume of literature, making biocuration an increasingly important part of biomedical research. CARD seeks to fulfill this need for AMR research and surveillance.
Currently, CARD is one of the most extensive AMR sequence databases, along with ARG-ANNOT. At the core of CARD is the ARO, a controlled vocabulary for describing antimicrobial molecules and their targets, resistance mechanisms, genes, and mutations. CARD also includes the RGI software, which predicts antibiotic resistance genes from genome sequence data. With continued biocuration and algorithm development, CARD provides a bioinformatics resource for AMR surveillance in healthcare, agricultural, and environmental settings.
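To make the idea of a hierarchical controlled vocabulary concrete, the following is a minimal sketch of how parent-child links between ontology terms could be stored and traversed. The term names and the `IS_A` mapping are invented placeholders, not actual ARO entries, and the `ancestors` helper is hypothetical rather than part of CARD or the RGI.

```python
# Hypothetical child -> parents mapping; these are placeholder terms,
# not real ARO entries.
IS_A = {
    "example beta-lactamase gene": ["beta-lactamase"],
    "beta-lactamase": ["antibiotic inactivation enzyme"],
    "antibiotic inactivation enzyme": ["resistance mechanism"],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following parent links."""
    seen = []
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(is_a.get(parent, []))
    return seen


print(ancestors("example beta-lactamase gene"))
```

Because each term records only its direct parents, a gene annotated with a specific term can still be retrieved by a query at any broader level of the hierarchy.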
The ARO has grown significantly since 2013, with over 3500 ontology terms covering the breadth of AMR mechanisms, supported by over 2000 publications. A new Model Ontology (MO) has been developed, along with AMR detection models. These models include Protein Homolog and Protein Variant models, which detect AMR protein sequences based on similarity to curated reference sequences and, for Protein Variant models, on curated sets of AMR-conferring mutations. CARD now contains 2260 AMR detection models, among them 2102 Protein Homolog models and 92 Protein Variant models.
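The difference between the two model types can be sketched in a few lines. This is an illustration under stated assumptions, not the RGI implementation: the class names, the use of a single bit-score cutoff as the similarity measure, and all numeric values and mutations below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinHomologModel:
    """Detection by similarity to a curated reference sequence alone."""
    name: str
    bitscore_cutoff: float  # hypothetical similarity threshold

    def detects(self, bitscore: float) -> bool:
        return bitscore >= self.bitscore_cutoff


@dataclass
class ProteinVariantModel:
    """Detection by similarity plus at least one curated mutation."""
    name: str
    bitscore_cutoff: float  # hypothetical similarity threshold
    # Curated mutations as (1-based reference position, resistant residue).
    mutations: set = field(default_factory=set)

    def detects(self, bitscore: float, aligned_residues: dict) -> bool:
        # aligned_residues maps reference position -> query residue there.
        if bitscore < self.bitscore_cutoff:
            return False
        return any(aligned_residues.get(pos) == aa for pos, aa in self.mutations)


# Illustrative values only; these are not taken from CARD.
homolog = ProteinHomologModel("example_homolog", bitscore_cutoff=500.0)
variant = ProteinVariantModel("example_variant", 1000.0, {(83, "L")})

print(homolog.detects(620.0))             # similar enough to the reference
print(variant.detects(1500.0, {83: "S"}))  # similar, but no curated mutation
print(variant.detects(1500.0, {83: "L"}))  # similar and carries the mutation
```

The variant model requires both conditions because a sequence closely matching a drug target is expected in susceptible organisms too; in this sketch only the curated mutation distinguishes a resistant variant.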
Improved curation processes include manual tracking of published AMR literature and the use of custom text-mining algorithms to prioritize scientific literature for biocuration. CARD also includes two bioinformatic tools: a standard BLAST for searching CARD reference sequences and the RGI for predicting complete resistome from genome sequences. The RGI provides preliminary annotation of DNA or protein sequences based on data available in CARD and supports two detection model types (Protein Homolog and Protein Variant) and three