24 Sep 2018 | Keith A. Jolley, James E. Bray, Martin C. J. Maiden
The PubMLST.org website hosts open-access, curated databases that integrate population sequence data with provenance and phenotype information for over 100 microbial species and genera. The Bacterial Isolate Genome Sequence Database (BIGSdb) software, developed in 2010, accommodates every level of sequence data, from single gene sequences to complete genomes. BIGSdb supports gene-by-gene analysis of microbial genomes: each deposited sequence is annotated and curated to identify genes and their variation. This approach scales to population-level genomic analyses for applications including antimicrobial resistance prediction, vaccine antigen cross-reactivity, and the functional activities of variants. A database can hold any number of sequences, genetic loci, allelic variants, or schemes, so each one represents an expanding catalog of genetic variation. The BIGSdb software also includes a RESTful API for third-party access to data.
The PubMLST.org databases employ a bacterial population genomics approach, combining population genetics with genome-wide sequence data to infer links between phenotype and genotype. This approach is especially suited to resolving complex phenotypes such as virulence and antibiotic resistance in bacterial pathogens. The platform supports a wide range of schemes, including conventional multilocus sequence typing (MLST), core genome MLST (cgMLST), and ribosomal MLST (rMLST), and holds whole genome sequence data for tens of thousands of isolates. BIGSdb has been cited over 990 times and is used for surveillance, vaccine development, and evolutionary analysis, among other applications. Private and public data can be integrated, with authorized users able to upload private data. The platform also supports international surveillance and is part of a developing ecosystem of independent third-party tools.
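The gene-by-gene idea described above can be illustrated with a small sketch. This is not BIGSdb code: the locus names, sequences, allele numbers, and sequence types below are invented for illustration, and exact string matching stands in for the platform's real sequence scanning. It only shows how sequences become allele numbers, how allele numbers form a profile under a scheme, and how two profiles can be compared.

```python
from typing import Dict, Optional, Tuple

Profile = Tuple[Optional[int], ...]

# Hypothetical scheme: an ordered set of loci (names are invented).
LOCI = ("locusA", "locusB", "locusC")

# Hypothetical allele tables: locus -> {sequence: allele number}.
ALLELES: Dict[str, Dict[str, int]] = {
    "locusA": {"ATGACC": 1, "ATGACT": 2},
    "locusB": {"GGCTAA": 1, "GGTTAA": 2},
    "locusC": {"TTGCAT": 1},
}

# Hypothetical profile table: allelic profile -> sequence type (ST).
PROFILES: Dict[Profile, int] = {(1, 1, 1): 1, (2, 1, 1): 2}


def call_alleles(genome_loci: Dict[str, str]) -> Profile:
    """Look up each locus sequence; None marks a missing or undefined allele."""
    return tuple(ALLELES[locus].get(genome_loci.get(locus)) for locus in LOCI)


def sequence_type(profile: Profile) -> Optional[int]:
    """Return the ST for a complete, known profile, otherwise None."""
    return PROFILES.get(profile)


def allelic_distance(p: Profile, q: Profile) -> int:
    """Count loci where both profiles have a designation and they differ."""
    return sum(
        1 for a, b in zip(p, q)
        if a is not None and b is not None and a != b
    )
```

Because every locus is reduced to an integer, the same three functions would apply unchanged to a scheme with seven loci or with thousands; only the tables grow.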
The PubMLST databases and BIGSdb software originated from the development of the MLST approach in 1998 and have evolved to support open-access, curated, and interpreted data. The platform provides a scalable framework for microbial population genomics: by integrating diverse data sources and combining structured datasets, extensive genomic data, and complex query tools, it enables the investigation of a wide range of biological questions. The platform is well positioned to continue serving nomenclatures for this effort, along with extensive collections of structured isolate record data for a wide range of pathogenic and other bacterial species.
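As a rough illustration of third-party access through the RESTful API mentioned above, the sketch below builds request URLs and decodes JSON responses using only the Python standard library. The base URL, the database name used in the usage note, and the `db/.../loci/.../alleles/...` route are assumptions made for this example and should be checked against the current API documentation; the helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public REST interface; verify before use.
BASE_URL = "https://rest.pubmlst.org"


def build_url(*segments, base=BASE_URL):
    """Join path segments onto the base URL, percent-encoding each one."""
    root = base.rstrip("/")
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    return f"{root}/{path}" if path else root


def allele_url(database, locus, allele_id):
    """URL for one allele record in a sequence definition database (assumed route)."""
    return build_url("db", database, "loci", locus, "alleles", allele_id)


def fetch_json(url, timeout=30):
    """Send a GET request and decode the JSON body (performs a network call)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

A call such as `fetch_json(allele_url("pubmlst_neisseria_seqdef", "abcZ", 5))` would then request a single allele record, assuming that database and locus exist under those names. Keeping URL construction separate from the network call makes the route logic testable offline.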