April 2012 | Mette V. Larsen, Salvatore Cosentino, Simon Rasmussen, Carsten Friis, Henrik Hasman, Rasmus Lykke Marvig, Lars Jelsbak, Thomas Sicheritz-Pontén, David W. Ussery, Frank M. Aarestrup, Ole Lund
The article presents a web-based method for multilocus sequence typing (MLST) of 66 bacterial species from whole-genome sequencing (WGS) data. Given short sequence reads from any of four sequencing platforms, or a preassembled genome, the method identifies the sequence type (ST) of a bacterial isolate. It downloads updates from the MLST databases monthly, determines the best-matching MLST alleles using a BLAST-based ranking method, and derives the ST from the combination of identified alleles. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, short sequence reads from 387 isolates covering 10 schemes, and a small test set of short sequence reads from 29 isolates. The method is publicly available at www.cbs.dtu.dk/services/MLST.
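The allele-matching step can be illustrated with a minimal Python sketch. The summary does not spell out the ranking criteria, so the sketch assumes that hits are ranked first by the fraction of the database allele covered by the alignment and then by percent identity. The `Hit` record, its field names, and the example values are hypothetical and are not taken from the published method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment between the query genome and a database allele.

    Hypothetical record for illustration; a real pipeline would fill
    these fields from parsed BLAST output.
    """
    locus: str
    allele: int
    allele_length: int   # full length of the database allele (bp)
    aligned_length: int  # length of the aligned region (bp)
    identity: float      # percent identity of the alignment


def rank_key(hit):
    """Assumed ranking: higher allele coverage first, then higher identity."""
    coverage = hit.aligned_length / hit.allele_length
    return (coverage, hit.identity)


def best_alleles(hits):
    """Return the best-ranked hit for each locus as a dict keyed by locus."""
    best = {}
    for hit in hits:
        current = best.get(hit.locus)
        if current is None or rank_key(hit) > rank_key(current):
            best[hit.locus] = hit
    return best


if __name__ == "__main__":
    # Made-up hits for two loci of an imaginary scheme.
    hits = [
        Hit("locusA", 1, 450, 450, 99.1),
        Hit("locusA", 2, 450, 450, 100.0),
        Hit("locusA", 3, 450, 300, 100.0),
        Hit("locusB", 5, 460, 460, 100.0),
    ]
    for locus, hit in sorted(best_alleles(hits).items()):
        print(locus, hit.allele)
```

Ranking by coverage before identity keeps a short but perfect partial match from outranking a full-length match. Whether the published method weighs these criteria in the same way is not stated in this summary.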
Accurate strain identification is essential for anyone working with bacteria. MLST is considered the "gold standard" of typing for many species, but it is traditionally expensive and time-consuming. As the cost of WGS continues to decline, the technology is becoming increasingly available. The new challenge is to extract the relevant information from large data sets so that results can be compared over time and between laboratories. The method presented here enables investigators to determine the sequence types of their isolates from WGS data.
MLST was first developed for Neisseria meningitidis in 1998 to overcome the poor reproducibility between laboratories of older molecular typing schemes. The principle behind MLST is to identify internal nucleotide sequences of approximately 400 to 500 bp in multiple housekeeping genes. Each unique sequence (allele) at a locus is assigned an arbitrary integer, and each unique combination of alleles across the loci, an "allelic profile," specifies the sequence type (ST). Since the introduction of the Neisseria MLST scheme, MLST has been considered the "gold standard" of typing, and additional schemes have been developed for bacterial and fungal species. The MLST allele sequences and ST profile tables are stored in curated databases hosted at different sites around the world.
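As a small illustration of how an allelic profile maps to an ST, the following Python sketch looks up a profile in a table. The three-locus scheme, the locus names, and the allele and ST numbers are invented for the example. Real schemes define their own loci, and their profile tables come from the curated databases.

```python
def sequence_type(profile_table, loci, alleles):
    """Return the ST for an allelic profile, or None if it cannot be resolved.

    profile_table maps a tuple of allele numbers (in the order given by
    `loci`) to an ST number. `alleles` maps each locus name to the allele
    number identified for the isolate.
    """
    try:
        profile = tuple(alleles[locus] for locus in loci)
    except KeyError:
        # At least one locus has no identified allele, so no ST can be called.
        return None
    return profile_table.get(profile)


if __name__ == "__main__":
    # Hypothetical scheme with three loci and two known profiles.
    loci = ("locusA", "locusB", "locusC")
    profile_table = {
        (1, 1, 1): 1,
        (2, 5, 1): 2,
    }
    isolate = {"locusA": 2, "locusB": 5, "locusC": 1}
    print(sequence_type(profile_table, loci, isolate))
```

A profile that is absent from the table returns `None`, which in practice could indicate either a new ST or an incorrectly called allele.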
Traditionally, MLST starts with a PCR amplification step using primers specific for the loci of the MLST scheme, followed by Sanger sequencing. The procedure is both costly and time-consuming. In this new era of high-throughput sequencing, it may be more rational to use WGS data for typing. The cost of DNA sequencing has steadily gone down, and the development of next- and third-generation sequencing methods has provided equally great reductions in equipment investments, making the technology accessible to individual investigators and routine clinical and microbial laboratories. The challenge, however, is to extract the relevant information from the large amount of data generated by these techniques. To allow comparison with results obtained by other commonly used technologies and with