2016 | Andrew J. Page, Ben Taylor, Aidan J. Delaney, Jorge Soares, Torsten Seemann, Jacqueline A. Keane and Simon R. Harris
The paper introduces *SNP-sites*, a software tool designed to efficiently extract single nucleotide polymorphisms (SNPs) from multi-FASTA alignments. The tool addresses the limitations of existing methods, which are slow, memory-intensive, and difficult to install. *SNP-sites* is implemented in C and scales to large datasets: an 8.3 GB alignment file with 1,842 taxa and 22,618 sites was processed in 267 seconds using 59 MB of RAM and a single CPU core. It supports multiple output formats, including VCF, PHYLIP, and FASTA, and is easily installable through the Debian and Homebrew package managers. The software has been tested on over 20 operating systems and architectures. Performance comparisons with other tools, namely JVarKit, TrimAl, and PySnpSites, show that *SNP-sites* is significantly more efficient in both memory usage and running time, making it suitable for large-scale population studies in prokaryotes.
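To make the core idea concrete, here is a minimal Python sketch of extracting variable columns from a small multi-FASTA alignment. It is not the *SNP-sites* implementation, which is written in C and designed for low memory use. The function names (`read_fasta`, `variable_sites`, `snp_alignment`), the example sequences, and the choice to ignore gaps (`-`) and `N` when deciding whether a column varies are all assumptions made for illustration.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def variable_sites(records, ignore=frozenset("-N")):
    """Return 0-based indices of columns with more than one distinct base.

    Characters in `ignore` (an assumption of this sketch) do not count
    as evidence of variation.
    """
    if not records:
        return []
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("all sequences in an alignment must have equal length")
    sites = []
    for i in range(length):
        bases = {seq[i] for _, seq in records} - ignore
        if len(bases) > 1:
            sites.append(i)
    return sites


def snp_alignment(records):
    """Return the variable site indices and the alignment reduced to them."""
    records = list(records)
    sites = variable_sites(records)
    reduced = [(name, "".join(seq[i] for i in sites)) for name, seq in records]
    return sites, reduced


# Hypothetical three-sample alignment, used only for illustration.
EXAMPLE = """>sample1
ACGTACGT
>sample2
ACGTTCGA
>sample3
AC-TACGA
"""

sites, reduced = snp_alignment(read_fasta(StringIO(EXAMPLE)))
print(sites)    # [4, 7]
print(reduced)  # [('sample1', 'AT'), ('sample2', 'TA'), ('sample3', 'AA')]
```

This sketch holds the whole alignment in memory, which is exactly the cost that becomes prohibitive at the multi-gigabyte scale the paper reports; it only illustrates what "extracting SNP sites" means, not how to do it efficiently.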