Stacks: Building and Genotyping Loci De Novo From Short-Read Sequences

August 2011 | Julian M. Catchen, Angel Amores, Paul Hohenlohe, William Cresko, and John H. Postlethwait
Stacks is a software system that uses short-read sequence data to identify and genotype loci in a set of individuals, either de novo or by comparison to a reference genome. It can recover thousands of single nucleotide polymorphism (SNP) markers useful for genetic analysis from reduced-representation Illumina sequence data such as RAD-tags. Stacks can generate markers for ultra-dense genetic linkage maps, facilitate population phylogeography studies, and aid in reference genome assembly.

Stacks was developed to build meiotic maps but can be used for nearly any analysis of genomically localized short-read sequences. It processes RAD-seq data to identify loci and their constituent alleles in each individual, using a maximum likelihood statistical model to identify sequence polymorphisms and distinguish them from sequencing errors. This step creates a Catalog of parental loci. Stacks then matches progeny against the Catalog to determine haplotypes at each locus in every individual. It can also use a reference genome to identify loci and call genotypes.

Stacks stores results in a MySQL database and displays them through a web interface, implemented in PHP, that facilitates marker annotation. It can export data as genotypes for JoinMap or R/qtl, or as a set of observed haplotypes for a general population. The component programs are written in C++ and Perl, with core algorithms parallelized using OpenMP libraries.

Stacks has been tested with simulated RAD-tag data from the stickleback reference genome. It has also been used to construct a zebrafish genetic map by RAD-tag mapping of a doubled haploid mapping panel, demonstrating its efficacy and efficiency in inferring genetic loci and calling genotypes automatically. Stacks is available as open source software under the GPL license and can be downloaded from http://creskolab.uoregon.edu/stacks/.
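
To make the idea of separating true polymorphisms from sequencing errors concrete, the following is a minimal sketch of a likelihood-ratio call at a single nucleotide site. It is not the Stacks implementation. The summary above states only that a maximum likelihood model is used, so the multinomial likelihoods, the fixed per-read error rate, the 3.84 threshold (the 5% critical value of a chi-square distribution with one degree of freedom), and the names `call_site`, `error_rate`, and `lr_threshold` are all assumptions made for illustration.

```python
import math
from collections import Counter


def call_site(bases, error_rate=0.01, lr_threshold=3.84):
    """Classify one site as 'homozygous', 'heterozygous', or 'unknown'.

    bases: iterable of nucleotide characters observed at this site, one per read.
    error_rate: ASSUMED fixed per-read sequencing error rate (illustrative value).
    lr_threshold: ASSUMED cutoff on the likelihood-ratio statistic.
    """
    counts = Counter(b for b in bases if b in "ACGT")
    ranked = counts.most_common()               # nucleotides by descending count
    ranked += [(None, 0)] * (4 - len(ranked))   # pad so four entries always exist
    (nuc1, n1), (nuc2, n2) = ranked[0], ranked[1]
    n_rest = sum(n for _, n in ranked[2:])
    if n1 == 0:
        return ("unknown", ())

    e = error_rate
    # Homozygote: every read that is not the major nucleotide is an error.
    ln_hom = n1 * math.log(1 - 3 * e / 4) + (n2 + n_rest) * math.log(e / 4)
    # Heterozygote: the two most common nucleotides are both true alleles.
    ln_het = (n1 + n2) * math.log(0.5 - e / 4) + n_rest * math.log(e / 4)

    # The multinomial coefficient is the same in both models and cancels here.
    lr = 2 * (ln_hom - ln_het)
    if lr >= lr_threshold:
        return ("homozygous", (nuc1,))
    if lr <= -lr_threshold:
        return ("heterozygous", tuple(sorted((nuc1, nuc2))))
    return ("unknown", ())  # neither model is clearly favoured


if __name__ == "__main__":
    print(call_site("A" * 19 + "G"))       # one stray G read among twenty
    print(call_site("A" * 10 + "G" * 10))  # balanced counts of two nucleotides
    print(call_site("A" * 5 + "G"))        # shallow coverage
```

Under these assumptions, a single discordant read at a depth of twenty is treated as an error, balanced counts are called heterozygous, and a shallow site is left uncalled because neither model is clearly favoured.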
The software can also process paired-end mini-contigs and other sequence sets, so that Stacks markers can be associated with additional sequences such as mini-contigs and ESTs. The web-based interface allows for viewing, annotating, and correcting loci in a population. Stacks has been shown to be effective in identifying loci even in the presence of sequencing errors and has been used to generate high-density genetic maps with fewer markers than previously available methods. The software is available for download along with example data, tutorials, and other documentation.
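
The Catalog workflow summarized above (build a Catalog from parental loci, then match progeny against it to read off haplotypes) can be sketched in a few lines. This is a toy illustration, not the Stacks algorithm. It assumes all sequences have the same length, merges parental loci whose alleles differ by at most `max_mismatches` positions, and matches progeny alleles to the Catalog by exact sequence identity. The function names `hamming`, `build_catalog`, and `match_to_catalog` and the example sequences are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def build_catalog(parents, max_mismatches=1):
    """Merge per-parent loci into a catalog (ASSUMED merging rule).

    parents: list of parents; each parent is a list of loci; each locus is a
             set of allele sequences found in that parent.
    Returns a list of sets; catalog[i] holds every allele seen at locus i.
    """
    catalog = []
    for loci in parents:
        for alleles in loci:
            for entry in catalog:
                # Same locus if any allele is within the mismatch limit.
                if any(hamming(a, c) <= max_mismatches
                       for a in alleles for c in entry):
                    entry.update(alleles)
                    break
            else:
                catalog.append(set(alleles))  # no match: a new catalog locus
    return catalog


def match_to_catalog(catalog, progeny_loci):
    """Return {catalog_index: sorted haplotypes} for one progeny individual.

    ASSUMPTION: progeny alleles must match a catalog allele exactly.
    """
    calls = {}
    for alleles in progeny_loci:
        for idx, entry in enumerate(catalog):
            matched = sorted(a for a in alleles if a in entry)
            if matched:
                calls[idx] = matched
                break
    return calls


if __name__ == "__main__":
    # Hypothetical six-base "loci" for two parents and one progeny.
    mother = [{"ACGTAC"}, {"TTGACA", "TTGACC"}]
    father = [{"ACGTAT"}, {"TTGACA"}]
    catalog = build_catalog([mother, father])
    progeny = [{"ACGTAC", "ACGTAT"}, {"TTGACC"}]
    print(match_to_catalog(catalog, progeny))
```

In this toy data the two parents carry different alleles at the first locus, so the progeny's pair of haplotypes there shows that it inherited one allele from each parent.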