2014 | Thorfinn Sand Korneliussen, Anders Albrechtsen and Rasmus Nielsen
The paper introduces ANGSD, a multithreaded program suite for analyzing next-generation sequencing (NGS) data. ANGSD calculates various summary statistics and performs association mapping and population genetic analyses, working either directly on raw sequencing data or on genotype likelihoods (GLs). The program supports multiple input formats, including BAM and imputed BEAGLE genotype probability files, and lets users choose between different methods for intermediate analysis steps. Key features include efficient handling of low to medium coverage data, incorporation of statistical uncertainty in genotype calling, and analyses that are not available in other software. The program is open source, tested on GNU/Linux systems, and can be downloaded from http://www.popgen.dk/angsd.
The paper also discusses the importance of using GLs for low-coverage data and provides examples of how ANGSD can be used for SNP discovery, genotype calling, and population genetic analyses, including joint site frequency spectrum estimation and ABBA-BABA D-statistic analysis.
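To illustrate why GLs matter at low coverage, here is a minimal Python sketch of a genotype likelihood calculation. It assumes a simple model in which reads are independent and a sequencing error turns the true base into any of the other three with equal probability. The function names and the example reads are hypothetical, and this is not ANGSD's own code or necessarily any of the GL models it offers.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"


def genotype_log_likelihoods(reads):
    """Return {genotype: log-likelihood} for the 10 diploid genotypes.

    reads is a list of (observed_base, error_probability) pairs.
    Assumed model: reads are independent, each read samples one of the
    two alleles with probability 0.5, and an error yields any of the
    three other bases with equal probability.
    """
    result = {}
    for a1, a2 in combinations_with_replacement(BASES, 2):
        loglik = 0.0
        for base, err in reads:
            p1 = 1.0 - err if base == a1 else err / 3.0
            p2 = 1.0 - err if base == a2 else err / 3.0
            loglik += math.log(0.5 * p1 + 0.5 * p2)
        result[a1 + a2] = loglik
    return result


def normalized_likelihoods(loglik):
    """Rescale log-likelihoods so they sum to 1 (flat prior over genotypes)."""
    top = max(loglik.values())
    scaled = {g: math.exp(v - top) for g, v in loglik.items()}
    total = sum(scaled.values())
    return {g: v / total for g, v in scaled.items()}


# Two reads, both showing "A", each with a 1% error probability.
example = normalized_likelihoods(genotype_log_likelihoods([("A", 0.01), ("A", 0.01)]))
```

With only two reads, the homozygous genotype AA is the most likely, but heterozygous genotypes carrying an A keep substantial weight. A hard genotype call would discard that uncertainty, whereas analyses based on GLs carry it forward.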