LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants

2 July 2015 | Mitchell J. Machiela* and Stephen J. Chanock
LDlink is a free, publicly available web tool for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. It is accessible at http://analysistools.nci.nih.gov/LDlink/. Users query single nucleotide polymorphisms (SNPs) in population groups of interest to generate haplotype tables and interactive plots. The tool is designed for ease of use, query flexibility, and interactive visualization of results. It uses Phase 3 haplotype data from the 1000 Genomes Project (1000G) to calculate pairwise metrics of linkage disequilibrium (LD), search for proxies in high LD, and enumerate all observed haplotypes.
LDlink is tailored for investigators interested in mapping common and uncommon disease susceptibility loci: its output links correlated alleles and highlights putative functional variants. Its modules (LDhap, LDmatrix, LDpair, and LDproxy) draw on reference haplotypes from 26 population groups in 1000G and integrate expanded population reference sets, updated functional annotations, and interactive output for exploring possible functional variants in high LD. Any combination of super-populations or sub-populations can be supplied as input, depending on the investigator's interest.
Genetic reference data for LDlink originate from the Phase 3 release of 1000G. The release contains over 5000 haplotypes from individuals spanning 26 ancestral population groups. Statistical phasing of the genotype data allows the construction of extended haplotypes, which are available for public download from the 1000G FTP site in VCF format. The genotype set is complete: every individual has a called genotype at every included locus. Sample panel files map each individual to their ancestral sub-population.
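To illustrate how a sample panel and phased haplotypes can be combined, the following is a minimal Python sketch, not LDlink's own code. The names `PANEL`, `PHASED`, `select_samples`, and `haplotype_frequencies` are hypothetical, and the sample IDs and alleles are toy values invented for the example. The sketch assumes each individual carries two phased haplotypes over the same ordered list of SNPs, and that a population query may mix super-population and sub-population codes.

```python
from collections import Counter

# Hypothetical toy panel: sample ID -> (sub-population, super-population).
PANEL = {
    "S1": ("CEU", "EUR"),
    "S2": ("CEU", "EUR"),
    "S3": ("YRI", "AFR"),
    "S4": ("GBR", "EUR"),
}

# Hypothetical phased data: sample ID -> two haplotypes, each written as a
# string of alleles at the same ordered list of query SNPs.
PHASED = {
    "S1": ("AG", "AG"),
    "S2": ("AG", "CT"),
    "S3": ("CT", "CG"),
    "S4": ("AG", "CT"),
}


def select_samples(panel, populations):
    """Return sample IDs whose sub- or super-population code was requested."""
    wanted = set(populations)
    return [s for s, (sub, sup) in panel.items() if sub in wanted or sup in wanted]


def haplotype_frequencies(phased, samples):
    """Enumerate the haplotypes observed in `samples` and return their frequencies."""
    counts = Counter(hap for s in samples for hap in phased[s])
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


# Query a super-population alone, then a mix of sub- and super-population codes.
eur = select_samples(PANEL, ["EUR"])
eur_freqs = haplotype_frequencies(PHASED, eur)
mixed_freqs = haplotype_frequencies(PHASED, select_samples(PANEL, ["CEU", "AFR"]))
```

Because selection happens before counting, the same frequency function serves any population combination; only the list of selected samples changes.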
Available modules include LDhap, LDmatrix, LDpair, and LDproxy. LDhap calculates population-specific haplotype frequencies of all haplotypes observed for a list of query SNPs. LDmatrix creates interactive heat map matrices of pairwise LD statistics from a list of SNP RS numbers and a specified population. LDpair generates 2 by 2 tables of observed haplotypes for a pair of SNPs and reports haplotype and allele frequencies as well as measures of linkage disequilibrium. The LDproxy module interactively explores proxy and putatively functional SNPs for a query SNP in a selected 1000G population. Interactive plots show linkage disequilibrium over genomic distance where data point size, color and labels are used to highlight minor allele frequency and predicted function.
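The kind of calculation behind a 2 by 2 haplotype table can be sketched as follows. This is an illustrative Python example using the standard textbook definitions of D' and r², not LDlink's implementation; the text above does not specify which LD measures LDpair reports, so the choice of these two is an assumption. The function name `ld_from_haplotype_counts` and the example counts are hypothetical.

```python
def ld_from_haplotype_counts(n_AB, n_Ab, n_aB, n_ab):
    """Compute allele frequencies, D, |D'|, and r^2 from 2x2 haplotype counts.

    Alleles A/a belong to the first SNP and B/b to the second.
    Returns None if either SNP is monomorphic, where LD is undefined.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        return None

    p_A = (n_AB + n_Ab) / n  # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / n  # frequency of allele B at SNP 2
    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        return None

    # D is the deviation of the observed AB haplotype frequency from the
    # frequency expected if the two SNPs were independent.
    d = n_AB / n - p_A * p_B

    # D' rescales D by its maximum possible magnitude given the allele
    # frequencies; the absolute value is returned here.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max

    # r^2 is the squared correlation between alleles at the two SNPs.
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))

    return {"p_A": p_A, "p_B": p_B, "D": d, "D_prime": d_prime, "r2": r2}


# Toy counts: 100 haplotypes, with AB and ab over-represented.
stats = ld_from_haplotype_counts(n_AB=40, n_Ab=10, n_aB=10, n_ab=40)
```

With these toy counts both allele frequencies are 0.5 and the AB haplotype frequency is 0.4, so D = 0.15, D' = 0.6, and r² = 0.36. A proxy search in the spirit of LDproxy could apply the same function to each candidate SNP paired with the query SNP and rank the results by r² or D'.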