This study presents an integrated map of structural variation (SV) in 2,504 human genomes, covering eight SV classes, including balanced and unbalanced variants. The data was generated using short-read DNA sequencing and statistically phased onto haplotype blocks in 26 human populations. The study identifies numerous gene-intersecting SVs with population stratification and describes naturally occurring homozygous gene knockouts, suggesting the dispensability of various human genes. Structural variants are enriched on haplotypes identified by genome-wide association studies and show enrichment for expression quantitative trait loci. The study also reveals significant structural variant complexity at different scales, including genic loci with clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events.
The study highlights the population diversity of SVs, quantifies their functional impact, and emphasizes previously understudied SV classes, including inversions with marked sequence complexity. The SV catalog enhances future studies on structural variant demography, functional impact, and disease association. Structural variants, including deletions, insertions, duplications, and inversions, account for most varying base pairs among human genomes. SVs are implicated in numerous diseases and have been challenging to discover and genotype due to their occurrence in repetitive regions and complex internal structures. Despite advances in methodology and technology, efforts to discover, genotype, and statistically phase all major SV classes have been limited.
The study's objective was to discover and genotype major SV classes in diverse populations and generate a statistically phased reference panel. The study reports an integrated map of 68,818 SVs in unrelated individuals from 26 populations. The resource was constructed by analyzing 1000 Genomes Project phase 3 whole-genome sequencing data and data from orthogonal techniques, including long-read single-molecule sequencing.
The SV release was constructed by mapping Illumina WGS data onto an amended version of the GRCh37 reference assembly with two independent mapping algorithms, then performing SV discovery and genotyping with an ensemble of nine different algorithms. Several orthogonal experimental platforms were used for SV set assessment, refinement, and characterization. Compared with the Database of Genomic Variants, 60% of SVs were novel; compared with previous 1000 Genomes Project releases, 71% were novel. SVs were enriched for rare sites, and the false discovery rate was consistently estimated at ≤5.4% for deletions and duplications.
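As a rough illustration of how a novelty fraction like the ones quoted above could be computed, the sketch below treats a call as known when a catalog entry of the same type on the same chromosome shares at least 50% reciprocal overlap with it. The 50% threshold, the `SV` record layout, and the function names are assumptions made for this example; the source does not state the matching criterion the study used.

```python
from typing import NamedTuple, Sequence


class SV(NamedTuple):
    """Hypothetical minimal SV record (half-open interval)."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    svtype: str  # e.g. "DEL", "DUP"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def novel_fraction(calls: Sequence[SV], catalog: Sequence[SV],
                   min_ro: float = 0.5) -> float:
    """Fraction of calls with no same-type catalog match at >= min_ro.

    min_ro = 0.5 is an assumed threshold, not taken from the study.
    """
    if not calls:
        return 0.0
    novel = sum(
        1 for c in calls
        if not any(k.svtype == c.svtype and reciprocal_overlap(c, k) >= min_ro
                   for k in catalog)
    )
    return novel / len(calls)
```

The all-pairs comparison is quadratic and only suitable for small inputs; a catalog of tens of thousands of SVs would normally be indexed per chromosome, for example with an interval tree.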
The study also found that individuals of African ancestry had 27% more heterozygous deletions than individuals from other populations. The study identified 1,075 SVs with VAF >50%, encompassing 5 Mbp
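To make the VAF threshold concrete: variant allele frequency is the share of called haplotypes that carry the non-reference allele, so a site with VAF >50% is one where the reference genome holds the minor allele. The following is a minimal sketch under assumed inputs; the genotype encoding (pairs of 0/1 alleles, `None` for a missing call) and the function names are hypothetical and not taken from the study's pipeline.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical encoding: one (allele, allele) pair per diploid sample,
# 0 = reference allele, 1 = variant allele, None = missing call.
Genotype = Tuple[Optional[int], Optional[int]]


def variant_allele_frequency(genotypes: Sequence[Genotype]) -> Optional[float]:
    """Fraction of called alleles that are the variant allele."""
    called = [allele for gt in genotypes for allele in gt if allele is not None]
    if not called:
        return None  # no called alleles, frequency undefined
    return sum(called) / len(called)


def sites_above_vaf(site_genotypes: Dict[str, Sequence[Genotype]],
                    threshold: float = 0.5) -> List[str]:
    """Return IDs of sites whose VAF strictly exceeds the threshold."""
    selected = []
    for site_id, genotypes in site_genotypes.items():
        vaf = variant_allele_frequency(genotypes)
        if vaf is not None and vaf > threshold:
            selected.append(site_id)
    return selected
```

With 2,504 diploid genomes, a fully genotyped autosomal site contributes 5,008 alleles to the denominator.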