Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program

11 February 2021 | Unknown Author
The NHLBI Trans-Omics for Precision Medicine (TOPMed) Program sequenced 53,831 diverse genomes to better understand the genetic basis of heart, lung, blood, and sleep disorders. The study identified over 410 million genetic variants: 381 million single-nucleotide variants (SNVs) and 29 million insertions/deletions (indels). Most variants are rare: 97% have a frequency below 1%, and 46% are singletons, present in only one individual. These rare variants offer insights into mutation processes and human evolutionary history, and the catalog as a whole provides opportunities to explore the role of rare and noncoding variants in phenotypic variation. Combining TOPMed haplotypes with modern imputation methods improves the power of genome-wide association studies, enabling analysis of variants with frequencies as low as 0.01%.

The study used high-coverage whole-genome sequencing (WGS) and advanced quality control methods to ensure data reliability. The data are available through dbGaP, with over 130,000 TOPMed samples now accessible. The program includes over 80 studies, 1,000 investigators, and 30 working groups, with participants from diverse ethnic backgrounds.

The study identified 228,966 putative loss-of-function (pLOF) variants in 18,493 genes, a high proportion of them singletons. These variants are enriched in genes related to human disease and functional constraints. Across the genome, most variation is rare and located in noncoding regions. Some regions, such as chromosome 8p, have high levels of variation, and rare variants are more common in certain genomic regions, such as promoters and 5' untranslated regions. The study also identified retained ancestral sequences, which may have been deleted in some human lineages. Across the genome, levels of common and rare variation are significantly correlated.
However, there are outliers, such as the major histocompatibility complex (MHC), which has high common variation but not necessarily high rare variation. The study also identified signatures of mutation processes, such as trans-lesion synthesis and maternally derived C>G mutation clusters. In the CYP2D6 gene, which is involved in drug metabolism, the study identified 99 alleles, including 33 novel ones, representing increased, decreased, or loss of function. Analysis of heterozygosity and rare variant sharing showed that African American and Caribbean populations have the highest heterozygosity, while Asian populations have the lowest heterozygosity but the highest singleton counts. Rare variants are also shared between population groups, with the Amish showing unique patterns of rare variant sharing. The study also analyzed the distribution of genetic variation in different ...
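The frequency classes quoted in the summary (singletons, and variants below 1% frequency) can be made concrete with a short sketch. This is an illustration only, not the study's pipeline: the function names, the toy allele counts, and the convention that a singleton is a variant with exactly one alternate-allele copy on diploid autosomes are all assumptions introduced here.

```python
from collections import Counter

RARE_THRESHOLD = 0.01  # the 1% frequency cutoff quoted in the summary


def classify_variant(allele_count: int, n_genomes: int) -> str:
    """Label a variant by its alternate-allele count in a diploid sample set.

    Assumption: autosomal variants, so each genome contributes two chromosomes.
    """
    n_chromosomes = 2 * n_genomes
    if not 1 <= allele_count <= n_chromosomes:
        raise ValueError("allele_count must be between 1 and 2 * n_genomes")
    if allele_count == 1:
        # One copy can only sit in one individual; singletons are also rare.
        return "singleton"
    if allele_count / n_chromosomes < RARE_THRESHOLD:
        return "rare"
    return "common"


def summarize(allele_counts, n_genomes):
    """Count how many variants fall into each frequency class."""
    return dict(Counter(classify_variant(ac, n_genomes) for ac in allele_counts))


def expected_copies(frequency: float, n_genomes: int) -> float:
    """Number of allele copies that a given frequency corresponds to."""
    return frequency * 2 * n_genomes


if __name__ == "__main__":
    n = 53_831  # genomes reported in the summary
    toy_counts = [1, 1, 3, 40, 1500, 60000]  # made-up allele counts
    print(summarize(toy_counts, n))
    print(round(expected_copies(0.0001, n), 1))
```

In a sample of 53,831 diploid genomes there are 107,662 chromosomes, so the 1% cutoff falls at about 1,077 allele copies and a frequency of 0.01% corresponds to roughly 11 copies.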