11 OCTOBER 2018 | Clare Bycroft, Colin Freeman, Desislava Petkova, Gavin Band, Lloyd T. Elliott, Kevin Sharp, Allan Motyer, Damjan Vukcevic, Olivier Delaneau, Jared O'Connell, Adrian Cortes, Samantha Welsh, Alan Young, Mark Effingham, Gil McVean, Stephen Leslie, Naomi Allen, Peter Donnelly & Jonathan Marchini
The UK Biobank is a large-scale prospective study that collects deep genetic and phenotypic data from approximately 500,000 individuals in the UK, aged 40-69 at recruitment. It provides a unique resource with extensive phenotypic and health-related information, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Genome-wide genotype data have been collected for all participants, enabling the discovery of new genetic associations and the study of the genetic basis of complex traits. The genetic data were analyzed centrally, covering genotype quality, population structure, relatedness, and efficient phasing and genotype imputation, which increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen (HLA) genes was also imputed, and this recovered signals with known associations between HLA alleles and many diseases.
The UK Biobank collects a wide variety of phenotypic information and biological samples for each participant. Participants gave electronic consent, provided socio-demographic and lifestyle data, and had physical measurements taken. Blood, urine, and saliva samples were collected and stored for various assays. Follow-up data come from linkage to health records, with over 14,000 deaths and 79,000 cancer diagnoses recorded as of May 2018. The resource is open-access, and researchers worldwide are encouraged to use the data for health-related research in the public interest.
The UK Biobank genetic data comprise genotypes for 488,377 participants, assayed on two similar genotyping arrays. The arrays carry a wide range of markers: those with known disease associations, coding variants across a range of minor allele frequencies, and markers that provide good genome-wide coverage for imputation in European populations. Quality control covered batch-level checks, missing rate and heterozygosity metrics, and sex chromosome analysis. Haplotype estimation and genotype imputation then increase the number of testable variants to around 96 million.
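The missing rate and heterozygosity metrics mentioned above can be illustrated with a minimal sketch. It assumes a hypothetical per-sample genotype matrix coded as 0/1/2 alternate-allele counts with NaN for missing calls. The function name and the encoding are illustrative assumptions, not the pipeline used by the authors, and no flagging thresholds are implied.

```python
import numpy as np

def sample_qc_metrics(genotypes):
    """Per-sample missing rate and heterozygosity (illustrative only).

    genotypes: array-like of shape (n_samples, n_markers) holding
    0/1/2 alternate-allele counts, with np.nan marking a missing call.
    This encoding is an assumption made for the example.
    """
    g = np.asarray(genotypes, dtype=float)
    missing = np.isnan(g)

    # Fraction of markers with no call, per sample.
    missing_rate = missing.mean(axis=1)

    # Fraction of called markers that are heterozygous, per sample.
    called = (~missing).sum(axis=1)
    het_calls = (g == 1).sum(axis=1)
    with np.errstate(invalid="ignore", divide="ignore"):
        heterozygosity = np.where(called > 0, het_calls / called, np.nan)

    return missing_rate, heterozygosity

# Toy data: three samples, four markers.
toy = [
    [0, 1, 2, np.nan],
    [1, 1, np.nan, np.nan],
    [0, 0, 0, 0],
]
rates, hets = sample_qc_metrics(toy)
print(rates, hets)
```

Samples with unusually high values on either metric are the usual candidates for a closer look, but where to draw that line depends on the data set.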
The analysis also characterizes ancestral diversity and cryptic relatedness, which helps researchers account for population structure in genetic studies. A large number of participants are related: 147,731 were inferred to be related to at least one other person in the cohort. HLA allele imputation was also carried out and validated, and was used to identify associations between HLA alleles and diseases such as multiple sclerosis.
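To make the idea of "related to at least one other person" concrete, the sketch below assumes that pairwise kinship coefficients have already been estimated by some external tool. The cutoffs are the conventional powers-of-two kinship boundaries used by tools such as KING. They are an assumption for this example, not values taken from the text above, and the function names are hypothetical.

```python
# Conventional kinship-coefficient boundaries (assumed for illustration).
DUPLICATE_MIN = 2 ** -1.5      # about 0.354
FIRST_DEGREE_MIN = 2 ** -2.5   # about 0.177
SECOND_DEGREE_MIN = 2 ** -3.5  # about 0.0884
THIRD_DEGREE_MIN = 2 ** -4.5   # about 0.0442

def relatedness_degree(kinship):
    """Map a kinship coefficient to a coarse relationship class."""
    if kinship > DUPLICATE_MIN:
        return "duplicate_or_mz_twin"
    if kinship > FIRST_DEGREE_MIN:
        return "first_degree"
    if kinship > SECOND_DEGREE_MIN:
        return "second_degree"
    if kinship > THIRD_DEGREE_MIN:
        return "third_degree"
    return "unrelated"

def participants_with_relatives(pairs):
    """Return IDs that appear in at least one related pair.

    pairs: iterable of (id_a, id_b, kinship) tuples.
    """
    related = set()
    for id_a, id_b, kinship in pairs:
        if relatedness_degree(kinship) != "unrelated":
            related.update((id_a, id_b))
    return related

# Toy pairs with made-up IDs and kinship values.
toy_pairs = [("A", "B", 0.25), ("C", "D", 0.01), ("A", "E", 0.06)]
print(sorted(participants_with_relatives(toy_pairs)))
```

Counting the members of the returned set gives the kind of figure quoted above: the number of participants with at least one inferred relative in the cohort.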
The authors also ran a genome-wide association study (GWAS) for standing height and compared the results with a meta-analysis from the Genetic Investigation of Anthropometric Traits (GIANT) Consortium. The UK Biobank data gave higher power to detect associations, with many loci reaching genome-wide significance.
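As a rough illustration of what a single-marker association test in a GWAS involves, the sketch below regresses a quantitative trait on genotype dosage at one marker. It is a simplification under stated assumptions: no covariates, no correction for relatedness or population structure, a normal approximation for the p-value (reasonable only for large samples), and the conventional 5e-8 threshold for genome-wide significance. None of these choices are taken from the text above, and the function name is hypothetical.

```python
import math
import numpy as np

GENOME_WIDE_ALPHA = 5e-8  # conventional threshold, assumed for this example

def single_marker_test(dosage, phenotype):
    """Simple linear regression of phenotype on dosage at one marker.

    Returns (beta, standard_error, two_sided_p). The p-value uses a
    normal approximation to the t distribution.
    """
    x = np.asarray(dosage, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n = x.size

    # Center both variables so the intercept drops out.
    xc = x - x.mean()
    yc = y - y.mean()

    sxx = float(xc @ xc)
    beta = float(xc @ yc) / sxx

    # Residual variance with n - 2 degrees of freedom.
    resid = yc - beta * xc
    sigma2 = float(resid @ resid) / (n - 2)
    se = math.sqrt(sigma2 / sxx)

    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p

# Simulated data: one marker with a true effect of 0.5 trait units per allele.
rng = np.random.default_rng(0)
dosage = rng.binomial(2, 0.3, size=5000)
trait = 0.5 * dosage + rng.normal(size=5000)
beta, se, p = single_marker_test(dosage, trait)
print(beta, se, p, p < GENOME_WIDE_ALPHA)
```

Because the standard error shrinks as the sample grows, the same effect size yields a smaller p-value in a larger cohort, which is the sense in which a bigger data set has more power to detect associations.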