Genome-wide association analyses using high-throughput metabolomics platforms have led to novel insights into the biology of human metabolism1,2,3,4,5,6,7. This detailed knowledge of the genetic... Show moreGenome-wide association analyses using high-throughput metabolomics platforms have led to novel insights into the biology of human metabolism1,2,3,4,5,6,7. This detailed knowledge of the genetic determinants of systemic metabolism has been pivotal for uncovering how genetic pathways influence biological mechanisms and complex diseases8,9,10,11. Here we present a genome-wide association study for 233 circulating metabolic traits quantified by nuclear magnetic resonance spectroscopy in up to 136,016 participants from 33 cohorts. We identify more than 400 independent loci and assign probable causal genes at two-thirds of these using manual curation of plausible biological candidates. We highlight the importance of sample and participant characteristics that can have significant effects on genetic associations. We use detailed metabolic profiling of lipoprotein- and lipid-associated variants to better characterize how known lipid loci and novel loci affect lipoprotein metabolism at a granular level. We demonstrate the translational utility of comprehensively phenotyped molecular data, characterizing the metabolic associations of intrahepatic cholestasis of pregnancy. Finally, we observe substantial genetic pleiotropy for multiple metabolic pathways and illustrate the importance of careful instrument selection in Mendelian randomization analysis, revealing a putative causal relationship between acetone and hypertension. Our publicly available results provide a foundational resource for the community to examine the role of metabolism across diverse diseases. Show less
Lung-function impairment underlies chronic obstructive pulmonary disease (COPD) and predicts mortality. In the largest multi-ancestry genome-wide association meta-analysis of lung function to date,... Show moreLung-function impairment underlies chronic obstructive pulmonary disease (COPD) and predicts mortality. In the largest multi-ancestry genome-wide association meta-analysis of lung function to date, comprising 580,869 participants, we identified 1,020 independent association signals implicating 559 genes supported by & GE;2 criteria from a systematic variant-to-gene mapping framework. These genes were enriched in 29 pathways. Individual variants showed heterogeneity across ancestries, age and smoking groups, and collectively as a genetic risk score showed strong association with COPD across ancestry groups. We undertook phenome-wide association studies for selected associated variants as well as trait and pathway-specific genetic risk scores to infer possible consequences of intervening in pathways underlying lung function. We highlight new putative causal variants, genes, proteins and pathways, including those targeted by existing drugs. These findings bring us closer to understanding the mechanisms underlying lung function and COPD, and should inform functional genomics experiments and potentially future COPD therapies.Multi-ancestry genome-wide association analyses and systematic variant-to-gene mapping strategies implicate new genes and pathways influencing lung function and chronic obstructive pulmonary disease risk. Show less
Background: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently... Show moreBackground: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. Results: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N=1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3-5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. Conclusions: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk. Show less
Previous genome-wide association studies (GWASs) of stroke - the second leading cause of death worldwide - were conducted predominantly in populations of European ancestry(1,2). Here, in cross... Show morePrevious genome-wide association studies (GWASs) of stroke - the second leading cause of death worldwide - were conducted predominantly in populations of European ancestry(1,2). Here, in cross-ancestry GWAS meta-analyses of 110,182 patients who have had a stroke (five ancestries, 33% non-European) and 1,503,898 control individuals, we identify association signals for stroke and its subtypes at 89 (61 new) independent loci: 60 in primary inverse-variance-weighted analyses and 29 in secondary meta-regression and multitrait analyses. On the basis of internal cross-ancestry validation and an independent follow-up in 89,084 additional cases of stroke (30% non-European) and 1,013,843 control individuals, 87% of the primary stroke risk loci and 60% of the secondary stroke risk loci were replicated (P < 0.05). Effect sizes were highly correlated across ancestries. Cross-ancestry fine-mapping, in silico mutagenesis analysis(3), and transcriptome-wide and proteome-wide association analyses revealed putative causal genes (such as SH3PXD2A and FURIN) and variants (such as at GRK5 and NOS3). Using a three-pronged approach(4), we provide genetic evidence for putative drug effects, highlighting F11, KLKB1, PROC, GP1BA, LAMC2 and VCAM1 as possible targets, with drugs already under investigation for stroke for F11 and PROC. A polygenic score integrating cross-ancestry and ancestry-specific stroke GWASs with vascular-risk factor GWASs (integrative polygenic scores) strongly predicted ischaemic stroke in populations of European, East Asian and African ancestry(5). Stroke genetic risk scores were predictive of ischaemic stroke independent of clinical risk factors in 52,600 clinical-trial participants with cardiometabolic disease. Our results provide insights to inform biology, reveal potential drug targets and derive genetic risk prediction tools across ancestries. Show less
A major challenge of genome-wide association studies (GWASs) is to translate phenotypic associations into biological insights. Here, we integrate a large GWAS on blood lipids involving 1.6 million... Show moreA major challenge of genome-wide association studies (GWASs) is to translate phenotypic associations into biological insights. Here, we integrate a large GWAS on blood lipids involving 1.6 million individuals from five ancestries with a wide array of functional genomic datasets to discover regulatory mechanisms underlying lipid associations. We first prioritize lipid-associated genes with expression quantitative trait locus (eQTL) colocalizations and then add chromatin interaction data to narrow the search for functional genes. Polygenic enrichment analysis across 697 annotations from a host of tissues and cell types confirms the central role of the liver in lipid levels and highlights the selective enrichment of adipose-specific chromatin marks in high-density lipoprotein cholesterol and triglycerides. Overlapping transcription factor (TF) binding sites with lipid-associated loci identifies TFs relevant in lipid biology. In addition, we present an integrative framework to prioritize causal variants at GWAS loci, producing a comprehensive list of candidate causal genes and variants with multiple layers of functional evidence. We highlight two of the prioritized genes, CREBRF and RRBP1, which show convergent evidence across functional datasets supporting their roles in lipid biology. Show less
We assembled an ancestrally diverse collection of genome-wide association studies (GWAS) of type 2 diabetes (T2D) in 180,834 affected individuals and 1,159,055 controls (48.9% non-European descent)... Show moreWe assembled an ancestrally diverse collection of genome-wide association studies (GWAS) of type 2 diabetes (T2D) in 180,834 affected individuals and 1,159,055 controls (48.9% non-European descent) through the Diabetes Meta-Analysis of Trans-Ethnic association studies (DIAMANTE) Consortium. Multi-ancestry GWAS meta-analysis identified 237 loci attaining stringent genome-wide significance (P < 5 x 10(-9)), which were delineated to 338 distinct association signals. Fine-mapping of these signals was enhanced by the increased sample size and expanded population diversity of the multi-ancestry meta-analysis, which localized 54.4% of T2D associations to a single variant with >50% posterior probability. This improved fine-mapping enabled systematic assessment of candidate causal genes and molecular mechanisms through which T2D associations are mediated, laying the foundations for functional investigations. Multi-ancestry genetic risk scores enhanced transferability of T2D prediction across diverse populations. Our study provides a step toward more effective clinical translation of T2D GWAS to improve global health for all, irrespective of genetic background.Genome-wide association and fine-mapping analyses in ancestrally diverse populations implicate candidate causal genes and mechanisms underlying type 2 diabetes. Trans-ancestry genetic risk scores enhance transferability across populations. Show less
Common single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions... Show moreCommon single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes1. Here, using data from a genome-wide association study of 5.4 million individuals of diverse ancestries, we show that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a mean size of around 90 kb, covering about 21% of the genome. The density of independent associations varies across the genome and the regions of increased density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs (or all SNPs in the HapMap 3 panel2) account for 40% (45%) of phenotypic variance in populations of European ancestry but only around 10-20% (14-24%) in populations of other ancestries. Effect sizes, associated regions and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely to be explained by linkage disequilibrium and differences in allele frequency within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than are needed to implicate causal genes and variants. Overall, this study provides a comprehensive map of specific genomic regions that contain the vast majority of common height-associated variants. Although this map is saturated for populations of European ancestry, further research is needed to achieve equivalent saturation in other ancestries. Show less
Increased blood lipid levels are heritable risk factors of cardiovascular disease with varied prevalence worldwide owing to different dietary patterns and medication use(1). Despite advances in... Show moreIncreased blood lipid levels are heritable risk factors of cardiovascular disease with varied prevalence worldwide owing to different dietary patterns and medication use(1). Despite advances in prevention and treatment, in particular through reducing low-density lipoprotein cholesterol levels(2), heart disease remains the leading cause of death worldwide(3). Genome-wideassociation studies (GWAS) of blood lipid levels have led to important biological and clinical insights, as well as new drug targets, for cardiovascular disease. However, most previous GWAS(4-23) have been conducted in European ancestry populations and may have missed genetic variants that contribute to lipid-level variation in other ancestry groups. These include differences in allele frequencies, effect sizes and linkage-disequilibrium patterns(24). Here we conduct a multi-ancestry, genome-wide genetic discovery meta-analysis of lipid levels in approximately 1.65 million individuals, including 350,000 of non-European ancestries. We quantify the gain in studying non-European ancestries and provide evidence to support the expansion of recruitment of additional ancestries, even with relatively small sample sizes. We find that increasing diversity rather than studying additional individuals of European ancestry results in substantial improvements in fine-mapping functional variants and portability of polygenic prediction (evaluated in approximately 295,000 individuals from 7 ancestry groupings). Modest gains in the number of discovered loci and ancestry-specific variants were also achieved. As GWAS expand emphasis beyond the identification of genes and fundamental biology towards the use of genetic variants for preventive and precision medicine(25), we anticipate that increased diversity of participants will lead to more accurate and equitable(26) application of polygenic scores in clinical practice. Show less
Ruth, K.S.; Day, F.R.; Hussain, J.; Martinez-Marchal, A.; Aiken, C.E.; Azad, A.; ... ; 23 Me Res Team 2021
Reproductive longevity is essential for fertility and influences healthy ageing in women(1,2), but insights into its underlying biological mechanisms and treatments to preserve it are limited. Here... Show moreReproductive longevity is essential for fertility and influences healthy ageing in women(1,2), but insights into its underlying biological mechanisms and treatments to preserve it are limited. Here we identify 290 genetic determinants of ovarian ageing, assessed using normal variation in age at natural menopause in approximately 200,000 women of European ancestry. These common alleles were associated with clinical extremes of age at natural menopause; women in the top 1% of genetic susceptibility have an equivalent risk of premature ovarian insufficiency to those carrying monogenic FMR1 premutations(3). The identified loci implicate a broad range of DNA damage response (DDR) processes and include loss-of-function variants in key DDR-associated genes. Integration with experimental models demonstrates that these DDR processes act across the life-course to shape the ovarian reserve and its rate of depletion. Furthermore, we demonstrate that experimental manipulation of DDR pathways highlighted by human genetics increases fertility and extends reproductive life in mice. Causal inference analyses using the identified genetic variants indicate that extending reproductive life in women improves bone health and reduces risk of type 2 diabetes, but increases the risk of hormone-sensitive cancers. These findings provide insight into the mechanisms that govern ovarian ageing, when they act, and how they might be targeted by therapeutic approaches to extend fertility and prevent disease. Show less
In many species, the offspring of related parents suffer reduced reproductive success, a phenomenon known as inbreeding depression. In humans, the importance of this effect has remained unclear,... Show moreIn many species, the offspring of related parents suffer reduced reproductive success, a phenomenon known as inbreeding depression. In humans, the importance of this effect has remained unclear, partly because reproduction between close relatives is both rare and frequently associated with confounding social factors. Here, using genomic inbreeding coefficients (F-ROH) for >1.4 million individuals, we show that F-ROH is significantly associated (p < 0.0005) with apparently deleterious changes in 32 out of 100 traits analysed. These changes are associated with runs of homozygosity (ROH), but not with common variant homozygosity, suggesting that genetic variants associated with inbreeding depression are predominantly rare. The effect on fertility is striking: F-ROH equivalent to the offspring of first cousins is associated with a 55% decrease [95% CI 44-66%] in the odds of having children. Finally, the effects of F-ROH are confirmed within full-sibling pairs, where the variation in F-ROH is independent of all environmental confounding. Show less
Brassica rapa displays enormous morphological diversity, with leafy vegetables, turnips and oil crops. Turnips (Brassica rapa subsp. rapa) represent one of the morphotypes, which form tubers... Show moreBrassica rapa displays enormous morphological diversity, with leafy vegetables, turnips and oil crops. Turnips (Brassica rapa subsp. rapa) represent one of the morphotypes, which form tubers and can be used to study the genetics underlying storage organ formation. In the present study we investigated several characteristics of an extensive turnip collection comprising 56 accessions from both Asia (mainly Japanese origin) and Europe. Population structure was calculated using data from 280 evenly distributed SNP markers over 56 turnip accessions. We studied the anatomy of turnip tubers and measured carbohydrate composition of the mature turnip tubers of a subset of the collection. The variation in 16 leaf traits, 12 tuber traits and flowering time was evaluated in five independent experiments for the entire collection. The effect of vernalization on flowering and tuber formation was also investigated. SNP marker profiling basically divided the turnip accessions into two subpopulations, with admixture, generally corresponding with geographical origin (Europe or Asia). The enlarged turnip tuber consists of both hypocotyl and root tissue, but the proportion of the two tissues differs between accessions. The ratio of sucrose to fructose and glucose differed among accessions, while generally starch content was low. The evaluated traits segregated in both subpopulations, with leaf shape, tuber colour and number of shoots per tuber explaining most variation between the two subpopulations. Vernalization resulted in reduced flowering time and smaller tubers for the Asian turnips whereas the European turnips were less affected by vernalization. Show less