We evaluate the shared genetic regulation of mRNA molecules, proteins and metabolites derived from whole blood from 3029 human donors. We find abundant allelic heterogeneity, where multiple variants regulate a particular molecular phenotype, and pleiotropy, where a single variant associates with multiple molecular phenotypes over multiple genomic regions. The highest proportion of shared genetic regulation is detected between gene expression and proteins (66.6%), with a further median shared genetic associations across 49 different tissues of 78.3% and 62.4% between plasma proteins and gene expression. We represent the genetic and molecular associations in networks including 2828 known GWAS variants, showing that GWAS variants are more often connected to gene expression in trans than other molecular phenotypes in the network. Our work provides a roadmap to understanding molecular networks and deriving the underlying mechanism of action of GWAS variants using different molecular phenotypes in an accessible tissue.
The application of multiple omics technologies in biomedical cohorts has the potential to reveal patient-level disease characteristics and individualized response to treatment. However, the scale and heterogeneous nature of multi-modal data make integration and inference a non-trivial task. We developed a deep-learning-based framework, multi-omics variational autoencoders (MOVE), to integrate such data and applied it to a cohort of 789 people with newly diagnosed type 2 diabetes with deep multi-omics phenotyping from the DIRECT consortium. Using in silico perturbations, we identified drug-omics associations across the multi-modal datasets for the 20 most prevalent drugs given to people with type 2 diabetes with substantially higher sensitivity than univariate statistical tests. From these, we identified, among others, novel associations between metformin and the gut microbiota as well as opposite molecular responses for the two statins, simvastatin and atorvastatin. We used the associations to quantify drug-drug similarities, assess the degree of polypharmacy and conclude that drug effects are distributed across the multi-omics modalities.
Background: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. Results: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N=1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3-5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. Conclusions: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk.
A major challenge of genome-wide association studies (GWASs) is to translate phenotypic associations into biological insights. Here, we integrate a large GWAS on blood lipids involving 1.6 million individuals from five ancestries with a wide array of functional genomic datasets to discover regulatory mechanisms underlying lipid associations. We first prioritize lipid-associated genes with expression quantitative trait locus (eQTL) colocalizations and then add chromatin interaction data to narrow the search for functional genes. Polygenic enrichment analysis across 697 annotations from a host of tissues and cell types confirms the central role of the liver in lipid levels and highlights the selective enrichment of adipose-specific chromatin marks in high-density lipoprotein cholesterol and triglycerides. Overlapping transcription factor (TF) binding sites with lipid-associated loci identifies TFs relevant in lipid biology. In addition, we present an integrative framework to prioritize causal variants at GWAS loci, producing a comprehensive list of candidate causal genes and variants with multiple layers of functional evidence. We highlight two of the prioritized genes, CREBRF and RRBP1, which show convergent evidence across functional datasets supporting their roles in lipid biology.
The presentation and underlying pathophysiology of type 2 diabetes (T2D) are complex and heterogeneous. Recent studies attempted to stratify T2D into distinct subgroups using data-driven approaches, but their clinical utility may be limited if categorical representations of complex phenotypes are suboptimal. We apply a soft-clustering (archetype) method to characterize newly diagnosed T2D based on 32 clinical variables. We assign quantitative clustering scores for individuals and investigate the associations with glycemic deterioration, genetic risk scores, circulating omics biomarkers, and phenotypic stability over 36 months. Four archetype profiles represent dysfunction patterns across combinations of T2D etiological processes and correlate with multiple circulating biomarkers. One archetype associated with obesity, insulin resistance, dyslipidemia, and impaired beta cell glucose sensitivity corresponds with the fastest disease progression and highest demand for anti-diabetic treatment. We demonstrate that clinical heterogeneity in T2D can be mapped to heterogeneity in individual etiological processes, providing a potential route to personalized treatments.
Increased blood lipid levels are heritable risk factors of cardiovascular disease with varied prevalence worldwide owing to different dietary patterns and medication use(1). Despite advances in prevention and treatment, in particular through reducing low-density lipoprotein cholesterol levels(2), heart disease remains the leading cause of death worldwide(3). Genome-wide association studies (GWAS) of blood lipid levels have led to important biological and clinical insights, as well as new drug targets, for cardiovascular disease. However, most previous GWAS(4-23) have been conducted in European ancestry populations and may have missed genetic variants that contribute to lipid-level variation in other ancestry groups. These include differences in allele frequencies, effect sizes and linkage-disequilibrium patterns(24). Here we conduct a multi-ancestry, genome-wide genetic discovery meta-analysis of lipid levels in approximately 1.65 million individuals, including 350,000 of non-European ancestries. We quantify the gain in studying non-European ancestries and provide evidence to support the expansion of recruitment of additional ancestries, even with relatively small sample sizes. We find that increasing diversity rather than studying additional individuals of European ancestry results in substantial improvements in fine-mapping functional variants and portability of polygenic prediction (evaluated in approximately 295,000 individuals from 7 ancestry groupings). Modest gains in the number of discovered loci and ancestry-specific variants were also achieved.
As GWAS expand emphasis beyond the identification of genes and fundamental biology towards the use of genetic variants for preventive and precision medicine(25), we anticipate that increased diversity of participants will lead to more accurate and equitable(26) application of polygenic scores in clinical practice.
Glycemic traits are used to diagnose and monitor type 2 diabetes and cardiometabolic health. To date, most genetic studies of glycemic traits have focused on individuals of European ancestry. Here we aggregated genome-wide association studies comprising up to 281,416 individuals without diabetes (30% non-European ancestry) for whom fasting glucose, 2-h glucose after an oral glucose challenge, glycated hemoglobin and fasting insulin data were available. Trans-ancestry and single-ancestry meta-analyses identified 242 loci (99 novel; P < 5 × 10^-8), 80% of which had no significant evidence of between-ancestry heterogeneity. Analyses restricted to individuals of European ancestry with equivalent sample size would have led to 24 fewer new loci. Compared with single-ancestry analyses, equivalent-sized trans-ancestry fine-mapping reduced the number of estimated variants in 99% credible sets by a median of 37.5%. Genomic-feature, gene-expression and gene-set analyses revealed distinct biological signatures for each trait, highlighting different underlying biological pathways. Our results increase our understanding of diabetes pathophysiology by using trans-ancestry studies for improved power and resolution. A trans-ancestry meta-analysis of GWAS of glycemic traits in up to 281,416 individuals identifies 99 novel loci, of which one quarter was found due to the multi-ancestry approach, which also improves fine-mapping of credible variant sets.
OBJECTIVE: We investigated the processes underlying glycemic deterioration in type 2 diabetes (T2D). RESEARCH DESIGN AND METHODS: A total of 732 recently diagnosed patients with T2D from the Innovative Medicines Initiative Diabetes Research on Patient Stratification (IMI DIRECT) study were extensively phenotyped over 3 years, including measures of insulin sensitivity (OGIS), beta-cell glucose sensitivity (GS), and insulin clearance (CLIm) from mixed meal tests, liver enzymes, lipid profiles, and baseline regional fat from MRI. The associations between the longitudinal metabolic patterns and HbA1c deterioration, adjusted for changes in BMI and in diabetes medications, were assessed via stepwise multivariable linear and logistic regression. RESULTS: Faster HbA1c progression was independently associated with faster deterioration of OGIS and GS and increasing CLIm; visceral or liver fat, HDL-cholesterol, and triglycerides had further independent, though weaker, roles (R² = 0.38). A subgroup of patients with a markedly higher progression rate (fast progressors) was clearly distinguishable considering these variables only (discrimination capacity from area under the receiver operating characteristic = 0.94). The proportion of fast progressors was reduced from 56% to 8-10% in subgroups in which only one trait among OGIS, GS, and CLIm was relatively stable (odds ratios 0.07-0.09).
T2D polygenic risk score and baseline pancreatic fat, glucagon-like peptide 1, glucagon, diet, and physical activity did not show an independent role. CONCLUSIONS: Deteriorating insulin sensitivity and beta-cell function, increasing insulin clearance, high visceral or liver fat, and worsening of the lipid profile are the crucial factors mediating glycemic deterioration of patients with T2D in the initial phase of the disease. Stabilization of a single trait among insulin sensitivity, beta-cell function, and insulin clearance may be relevant to prevent progression.
Background: The rising prevalence of type 2 diabetes (T2D) poses a major global challenge. It remains unresolved to what extent transcriptomic signatures of metabolic dysregulation and T2D can be observed in easily accessible tissues such as blood. Additionally, large-scale human studies are required to further our understanding of the putative inflammatory component of insulin resistance and T2D. Here we used transcriptomics data from individuals with (n = 789) and without (n = 2127) T2D from the IMI-DIRECT cohorts to describe the co-expression structure of whole blood that mainly reflects processes and cell types of the immune system, and how it relates to metabolically relevant clinical traits and T2D. Methods: Clusters of co-expressed genes were identified in the non-diabetic IMI-DIRECT cohort and evaluated with regard to stability, as well as preservation and rewiring in the cohort of individuals with T2D. We performed functional and immune cell signature enrichment analyses, and a genome-wide association study to describe the genetic regulation of the modules. Phenotypic and trans-omics associations of the transcriptomic modules were investigated across both IMI-DIRECT cohorts. Results: We identified 55 whole blood co-expression modules, some of which clustered in larger super-modules. We identified a large number of associations between these transcriptomic modules and measures of insulin action and glucose tolerance. Some of the metabolically linked modules reflect neutrophil-lymphocyte ratio in blood while others are independent of white blood cell estimates, including a module of genes encoding neutrophil granule proteins with antibacterial properties for which the strongest associations with clinical traits and T2D status were observed.
Through the integration of genetic and multi-omics data, we provide a holistic view of the regulation and molecular context of whole blood transcriptomic modules. We furthermore identified an overlap between genetic signals for T2D and co-expression modules involved in type II interferon signaling. Conclusions: Our results offer a large-scale map of whole blood transcriptomic modules in the context of metabolic disease and point to novel biological candidates for future studies related to T2D.
Atabaki-Pasdar, N.; Ohlsson, M.; Vinuela, A.; Frau, F.; Pomares-Millan, H.; Haid, M.; ... ; Franks, P.W. 2020
Background: Non-alcoholic fatty liver disease (NAFLD) is highly prevalent and causes serious health complications in individuals with and without type 2 diabetes (T2D). Early diagnosis of NAFLD is important, as this can help prevent irreversible damage to the liver and, ultimately, hepatocellular carcinomas. We sought to expand etiological understanding and develop a diagnostic tool for NAFLD using machine learning. Methods and findings: We utilized the baseline data from IMI DIRECT, a multicenter prospective cohort study of 3,029 European-ancestry adults recently diagnosed with T2D (n = 795) or at high risk of developing the disease (n = 2,234). Multi-omics (genetic, transcriptomic, proteomic, and metabolomic) and clinical (liver enzymes and other serological biomarkers, anthropometry, measures of beta-cell function, insulin sensitivity, and lifestyle) data comprised the key input variables. The models were trained on MRI-image-derived liver fat content (<5% or ≥5%) available for 1,514 participants. We applied LASSO (least absolute shrinkage and selection operator) to select features from the different layers of omics data and random forest analysis to develop the models. The prediction models included clinical and omics variables separately or in combination. A model including all omics and clinical variables yielded a cross-validated receiver operating characteristic area under the curve (ROCAUC) of 0.84 (95% CI 0.82, 0.86; p < 0.001), which compared with a ROCAUC of 0.82 (95% CI 0.81, 0.83; p < 0.001) for a model including 9 clinically accessible variables. The IMI DIRECT prediction models outperformed existing noninvasive NAFLD prediction tools.
One limitation is that these analyses were performed in adults of European ancestry residing in northern Europe, and it is unknown how well these findings will translate to people of other ancestries and exposed to environmental risk factors that differ from those of the present cohort. Another key limitation of this study is that the prediction was done on a binary outcome of liver fat quantity (<5% or ≥5%) rather than a continuous one. Conclusions: In this study, we developed several models with different combinations of clinical and omics data and identified biological features that appear to be associated with liver fat accumulation. In general, the clinical variables showed better prediction ability than the complex omics variables. However, the combination of omics and clinical variables yielded the highest accuracy. We have incorporated the developed clinical models into a web interface (see:) and made it available to the community.