We evaluate the shared genetic regulation of mRNA molecules, proteins and metabolites derived from whole blood from 3029 human donors. We find abundant allelic heterogeneity, where multiple variants regulate a particular molecular phenotype, and pleiotropy, where a single variant associates with multiple molecular phenotypes over multiple genomic regions. The highest proportion of shared genetic regulation is detected between gene expression and proteins (66.6%), with a further median shared genetic associations across 49 different tissues of 78.3% and 62.4% between plasma proteins and gene expression. We represent the genetic and molecular associations in networks including 2828 known GWAS variants, showing that GWAS variants are more often connected to gene expression in trans than other molecular phenotypes in the network. Our work provides a roadmap to understanding molecular networks and deriving the underlying mechanism of action of GWAS variants using different molecular phenotypes in an accessible tissue.
The application of multiple omics technologies in biomedical cohorts has the potential to reveal patient-level disease characteristics and individualized response to treatment. However, the scale and heterogeneous nature of multi-modal data make integration and inference a non-trivial task. We developed a deep-learning-based framework, multi-omics variational autoencoders (MOVE), to integrate such data and applied it to a cohort of 789 people with newly diagnosed type 2 diabetes with deep multi-omics phenotyping from the DIRECT consortium. Using in silico perturbations, we identified drug-omics associations across the multi-modal datasets for the 20 most prevalent drugs given to people with type 2 diabetes with substantially higher sensitivity than univariate statistical tests. Among these, we identified novel associations between metformin and the gut microbiota as well as opposite molecular responses for the two statins, simvastatin and atorvastatin. We used the associations to quantify drug-drug similarities, assess the degree of polypharmacy and conclude that drug effects are distributed across the multi-omics modalities.
The presentation and underlying pathophysiology of type 2 diabetes (T2D) are complex and heterogeneous. Recent studies attempted to stratify T2D into distinct subgroups using data-driven approaches, but their clinical utility may be limited if categorical representations of complex phenotypes are suboptimal. We apply a soft-clustering (archetype) method to characterize newly diagnosed T2D based on 32 clinical variables. We assign quantitative clustering scores for individuals and investigate the associations with glycemic deterioration, genetic risk scores, circulating omics biomarkers, and phenotypic stability over 36 months. Four archetype profiles represent dysfunction patterns across combinations of T2D etiological processes and correlate with multiple circulating biomarkers. One archetype associated with obesity, insulin resistance, dyslipidemia, and impaired beta cell glucose sensitivity corresponds with the fastest disease progression and highest demand for anti-diabetic treatment. We demonstrate that clinical heterogeneity in T2D can be mapped to heterogeneity in individual etiological processes, providing a potential route to personalized treatments.
Ghorasaini, M.; Mohammed, Y.; Adamski, J.; Bettcher, L.; Bowden, J.A.; Cabruja, M.; ... ; Giera, M. 2021
Modern biomarker and translational research as well as personalized health care studies rely heavily on powerful omics technologies, including metabolomics and lipidomics. However, to translate metabolomics and lipidomics discoveries into a high-throughput clinical setting, standardization is of utmost importance. Here, we compared and benchmarked a quantitative lipidomics platform. The employed Lipidyzer platform is based on lipid class separation by means of differential mobility spectrometry with subsequent multiple reaction monitoring. Quantitation is achieved by the use of 54 deuterated internal standards and an automated informatics approach. We investigated the platform performance across nine laboratories using NIST SRM 1950-Metabolites in Frozen Human Plasma, and three NIST Candidate Reference Materials 8231-Frozen Human Plasma Suite for Metabolomics (high triglyceride, diabetic, and African-American plasma). In addition, we comparatively analyzed 59 plasma samples from individuals with familial hypercholesterolemia from a clinical cohort study. We provide evidence that the more practical methyl-tert-butyl ether extraction outperforms the classic Bligh and Dyer approach and compare our results with two previously published ring trials. In summary, we present standardized lipidomics protocols, allowing for the highly reproducible analysis of several hundred human plasma lipids, and present detailed molecular information for potentially disease-relevant and ethnicity-related materials.
Background: The rising prevalence of type 2 diabetes (T2D) poses a major global challenge. It remains unresolved to what extent transcriptomic signatures of metabolic dysregulation and T2D can be observed in easily accessible tissues such as blood. Additionally, large-scale human studies are required to further our understanding of the putative inflammatory component of insulin resistance and T2D. Here we used transcriptomics data from individuals with (n = 789) and without (n = 2127) T2D from the IMI-DIRECT cohorts to describe the co-expression structure of whole blood that mainly reflects processes and cell types of the immune system, and how it relates to metabolically relevant clinical traits and T2D. Methods: Clusters of co-expressed genes were identified in the non-diabetic IMI-DIRECT cohort and evaluated with regard to stability, as well as preservation and rewiring in the cohort of individuals with T2D. We performed functional and immune cell signature enrichment analyses, and a genome-wide association study to describe the genetic regulation of the modules. Phenotypic and trans-omics associations of the transcriptomic modules were investigated across both IMI-DIRECT cohorts. Results: We identified 55 whole blood co-expression modules, some of which clustered in larger super-modules. We identified a large number of associations between these transcriptomic modules and measures of insulin action and glucose tolerance. Some of the metabolically linked modules reflect neutrophil-lymphocyte ratio in blood while others are independent of white blood cell estimates, including a module of genes encoding neutrophil granule proteins with antibacterial properties for which the strongest associations with clinical traits and T2D status were observed. Through the integration of genetic and multi-omics data, we provide a holistic view of the regulation and molecular context of whole blood transcriptomic modules. We furthermore identified an overlap between genetic signals for T2D and co-expression modules involved in type II interferon signaling. Conclusions: Our results offer a large-scale map of whole blood transcriptomic modules in the context of metabolic disease and point to novel biological candidates for future studies related to T2D.
Atabaki-Pasdar, N.; Ohlsson, M.; Vinuela, A.; Frau, F.; Pomares-Millan, H.; Haid, M.; ... ; Franks, P.W. 2020
Background: Non-alcoholic fatty liver disease (NAFLD) is highly prevalent and causes serious health complications in individuals with and without type 2 diabetes (T2D). Early diagnosis of NAFLD is important, as this can help prevent irreversible damage to the liver and, ultimately, hepatocellular carcinomas. We sought to expand etiological understanding and develop a diagnostic tool for NAFLD using machine learning. Methods and findings: We utilized the baseline data from IMI DIRECT, a multicenter prospective cohort study of 3,029 European-ancestry adults recently diagnosed with T2D (n = 795) or at high risk of developing the disease (n = 2,234). Multi-omics (genetic, transcriptomic, proteomic, and metabolomic) and clinical (liver enzymes and other serological biomarkers, anthropometry, measures of beta-cell function, insulin sensitivity, and lifestyle) data comprised the key input variables. The models were trained on MRI-image-derived liver fat content (<5% or >= 5%) available for 1,514 participants. We applied LASSO (least absolute shrinkage and selection operator) to select features from the different layers of omics data and random forest analysis to develop the models. The prediction models included clinical and omics variables separately or in combination. A model including all omics and clinical variables yielded a cross-validated receiver operating characteristic area under the curve (ROCAUC) of 0.84 (95% CI 0.82, 0.86; p < 0.001), which compared with a ROCAUC of 0.82 (95% CI 0.81, 0.83; p < 0.001) for a model including 9 clinically accessible variables. The IMI DIRECT prediction models outperformed existing noninvasive NAFLD prediction tools. One limitation is that these analyses were performed in adults of European ancestry residing in northern Europe, and it is unknown how well these findings will translate to people of other ancestries and exposed to environmental risk factors that differ from those of the present cohort. Another key limitation of this study is that the prediction was done on a binary outcome of liver fat quantity (<5% or >= 5%) rather than a continuous one. Conclusions: In this study, we developed several models with different combinations of clinical and omics data and identified biological features that appear to be associated with liver fat accumulation. In general, the clinical variables showed better prediction ability than the complex omics variables. However, the combination of omics and clinical variables yielded the highest accuracy. We have incorporated the developed clinical models into a web interface (see:) and made it available to the community.
Quell, J.D.; Romisch-Margl, W.; Haid, M.; Krumsiek, J.; Skurk, T.; Halama, A.; ... ; Kastenmuller, G. 2019
Kit-based assays, such as AbsoluteIDQ(TM) p150, are widely used in large cohort studies and provide a standardized method to quantify blood concentrations of phosphatidylcholines (PCs). Many disease-relevant associations of PCs were reported using this method. However, their interpretation is hampered by a lack of functionally relevant information on the detailed fatty acid side-chain compositions, as only the total number of carbon atoms and double bonds is identified by the kit. To enable more substantiated interpretations, we characterized these PC sums using the side-chain resolving Lipidyzer(TM) platform, analyzing 223 samples in parallel to the AbsoluteIDQ(TM). Combining these datasets, we estimated the quantitative composition of PC sums and subsequently tested their replication in an independent cohort. We identified major constituents of 28 PC sums, also revealing various unexpected compositions. As an example, PC 16:0_22:5 accounted for more than 50% of the PC sum with a total of 38 carbon atoms and 5 double bonds (PC aa 38:5). For 13 PC sums, we found relatively high abundances of odd-chain fatty acids. In conclusion, our study provides insights into PC compositions in human plasma, facilitating interpretation of existing epidemiological data sets and potentially enabling imputation of PC compositions for future meta-analyses of lipidomics data.
Molnos, S.; Wahl, S.; Haid, M.; Eekhoff, E.M.W.; Pool, R.; Floegel, A.; ... ; Hart, L.M. 't 2018