Skeletal muscles are composed of different myofiber types characterized by the expression of myosin heavy chain isoforms, which can be affected by physical activity, aging, and pathological conditions. Here, we present a step-by-step high-throughput semi-automated approach for performing myofiber type quantification of entire human or mouse muscle tissue sections, including immunofluorescence staining, image acquisition, processing, and quantification. For complete details on the use and execution of this protocol, please refer to Abbassi-Daloii et al. (2022).
Laurie, S.; Piscia, D.; Matalonga, L.; Corvo, A.; Fernandez-Callejo, M.; Garcia-Linares, C.; ... ; Beltran, S. 2022
Rare disease patients are more likely to receive a rapid molecular diagnosis nowadays thanks to the wide adoption of next-generation sequencing. However, many cases remain undiagnosed even after exome or genome analysis, because the methods used missed the molecular cause in a known gene, or a novel causative gene could not be identified and/or confirmed. To address these challenges, the RD-Connect Genome-Phenome Analysis Platform (GPAP) facilitates the collation, discovery, sharing, and analysis of standardized genome-phenome data within a collaborative environment. Authorized clinicians and researchers submit pseudonymised phenotypic profiles encoded using the Human Phenotype Ontology, and raw genomic data, which are processed through a standardized pipeline. After an optional embargo period, the data are shared with other platform users, with the objective that similar cases in the system and queries from peers may help diagnose the case. Additionally, the platform enables bidirectional discovery of similar cases in other databases from the Matchmaker Exchange network. To facilitate genome-phenome analysis and interpretation by clinical researchers, the RD-Connect GPAP provides a powerful user-friendly interface and leverages tens of information sources. As a result, the resource has already helped diagnose hundreds of rare disease patients and discover new disease-causing genes.
Background: Patient data registries that are FAIR (Findable, Accessible, Interoperable, and Reusable for humans and computers) facilitate research across multiple resources. This is particularly relevant to rare diseases, where data often are scarce and scattered. Specific research questions can be asked across FAIR rare disease registries and other FAIR resources without physically combining the data. Further, FAIR implies well-defined, transparent access conditions, which supports making sensitive data as open as possible and as closed as necessary. Results: We successfully developed and implemented a process of making a rare disease registry for vascular anomalies FAIR from its conception (de novo). Here, we describe the five phases of this process in detail: (i) pre-FAIRification, (ii) facilitating FAIRification, (iii) data collection, (iv) generating FAIR data in real-time, and (v) using FAIR data. This includes the creation of an electronic case report form and a semantic data model of the elements to be collected (in this case: the "Set of Common Data Elements for Rare Disease Registration" released by the European Commission), and the technical implementation of automatic, real-time data FAIRification in an Electronic Data Capture system. Further, we describe how we contribute to the four facets of FAIR, and how our FAIRification process can be reused by other registries. Conclusions: A detailed de novo FAIRification process of a registry for vascular anomalies is described. To a large extent, the process may be reused by other rare disease registries, and we envision this work to be a substantial contribution to an ecosystem of FAIR rare disease resources.
Trait-associated genetic variants affect complex phenotypes primarily via regulatory mechanisms on the transcriptome. To investigate the genetics of gene expression, we performed cis- and trans-expression quantitative trait locus (eQTL) analyses using blood-derived expression from 31,684 individuals through the eQTLGen Consortium. We detected cis-eQTL for 88% of genes, and these were replicable in numerous tissues. Distal trans-eQTL (detected for 37% of 10,317 trait-associated variants tested) showed lower replication rates, partially due to low replication power and confounding by cell type composition. However, replication analyses in single-cell RNA-seq data prioritized intracellular trans-eQTL. Trans-eQTL exerted their effects via several mechanisms, primarily through regulation by transcription factors. Expression of 13% of the genes correlated with polygenic scores for 1,263 phenotypes, pinpointing potential drivers for those traits. In summary, this work represents a large eQTL resource, and its results serve as a starting point for in-depth interpretation of complex phenotypes. Analyses of expression profiles from whole blood of 31,684 individuals identify cis-expression quantitative trait loci (eQTL) effects for 88% of genes and trans-eQTL effects for 37% of trait-associated variants.
Lin, N. van; Paliouras, G.; Vroom, E.; Hoen, P.A.C. 't; Roos, M. 2021
Background: For patients with rare diseases such as Duchenne and Becker muscular dystrophy (DMD/BMD), access to their health data is key to being able to advocate for themselves and be in control of their care. Since 2018, the DMD/BMD patient community has been committed to making DMD/BMD-related data FAIR, i.e., Findable, Accessible, Interoperable, and Reusable. On March 3, 2021, the second international meeting on FAIR data sharing for DMD/BMD was held virtually. Objective: The aim of this meeting report is to summarize the presentations and discussions of the meeting. Methods: During this meeting, the progress of FAIRification efforts since the first international meeting in 2019, new developments, stakeholder perspectives, and experiences from implementing FAIR data principles in practice were presented and discussed. Results: Over 120 attendees representing various stakeholder groups (i.e., patient organizations, clinicians, clinical and academic researchers, pharmaceutical companies, regulators, and EU organizations) from 22 countries participated in the meeting. This meeting report summarizes the presentations and discussions from the meeting, provides an overview of the key lessons learned since the first meeting, and outlines the next steps. Conclusions: Patient organizations are key drivers of the FAIRification process in practice, and dialogue with stakeholders is critical to success.
Identifying genes involved in functional differences between similar tissues from expression profiles is challenging, because the expected differences in expression levels are small. To exemplify this challenge, we studied the expression profiles of two skeletal muscles, deltoid and biceps, in healthy individuals. We provide a series of guides and recommendations for the analysis of this type of study. These include how to account for batch effects and inter-individual differences to optimize the detection of gene signatures associated with tissue function. We provide guidance on the selection of optimal settings for constructing gene co-expression networks through parameter sweeps of settings and calculation of the overlap with an established knowledge network. Our main recommendation is to use a combination of data-driven approaches, such as differential gene expression analysis and gene co-expression network analysis, and hypothesis-driven approaches, such as gene set connectivity analysis. Accordingly, we detected differences in metabolic gene expression between deltoid and biceps that were supported by both data- and hypothesis-driven approaches. Finally, we provide a bioinformatic framework that supports the biological interpretation of expression profiles from related tissues using this combination of approaches, which is available at github.com/tabbassidaloii/AnalysisFrameworkSimilarTissues.
Background: DNA methylation is a key epigenetic modification in human development and disease, yet there is limited understanding of its highly coordinated regulation. Here, we identify 818 genes that affect DNA methylation patterns in blood using large-scale population genomics data. Results: By employing genetic instruments as causal anchors, we establish directed associations between gene expression and distant DNA methylation levels, while ensuring specificity of the associations by correcting for linkage disequilibrium and pleiotropy among neighboring genes. The identified genes are enriched for transcription factors, of which many consistently increased or decreased DNA methylation levels at multiple CpG sites. In addition, we show that a substantial number of transcription factors affected DNA methylation at their experimentally determined binding sites. We also observe genes encoding proteins with heterogeneous functions that have widespread effects on DNA methylation, e.g., NFKBIE, CDCA7(L), and NLRC5, and for several examples, we suggest plausible mechanisms underlying their effect on DNA methylation. Conclusion: We report hundreds of genes that affect DNA methylation and provide key insights into the principles underlying epigenetic regulation.
Fuchs, K.J.; Honders, M.W.; Meijden, E.D. van der; Adriaans, A.E.; Lee, D.I. van der; Pont, M.J.; ... ; Griffioen, M. 2020
Patients undergoing allogeneic stem cell transplantation as treatment for hematological diseases face the risk of Graft-versus-Host Disease as well as relapse. Graft-versus-Host Disease and the favorable Graft-versus-Leukemia effect are mediated by donor T cells recognizing polymorphic peptides, which are presented on the cell surface by HLA molecules and result from single nucleotide polymorphism alleles that are disparate between patient and donor. Identification of polymorphic HLA-binding peptides, designated minor histocompatibility antigens, has been a laborious procedure, and the number and scope for broad clinical use of these antigens therefore remain limited. Here, we present an optimized whole genome association approach for discovery of HLA class I minor histocompatibility antigens. T cell clones isolated from patients who responded to donor lymphocyte infusions after HLA-matched allogeneic stem cell transplantation were tested against a panel of 191 EBV-transformed B cells, which have been sequenced by the 1000 Genomes Project and selected for expression of seven common HLA class I alleles (HLA-A*01:01, A*02:01, A*03:01, B*07:02, B*08:01, C*07:01, and C*07:02). By including all polymorphisms with minor allele frequencies above 0.01, we demonstrated that the new approach allows direct discovery of minor histocompatibility antigens, as exemplified by seven new antigens in eight different HLA class I alleles, including one antigen in HLA-A*24:02 and HLA-A*23:01, for which the method was not originally designed. Our new whole genome association strategy is expected to rapidly augment the repertoire of HLA class I-restricted minor histocompatibility antigens that will become available for donor selection and clinical use to predict, follow or manipulate Graft-versus-Leukemia effect and Graft-versus-Host Disease after allogeneic stem cell transplantation.
Acute myeloid leukemia (AML) is caused by genetic aberrations that also govern the prognosis of patients and guide risk-adapted and targeted therapy. Genetic aberrations in AML are structurally diverse and currently detected by different diagnostic assays. This study sought to establish whole transcriptome RNA sequencing as a single, comprehensive, and flexible platform for AML diagnostics. We developed HAMLET (Human AML Expedited Transcriptomics) as a bioinformatics pipeline for simultaneous detection of fusion genes, small variants, tandem duplications, and gene expression, with all information assembled in an annotated, user-friendly output file. Whole transcriptome RNA sequencing was performed on 100 AML cases and HAMLET results were validated by reference assays and targeted resequencing. The data showed that HAMLET accurately detected all fusion genes and overexpression of EVI1 irrespective of 3q26 aberrations. In addition, small variants in 13 genes that are often mutated in AML were called with 99.2% sensitivity and 100% specificity, and tandem duplications in FLT3 and KMT2A were detected by a novel algorithm based on soft-clipped reads with 100% sensitivity and 97.1% specificity. In conclusion, HAMLET has the potential to provide accurate comprehensive diagnostic information relevant for AML classification, risk assessment and targeted therapy on a single technology platform.
Insights into individual differences in gene expression and its heritability (h²) can help in understanding pathways from DNA to phenotype. We estimated the heritability of gene expression of 52,844 genes measured in whole blood in the largest twin RNA-Seq sample to date (1497 individuals including 459 monozygotic twin pairs and 150 dizygotic twin pairs) from classical twin modeling and identity-by-state-based approaches. We estimated for each gene h²_total, composed of cis-heritability (h²_cis, the variance explained by single nucleotide polymorphisms in the cis-window of the gene), and trans-heritability (h²_res, the residual variance explained by all other genome-wide variants). Mean h²_total was 0.26, which was significantly higher than heritability estimates earlier found in a microarray-based study using largely overlapping (>60%) RNA samples (mean h² = 0.14, p = 6.15 × 10⁻²⁵⁸). Mean h²_cis was 0.06 and strongly correlated with beta of the top cis expression quantitative trait loci (eQTL, rho = 0.76, p < 10⁻³⁰⁸) and with estimates from earlier RNA-Seq-based studies. Mean h²_res was 0.20 and correlated with the beta of the corresponding trans-eQTL (rho = 0.04, p < 1.89 × 10⁻³) and was significantly higher for genes involved in cytokine-cytokine interactions (p = 4.22 × 10⁻¹⁵), many other immune system pathways, and genes identified in genome-wide association studies for various traits including behavioral disorders and cancer. This study provides a thorough characterization of cis- and trans-h² estimates of gene expression, which is of value for interpretation of GWAS and gene expression studies.
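As a reading aid for the abstract above, the reported means can be checked against the stated decomposition of total heritability into a cis and a residual (trans) component. This is a sketch that assumes the two components add per gene, which the wording "composed of" suggests but the abstract does not state as a formula; the gene index g and the gene count G are notation introduced here for illustration.

```latex
% Assumed per-gene additive decomposition (not stated explicitly in the abstract)
h^2_{\mathrm{total},g} = h^2_{\mathrm{cis},g} + h^2_{\mathrm{res},g}

% Averaging over G genes is linear, so the means add as well
\frac{1}{G}\sum_{g=1}^{G} h^2_{\mathrm{total},g}
  = \frac{1}{G}\sum_{g=1}^{G} h^2_{\mathrm{cis},g}
  + \frac{1}{G}\sum_{g=1}^{G} h^2_{\mathrm{res},g}

% Reported means (rounded to two decimals): 0.06 + 0.20 = 0.26
```

Under that assumption the three reported means are mutually consistent, with the residual component accounting for roughly three quarters of the mean total (0.20 / 0.26).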
Background: Autosomal Dominant Polycystic Kidney Disease (ADPKD) is one of the most common causes of end-stage renal failure, caused by mutations in PKD1 or PKD2 genes. Tolvaptan, the only drug approved for ADPKD treatment, results in serious side-effects, warranting the need for novel drugs. Methods: In this study, we applied RNA-sequencing of Pkd1cko mice at different disease stages, and with/without drug treatment, to identify genes involved in ADPKD progression that were further used to identify novel drug candidates for ADPKD. We followed an integrative computational approach using a combination of gene expression profiling, bioinformatics and cheminformatics data. Findings: We identified 1162 genes that had a normalized expression after treating the mice with drugs proven effective in preclinical models. Intersecting these genes with target affinity profiles for clinically-approved drugs in ChEMBL resulted in the identification of 116 drugs targeting 29 proteins, several of which, such as Rosiglitazone, were previously linked to Polycystic Kidney Disease. Further testing the efficacy of six candidate drugs for inhibition of cyst swelling using a human 3D-cyst assay revealed that three of the six had cyst-growth reducing effects with limited toxicity. Interpretation: Our data further establish drug repurposing as a robust drug discovery method, with three promising drug candidates identified for ADPKD treatment (Meclofenamic Acid, Gamolenic Acid and Birinapant). Our strategy, which combines multiple-omics data, can be extended for ADPKD and other diseases in the future. (C) 2019 The Authors. Published by Elsevier B.V.
Autosomal Dominant Polycystic Kidney Disease (ADPKD) is one of the most common causes of end-stage renal failure, caused by mutations in PKD1 or PKD2 genes. Tolvaptan, the only drug approved for ADPKD treatment, results in serious side-effects, warranting the need for novel drugs. In this study, we applied RNA-sequencing of Pkd1cko mice at different disease stages, and with/without drug treatment, to identify genes involved in ADPKD progression that were further used to identify novel drug candidates for ADPKD. We followed an integrative computational approach using a combination of gene expression profiling, bioinformatics and cheminformatics data. We identified 1162 genes that had a normalized expression after treating the mice with drugs proven effective in preclinical models. Intersecting these genes with target affinity profiles for clinically-approved drugs in ChEMBL resulted in the identification of 116 drugs targeting 29 proteins, several of which, such as Rosiglitazone, were previously linked to Polycystic Kidney Disease. Further testing the efficacy of six candidate drugs for inhibition of cyst swelling using a human 3D-cyst assay revealed that three of the six had cyst-growth reducing effects with limited toxicity. Our data further establish drug repurposing as a robust drug discovery method, with three promising drug candidates identified for ADPKD treatment (Meclofenamic Acid, Gamolenic Acid and Birinapant). Our strategy, which combines multiple-omics data, can be extended for ADPKD and other diseases in the future. Funding: European Union's Seventh Framework Program, Dutch Technology Foundation Stichting Technische Wetenschappen and the Dutch Kidney Foundation.
Rooij, J. van; Mandaviya, P.R.; Claringbould, A.; Felix, J.F.; Dongen, J. van; Jansen, R.; ... ; BIOS Consortium 2019
Background: A large number of analysis strategies are available for DNA methylation (DNAm) array and RNA-seq datasets, but it is unclear which strategies are best to use. We compare commonly used strategies and report how they influence results in large cohort studies. Results: We tested the associations of DNAm and RNA expression with age, BMI, and smoking in four different cohorts (n ≈ 2900). By comparing strategies against the base model on the number and percentage of replicated CpGs for DNAm analyses or genes for RNA-seq analyses in a leave-one-out cohort replication approach, we find the choice of the normalization method and statistical test does not strongly influence the results for DNAm array data. However, adjusting for cell counts or hidden confounders substantially decreases the number of replicated CpGs for age and increases the number of replicated CpGs for BMI and smoking. For RNA-seq data, the choice of the normalization method, gene expression inclusion threshold, and statistical test does not strongly influence the results. Including five principal components or excluding correction of technical covariates or cell counts decreases the number of replicated genes. Conclusions: Results were not influenced by the normalization method or statistical test. However, the correction method for cell counts, technical covariates, principal components, and/or hidden confounders does influence the results.
Compounds that are candidates for drug repurposing can be ranked by leveraging knowledge available in the biomedical literature and databases. This knowledge, spread across a variety of sources, can be integrated within a knowledge graph, which thereby comprehensively describes known relationships between biomedical concepts, such as drugs, diseases, genes, etc. Our work uses the semantic information between drug and disease concepts as features, which are extracted from an existing knowledge graph that integrates 200 different biological knowledge sources. RepoDB, a standard drug repurposing database which describes drug-disease combinations that were approved or that failed in clinical trials, is used to train a random forest classifier. The 10-times repeated 10-fold cross-validation performance of the classifier achieves a mean area under the receiver operating characteristic curve (AUC) of 92.2%. We apply the classifier to prioritize 21 preclinical drug repurposing candidates that have been suggested for Autosomal Dominant Polycystic Kidney Disease (ADPKD). Mozavaptan, a vasopressin V2 receptor antagonist, is predicted to be the drug most likely to be approved after a clinical trial, and belongs to the same drug class as tolvaptan, the only treatment for ADPKD that is currently approved. We conclude that semantic properties of concepts in a knowledge graph can be exploited to prioritize drug repurposing candidates for testing in clinical trials.