Background Patient experience surveys often include free-text responses. These responses are time-consuming to analyze and are often underutilized. This study examined whether Natural Language Processing (NLP) techniques could provide a data-driven, hospital-independent solution to indicate points for quality improvement. Methods This retrospective study used routinely collected patient experience data from two hospitals. A data-driven NLP approach was used. Free-text responses were categorized into topics and subtopics (i.e. n-grams) and labelled with a sentiment score. The indicator 'impact', combining sentiment and frequency, was calculated to reveal topics to improve, monitor or celebrate. The topic modelling architecture was tested on data from a second hospital to examine whether the architecture is transferable to another hospital. Results A total of 38,664 survey responses from the first hospital resulted in 127 topics and 294 n-grams. The indicator 'impact' revealed n-grams to celebrate (15.3%), improve (8.8%), and monitor (16.7%). For hospital 2, a similar percentage of free-text responses could be labelled with a topic and n-grams. Between hospitals, most topics (69.7%) were similar, but 32.2% of topics for hospital 1 and 29.0% of topics for hospital 2 were unique. Conclusions In both hospitals, NLP techniques could be used to categorize patient experience free-text responses into topics and sentiment labels and to define priorities for improvement. The model's architecture was shown to be hospital-specific as it was able to discover new topics for the second hospital. These methods should be considered for future patient experience analyses to make better use of this valuable source of information.
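The abstract above describes the 'impact' indicator only as a combination of sentiment and frequency. A minimal sketch of one way such an indicator could be computed follows; the formula (frequency share times mean sentiment), the threshold and the names `impact_scores` and `classify` are assumptions for illustration, not the study's definition.

```python
from collections import defaultdict

def impact_scores(responses):
    """Compute a hypothetical impact per n-gram.

    responses: iterable of (ngram, sentiment) pairs, sentiment in [-1, 1].
    Assumed formula: impact = frequency share * mean sentiment.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ngram, sentiment in responses:
        totals[ngram] += sentiment
        counts[ngram] += 1
    n = sum(counts.values())
    return {g: (counts[g] / n) * (totals[g] / counts[g]) for g in counts}

def classify(impact, threshold=0.05):
    """Map an impact value to a priority label (threshold is an assumption)."""
    if impact >= threshold:
        return "celebrate"
    if impact <= -threshold:
        return "improve"
    return "monitor"

# Toy data, invented for illustration only.
responses = (
    [("waiting time", -0.8)] * 3
    + [("friendly staff", 0.9)] * 5
    + [("parking", -0.1)] * 2
)
labels = {g: classify(v) for g, v in impact_scores(responses).items()}
print(labels)
```

Frequent n-grams with strongly negative sentiment end up in "improve", frequent positive ones in "celebrate", and everything near zero in "monitor".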
Background: Autosomal Dominant Polycystic Kidney Disease (ADPKD) is one of the most common causes of end-stage renal failure, caused by mutations in PKD1 or PKD2 genes. Tolvaptan, the only drug approved for ADPKD treatment, results in serious side-effects, warranting the need for novel drugs. Methods: In this study, we applied RNA-sequencing of Pkd1cko mice at different disease stages, and with/without drug treatment, to identify genes involved in ADPKD progression that were further used to identify novel drug candidates for ADPKD. We followed an integrative computational approach using a combination of gene expression profiling, bioinformatics and cheminformatics data. Findings: We identified 1162 genes that had a normalized expression after treating the mice with drugs proven effective in preclinical models. Intersecting these genes with target affinity profiles for clinically-approved drugs in ChEMBL resulted in the identification of 116 drugs targeting 29 proteins; several of these drugs, such as Rosiglitazone, have previously been linked to Polycystic Kidney Disease. Further testing the efficacy of six candidate drugs for inhibition of cyst swelling using a human 3D-cyst assay revealed that three of the six had cyst-growth reducing effects with limited toxicity. Interpretation: Our data further establish drug repurposing as a robust drug discovery method, with three promising drug candidates identified for ADPKD treatment (Meclofenamic Acid, Gamolenic Acid and Birinapant). Our strategy, which combines multiple-omics data, can be extended for ADPKD and other diseases in the future. Funding: European Union's Seventh Framework Program, Dutch Technology Foundation Stichting Technische Wetenschappen and the Dutch Kidney Foundation. © 2019 The Authors. Published by Elsevier B.V.
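The intersection step described in the ADPKD abstract (genes normalized by treatment, crossed with the targets of approved drugs) can be sketched as a set intersection. This is an illustrative sketch only: the function name `candidate_drugs`, the gene symbols and the drug names are placeholders, and no ChEMBL query or affinity cut-off is modelled.

```python
def candidate_drugs(normalized_genes, drug_targets):
    """Return drugs with at least one target among the normalized genes.

    normalized_genes: set of gene symbols (placeholder names below).
    drug_targets: dict mapping drug name -> set of target gene symbols,
                  assumed to be prepared beforehand from a drug-target resource.
    """
    hits = {}
    for drug, targets in drug_targets.items():
        shared = targets & normalized_genes
        if shared:
            hits[drug] = shared
    return hits

# Placeholder data, invented for illustration only.
normalized_genes = {"GENE_A", "GENE_B", "GENE_C"}
drug_targets = {
    "drug_1": {"GENE_A", "GENE_X"},
    "drug_2": {"GENE_Y"},
    "drug_3": {"GENE_B", "GENE_C"},
}
hits = candidate_drugs(normalized_genes, drug_targets)
proteins = set().union(*hits.values())
print(len(hits), "drugs targeting", len(proteins), "proteins")
```

Counting the surviving drugs and the union of their matched targets gives the kind of "N drugs targeting M proteins" summary reported in the abstract.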
Southall, N.T.; Natarajan, M.; Lau, L.P.L.; Jonker, A.H.; Deprez, B.; Guilliams, T.; ... ; Thompson, R. 2019
The number of available therapies for rare diseases remains low, as fewer than 6% of rare diseases have an approved treatment option. The International Rare Diseases Research Consortium (IRDiRC) set up the multi-stakeholder Data Mining and Repurposing (DMR) Task Force to examine the potential of applying biomedical data mining strategies to identify new opportunities to use existing pharmaceutical compounds in new ways and to accelerate the pace of drug development for rare disease patients. In reviewing past successes of data mining for drug repurposing, and planning for future biomedical research capacity, the DMR Task Force identified four strategic infrastructure investment areas to focus on in order to accelerate rare disease research productivity and drug development: (1) improving the capture and sharing of self-reported patient data, (2) better integration of existing research data, (3) increasing experimental testing capacity, and (4) sharing of rare disease research and development expertise. Additionally, the DMR Task Force recommended a number of strategies to increase data mining and repurposing opportunities for rare diseases research as well as the development of individualized and precision medicine strategies.
Malas, T.B.; Vlietstra, W.J.; Kudrin, R.; Starikov, S.; Charrout, M.; Roos, M.; ... ; Hettne, K.M. 2019
Compounds that are candidates for drug repurposing can be ranked by leveraging knowledge available in the biomedical literature and databases. This knowledge, spread across a variety of sources, can be integrated within a knowledge graph, which thereby comprehensively describes known relationships between biomedical concepts, such as drugs, diseases, genes, etc. Our work uses the semantic information between drug and disease concepts as features, which are extracted from an existing knowledge graph that integrates 200 different biological knowledge sources. RepoDB, a standard drug repurposing database which describes drug-disease combinations that were approved or that failed in clinical trials, is used to train a random forest classifier. In 10-times repeated 10-fold cross-validation, the classifier achieves a mean area under the receiver operating characteristic curve (AUC) of 92.2%. We apply the classifier to prioritize 21 preclinical drug repurposing candidates that have been suggested for Autosomal Dominant Polycystic Kidney Disease (ADPKD). Mozavaptan, a vasopressin V2 receptor antagonist, is predicted to be the drug most likely to be approved after a clinical trial, and belongs to the same drug class as tolvaptan, the only treatment for ADPKD that is currently approved. We conclude that semantic properties of concepts in a knowledge graph can be exploited to prioritize drug repurposing candidates for testing in clinical trials.
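The evaluation protocol named in the abstract, repeated k-fold cross-validation summarized as a mean AUC, can be sketched with the standard library alone. The sketch below is not the authors' pipeline: `fit_predict` is a hypothetical stand-in for training the random forest and scoring held-out drug-disease pairs, and the fold construction (shuffled, unstratified) is an assumption.

```python
import random
from statistics import mean

def auc(labels, scores):
    """Rank-based AUC: probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def repeated_kfold_auc(X, y, fit_predict, k=10, repeats=10, seed=0):
    """Mean AUC over `repeats` shuffled k-fold splits.

    fit_predict(X_train, y_train, X_test) -> list of scores (hypothetical callback).
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        for test in folds:
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            y_test = [y[i] for i in test]
            if len(set(y_test)) < 2:
                continue  # AUC is undefined when a fold holds a single class
            scores = fit_predict(
                [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
            )
            aucs.append(auc(y_test, scores))
    return mean(aucs)

# Toy check: a "model" that scores each item by its single feature value.
X = [[i] for i in range(40)]
y = [0] * 20 + [1] * 20
result = repeated_kfold_auc(X, y, lambda Xtr, ytr, Xte: [x[0] for x in Xte], k=5)
print(result)
```

Averaging over repeats reduces the dependence of the reported AUC on any one random partition of the data.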
Medication for nonalcoholic fatty liver disease (NAFLD) is an unmet need. Glucocorticoid (GC) stress hormones drive fat metabolism in the liver, but both full blockade and full stimulation of GC signaling aggravate NAFLD pathology. We investigated the efficacy of selective glucocorticoid receptor (GR) modulator CORT118335, which recapitulates only a subset of GC actions, in reducing liver lipid accumulation in mice. Male C57BL/6J mice received a low-fat diet or high-fat diet mixed with vehicle or CORT118335. Livers were analyzed histologically and for genome-wide mRNA expression. Functionally, hepatic long-chain fatty acid (LCFA) composition was determined by gas chromatography. We determined very-low-density lipoprotein (VLDL) production by treatment with a lipoprotein lipase inhibitor after which blood was collected to isolate radiolabeled VLDL particles and apoB proteins. CORT118335 strongly prevented and reversed hepatic lipid accumulation. Liver transcriptome analysis showed increased expression of GR target genes involved in VLDL production. Accordingly, CORT118335 led to increased lipidation of VLDL particles, mimicking physiological GC action. Independent pathway analysis revealed that CORT118335 lacked induction of GC-responsive genes involved in cholesterol synthesis and LCFA uptake, which was indeed reflected in unaltered hepatic LCFA uptake in vivo. Our data thus reveal that the robust hepatic lipid-lowering effect of CORT118335 is due to a unique combination of GR-dependent stimulation of lipid (VLDL) efflux from the liver, with a lack of stimulation of GR-dependent hepatic fatty acid uptake. Our findings firmly demonstrate the potential use of CORT118335 in the treatment of NAFLD and underscore the potential of selective GR modulation in metabolic disease.
Background: Spinocerebellar ataxia type 3 (SCA3) is a progressive neurodegenerative disorder caused by expansion of the polyglutamine repeat in the ataxin-3 protein. Expression of mutant ataxin-3 is known to result in transcriptional dysregulation, which can contribute to the cellular toxicity and neurodegeneration. Since the exact causative mechanisms underlying this process have not been fully elucidated, gene expression analyses in brains of transgenic SCA3 mouse models may provide useful insights. Methods: Here we characterised the MJD84.2 SCA3 mouse model expressing the mutant human ataxin-3 gene using a multi-omics approach on brain and blood. Gene expression changes in brainstem, cerebellum, striatum and cortex were used to study pathological changes in brain, while blood gene expression and metabolite/lipid levels were examined as potential biomarkers for disease. Results: Despite normal motor performance at 17.5 months of age, transcriptional changes in brain tissue of the SCA3 mice were observed. Most transcriptional changes occurred in brainstem and striatum, whilst cerebellum and cortex were only modestly affected. The most significantly altered genes in SCA3 mouse brain were Tmc3, Zfp488, Cart, and Chdh. Based on the transcriptional changes, α-adrenergic and CREB pathways were most consistently altered for combined analysis of the four brain regions. When examining individual brain regions, axon guidance and synaptic transmission pathways were most strongly altered in striatum, whilst brainstem presented with strongest alterations in the PI3K cascade and cholesterol biosynthesis pathways. Similar to other neurodegenerative diseases, reduced levels of tryptophan and increased levels of ceramides, di- and triglycerides were observed in SCA3 mouse blood. Conclusions: The observed transcriptional changes in SCA3 mouse brain reveal parallels with previously reported neuropathology in patients, but also show brain region specific effects as well as involvement of adrenergic signalling and CREB pathway changes in SCA3. Importantly, the transcriptional changes occur prior to onset of motor and coordination deficits.
Toonen, L.J.A.; Overzier, M.; Evers, M.M.; Leon, L.G.; Zeeuw, S.A.J. van der; Mei, H.L.; ... ; Roon-Mom, W.M.C. van 2018
Hereditary cerebral hemorrhage with amyloidosis-Dutch type (HCHWA-D) is an early onset hereditary form of cerebral amyloid angiopathy (CAA) caused by a point mutation resulting in an amino acid change (NP_000475.1:p.Glu693Gln) in the amyloid precursor protein (APP). Post-mortem frontal and occipital cortical brain tissue from nine patients and nine age-related controls was used for RNA sequencing to identify biological pathways affected in HCHWA-D. Although previous studies indicated that pathology is more severe in the occipital lobe in HCHWA-D compared to the frontal lobe, the current study showed similar changes in gene expression in frontal and occipital cortex and the two brain regions were pooled for further analysis. Significantly altered pathways were analyzed using gene set enrichment analysis (GSEA) on 2036 significantly differentially expressed genes. Main pathways over-represented by down-regulated genes were related to cellular aerobic respiration (including ATP synthesis and carbon metabolism) indicating a mitochondrial dysfunction. Principal up-regulated pathways were extracellular matrix (ECM)-receptor interaction and ECM proteoglycans in relation with an increase in the transforming growth factor beta (TGF beta) signaling pathway. Comparison with the publicly available dataset from pre-symptomatic APP-E693Q transgenic mice identified overlap for the ECM-receptor interaction pathway, indicating that ECM modification is an early disease specific pathomechanism.
Mastrokolias, A.; Pool, R.; Mina, E.; Hettne, K.M.; Duijn, E. van; Mast, R.C. van der; ... ; Roon-Mom, W. van 2016
Introduction Metabolic changes have been frequently associated with Huntington's disease (HD). At the same time peripheral blood represents a minimally invasive sampling avenue with little distress to Huntington's disease patients, especially when brain or other tissue samples are difficult to collect. Objectives We investigated the levels of 163 metabolites in HD patient and control serum samples in order to identify disease related changes. Additionally, we integrated the metabolomics data with our previously published next generation sequencing-based gene expression data from the same patients in order to interconnect the metabolomics changes with transcriptional alterations. Methods This analysis was performed using targeted metabolomics and flow injection electrospray ionization tandem mass spectrometry in 133 serum samples from 97 Huntington's disease patients (29 pre-symptomatic and 68 symptomatic) and 36 controls. Results By comparing HD mutation carriers with controls we identified 3 metabolites significantly changed in HD (serine, threonine and one phosphatidylcholine, PC ae C36:0) and an additional 8 phosphatidylcholines (PC aa C38:6, PC aa C36:0, PC ae C38:0, PC aa C38:0, PC ae C38:6, PC ae C42:0, PC aa C36:5 and PC ae C36:0) that exhibited a significant association with disease severity. Using workflow based exploitation of pathway databases and by integrating our metabolomics data with our gene expression data from the same patients we identified 4 deregulated phosphatidylcholine metabolism related genes (ALDH1B1, MBOAT1, MTRR and PLB1) that showed significant association with the changes in metabolite concentrations. Conclusion Our results support the notion that phosphatidylcholine metabolism is deregulated in HD blood and that these metabolite alterations are associated with specific gene expression changes.
Mina, E.; Roon-Mom, W. van; Hettne, K.M.; Zwet, E. van; Goeman, J.; Neri, C.; ... ; Roos, M. 2016
Background: Huntington's disease (HD) is a devastating brain disorder with no effective treatment or cure available. The scarcity of brain tissue makes it hard to study changes in the brain and impossible to perform longitudinal studies. However, peripheral pathology in HD suggests that it is possible to study the disease using peripheral tissue as a monitoring tool for disease progression and/or efficacy of novel therapies. In this study, we investigated whether blood can be used to monitor disease severity and progression in brain. Since previous attempts using only gene expression proved unsuccessful, we compared blood and brain Huntington's disease signatures in a functional context. Methods: Microarray HD gene expression profiles from three brain regions were compared to the transcriptome of HD blood generated by next generation sequencing. The comparison was performed with a combination of weighted gene co-expression network analysis and literature based functional analysis (Concept Profile Analysis). Uniquely, our comparison of blood and brain datasets was not based on (the very limited) gene overlap but on the similarity between the gene annotations in four different semantic categories: "biological process", "cellular component", "molecular function" and "disease or syndrome". Results: We identified signatures in HD blood reflecting a broad pathophysiological spectrum, including alterations in the immune response, sphingolipid biosynthetic processes, lipid transport, cell signaling, protein modification, spliceosome, RNA splicing, vesicle transport and synaptic transmission. Part of this spectrum was reminiscent of the brain pathology. The HD signatures in caudate nucleus and BA4 exhibited the highest similarity with blood, irrespective of the category of semantic annotations used. BA9 exhibited an intermediate similarity, while cerebellum had the least similarity. We present two signatures that were shared between blood and brain: immune response and spinocerebellar ataxias. Conclusions: Our results demonstrate that HD blood exhibits dysregulation that is similar to brain at a functional level, but not necessarily at the level of individual genes. We report two common signatures that can be used to monitor the pathology in brain of HD patients in a non-invasive manner. Our results are an exemplar of how signals in blood data can be used to represent brain disorders. Our methodology can be used to study disease specific signatures in diseases where heterogeneous tissues are involved in the pathology.
Akhondi, S.A.; Pons, E.; Afzal, Z.; Haagen, H. van; Becker, B.F.H.; Hettne, K.M.; ... ; Kors, J.A. 2016
We describe the development of a chemical entity recognition system and its application in the CHEMDNER-patent track of BioCreative 2015. This community challenge includes a Chemical Entity Mention in Patents (CEMP) recognition task and a Chemical Passage Detection (CPD) classification task. We addressed both tasks by an ensemble system that combines a dictionary-based approach with a statistical one. For this purpose the performance of several lexical resources was assessed using Peregrine, our open-source indexing engine. We combined our dictionary-based results on the patent corpus with the results of tmChem, a chemical recognizer using a conditional random field classifier. To improve the performance of tmChem, we utilized three additional features, viz. part-of-speech tags, lemmas and word-vector clusters. When evaluated on the training data, our final system obtained an F-score of 85.21% for the CEMP task, and an accuracy of 91.53% for the CPD task. On the test set, the best system ranked sixth among 21 teams for CEMP with an F-score of 86.82%, and second among nine teams for CPD with an accuracy of 94.23%. The differences in performance between the best ensemble system and the statistical system separately were small.
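The F-score reported for the CEMP task is the harmonic mean of precision and recall over recognized mentions. The sketch below shows that computation on toy character spans and, as an assumption, combines the dictionary-based and statistical outputs by simple union; the abstract does not state how the ensemble merges them, and the spans are invented.

```python
def f_score(predicted, gold):
    """F1 over exact-match mention spans (sets of (start, end) tuples)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy spans, invented for illustration only.
gold = {(0, 7), (12, 19), (30, 38)}
dictionary_hits = {(0, 7), (40, 45)}
statistical_hits = {(0, 7), (12, 19)}

# Assumed combination rule: accept a mention if either recognizer found it.
ensemble_hits = dictionary_hits | statistical_hits

for name, hits in [
    ("dictionary", dictionary_hits),
    ("statistical", statistical_hits),
    ("ensemble", ensemble_hits),
]:
    print(name, round(f_score(hits, gold), 4))
```

A union raises recall at the cost of precision, which is one reason an ensemble may score only slightly differently from its statistical component, as the abstract notes.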
High-throughput experimental methods such as medical sequencing and genome-wide association studies (GWAS) identify increasingly large numbers of potential relations between genetic variants and diseases. Both biological complexity (millions of potential gene-disease associations) and the accelerating rate of data production necessitate computational approaches to prioritize and rationalize potential gene-disease relations. Here, we use concept profile technology to expose from the biomedical literature both explicitly stated gene-disease relations (the explicitome) and a much larger set of implied gene-disease associations (the implicitome). Implicit relations are largely unknown to, or are even unintended by the original authors, but they vastly extend the reach of existing biomedical knowledge for identification and interpretation of gene-disease associations. The implicitome can be used in conjunction with experimental data resources to rationalize both known and novel associations. We demonstrate the usefulness of the implicitome by rationalizing known and novel gene-disease associations, including those from GWAS. To facilitate the re-use of implicit gene-disease associations, we publish our data in compliance with FAIR Data Publishing recommendations [https://www.force11.org/group/fairgroup] using nanopublications. An online tool (http://knowledge.bio) is available to explore established and potential gene-disease associations in the context of other biomedical relations.
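The idea of an implicit gene-disease association can be illustrated by comparing two weighted concept profiles: a gene and a disease that are never mentioned together may still share intermediate concepts. The sketch below uses cosine similarity as the matching score, which is an assumption made for illustration; the concept names, weights and function name are placeholders, not values from the implicitome.

```python
import math

def profile_similarity(profile_a, profile_b):
    """Cosine similarity between two concept profiles (dict: concept -> weight)."""
    shared = profile_a.keys() & profile_b.keys()
    dot = sum(profile_a[c] * profile_b[c] for c in shared)
    norm = math.sqrt(sum(w * w for w in profile_a.values())) * math.sqrt(
        sum(w * w for w in profile_b.values())
    )
    return dot / norm if norm else 0.0

# Placeholder profiles, invented for illustration only.
gene_profile = {"concept_a": 1.0, "concept_b": 1.0}
disease_profile = {"concept_b": 1.0, "concept_c": 1.0}
unrelated_profile = {"concept_d": 1.0}

print(profile_similarity(gene_profile, disease_profile))
print(profile_similarity(gene_profile, unrelated_profile))
```

A non-zero score arises purely from the shared intermediate concept, which is the sense in which such a relation is implied rather than explicitly stated.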
Hettne, K.M.; Thompson, M.; Haagen, H.H.H.B.M. van; Horst, E. van der; Kaliyaperumal, R.; Mina, E.; ... ; Schultes, E.A. 2016