This thesis focuses on data found in the field of computational drug discovery. New insight can be obtained by applying machine learning in various ways and in a variety of domains. Two studies delved into the application of proteochemometrics (PCM), a machine learning technique that can be used to find relations in protein-ligand bioactivity data and then predict, using a virtual screen, whether compounds that had never been tested on a particular protein, or set of proteins, are likely to be active. With this, sets of compounds were suggested for experimental validation that were significant in a myriad of ways. Another study investigated the mutational patterns in cancer, applying a large dataset of mutation data and identifying several motifs in G protein-coupled receptors. The thesis also contains the work done on the Papyrus dataset, a large-scale bioactivity dataset that focuses on standardising data for computational drug discovery and providing an out-of-the-box set that can be used in a variety of settings.
Contrary to common belief, sign languages are distinct across different communities and cultures, evolving organically through interactions among deaf people, rather than being based on spoken languages. Each sign language has its own grammar, vocabulary, and cultural nuances, with variations even within a single country, showcasing the diverse communication methods within the deaf community. Deaf individuals often face encouragement to use spoken language techniques like lipreading or text communication, highlighting a bias towards spoken languages. This is compounded by the lack of sign languages in linguistic technologies, emphasizing the need for more inclusive research and development. This dissertation aims to address this gap using machine and deep learning to improve sign language processing and recognition. It covers six chapters, introducing methods for video-based sign annotation, webcam-based sign language dictionary search, and ranking systems for sign suggestions. It also explores tools for visualizing and comparing sign language variation, contributing valuable resources to linguistic research.
Bacteriophages, or phages for short, are the most abundant biological entity in nature. They shape bacterial communities and are a major driving force in bacterial evolution. Their ubiquitous nature and their potential use in medical and industrial applications make them attractive targets for fundamental and applied scientific studies. Understanding their structure and function at the molecular level is essential for understanding phage life cycles. In this thesis, I applied different cryo-EM techniques combined with advanced image processing and artificial intelligence methods to gain insight into structure and function of two bacteriophages. In both cases, these phages contain flexible elements which are essential for the infection process. While biologically highly interesting, these flexible components are especially challenging for structural studies. With the advances in computer technology and electron microscopy, researchers can now use various research methods to study different proteins and the structure and function of biological macromolecular machines. The studies presented in this thesis provide valuable insights into phages with flexible components, and provide a useful workflow for researchers with similar research topics.
The recent surge in deployment and use of generative machine learning models has sparked an interest in the relationships between AI and creativity, or more specifically in the question and debate of whether machines can exhibit human-level creativity. This is by no means a new discussion, going back decades if not centuries. The debate has been approached from multiple angles, and a general consensus has not yet been reached. In this position paper, we present the long-standing debate as it formed across various fields such as cognitive science, philosophy, and computing, approaching it mainly from a historical perspective. Along the way we identify how the various views relate to recent developments in machine learning models and argue our own position regarding the question of whether machines can exhibit human-level creativity. As such we aim to involve computer scientists and AI practitioners in the ongoing debate.
This thesis looks at Artificial Intelligence (AI) and its potential to revolutionise the healthcare sector. The first part of this thesis focuses on the responsible development and validation of AI-based clinical prediction algorithms, exploring the prime considerations in this process. The second part of this thesis addresses the opportunities for classical statistics and machine learning techniques for developing prediction algorithms. It also examines the performance, potential, and challenges of AI prediction algorithms for clinical practice. The conclusion states that cross-discipline collaboration, exchangeability of knowledge and results, and validation of AI for healthcare practice are essential for realising the potential of AI in healthcare.
This dissertation investigates the early recognition of persistent somatic symptoms (PSS) in primary care. A stepwise approach was used, mapping the optimal methods for re-using primary care records for predictive modeling of PSS. This is important since up to 10% of the general population experiences PSS. Moreover, general practitioners (GPs) often encounter difficulties in recognizing PSS, which may delay adequate intervention, subsequently resulting in an unnecessarily high burden on the patient and health care system. The findings from this dissertation show that a complex interplay between factors from all biopsychosocial domains contributes to PSS-onset. Survey results show that GPs differ in their methods of PSS-registration. Many GPs indicate missing an unambiguous classification scheme and report needing more support, tools, and/or education for PSS-related consultations. Predictive modeling of different PSS-syndromes shows both overlapping and syndrome-specific predictors. Early predictive modeling of the broad spectrum of PSS shows moderate predictive accuracy based on seven approaches for candidate-predictor selection, including theory-driven and temporal and non-temporal data-driven approaches. In conclusion, this dissertation provides comprehensive evidence of the complexity of identification of PSS. Furthermore, it indicates that simple data-driven approaches could support PSS classification in primary care, although this should be combined with a multidisciplinary care approach.
Stroke is one of the leading causes of disability and death worldwide. Prevention of stroke is therefore essential. Effective prevention should be tailored to the clinical characteristics, lifestyle, and environment of the individual, among others. This is also known as precision prevention. An important example illustrating the need for precision prevention is the existence of sex differences in stroke occurrence. In practice, for predicting stroke risk, only traditional risk factors (such as smoking and hypertension) are included, and women-specific risk factors are not yet routinely included. As a result, women with an increased risk of stroke may be missed, which also prevents timely initiation of preventive treatments. In this thesis, I tried to lay the foundation for precision prevention of stroke in women. Part I discussed the pathophysiology underlying women-specific risk factors for stroke, and gender differences in the clinical presentation of stroke. I found that the mechanisms underlying the relationship between women-specific risk factors and stroke, in particular the relationship between migraine and cerebral infarctions, seem to be particularly significant in the childbearing phase of life. In Part II, I described how health data from the EHR can be used to develop prediction models for the risk of myocardial infarction or stroke specifically for women under 50 years of age, and found that women-specific risk factors can add value in the predictions. However, there is still a long way to go to actually implement these models in practice, such as testing them on new datasets, and complying with current laws and regulations for safe application.
Despite improved surgical and adjuvant treatment options, malignant brain tumors remain non-curable to date. The thin line between treatment effectiveness and patient harms underpins the importance of tailoring clinical management to the individual brain tumor patient. Over the past decades, the volume and complexity of clinically-derived patient data (i.e., imaging, genomics, free-text etc.) have increased exponentially. Machine learning provides a vast range of algorithms that can learn from this data and guide clinical decision-making by providing accurate patient-level predictions. The current thesis describes several studies along the continuum of the machine learning spectrum as it applies to neurosurgical oncology. Part I investigates postoperative complications and risk factors in patients operated for a primary malignant brain tumor. Part II describes the development of a model for the prediction of individual-patient survival in glioblastoma patients. Part III encompasses the development of a natural language processing framework for automated medical text analysis. Machine learning algorithms should be considered as an extension to statistical approaches and exist along a continuum determined by how much is specified by humans and how much is learnt by the machine. Although machine learning algorithms can produce highly accurate predictions based on high-dimensional data, clinicians and researchers should interpret the clinical implications of these predictions on a case-by-case basis.
Aims: The aim of this study is to develop and validate a deep learning (DL) methodology capable of automated and accurate segmentation of intravascular ultrasound (IVUS) image sequences in real-time. Methods and results: IVUS segmentation was performed by two experts who manually annotated the external elastic membrane (EEM) and lumen borders in the end-diastolic frames of 197 IVUS sequences portraying the native coronary arteries of 65 patients. The IVUS sequences of 177 randomly-selected vessels were used to train and optimise a novel DL model for the segmentation of IVUS images. Validation of the developed methodology was performed in 20 vessels using the estimations of two expert analysts as the reference standard. The mean difference for the EEM, lumen and plaque area between the DL-methodology and the analysts was <0.23 mm² (standard deviation <0.85 mm²), while the Hausdorff and mean distance differences for the EEM and lumen borders were <0.19 mm (standard deviation <0.17 mm). The agreement between DL and experts was similar to experts' agreement (Williams Index ranges: 0.754-1.061) with similar results in frames portraying calcific plaques or side branches. Conclusions: The developed DL-methodology appears accurate and capable of segmenting high-resolution real-world IVUS datasets. These features are expected to facilitate its broad adoption and enhance the applications of IVUS in clinical practice and research.
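The area and Hausdorff comparisons mentioned in this abstract can be illustrated with a minimal sketch. The contours below are hypothetical (x, y) vertices in mm, not data from the study, and the Hausdorff distance is computed between vertex sets only, which approximates the distance between the full contours. The function names are illustrative and are not the authors' implementation.

```python
import math


def polygon_area(points):
    """Shoelace formula: area enclosed by a closed contour of (x, y) vertices."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the contour
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def directed_hausdorff(a, b):
    """Largest distance from any vertex in a to its nearest vertex in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two vertex sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical lumen borders (mm): one from a model, one from an expert.
model_lumen = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
expert_lumen = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.1), (0.0, 2.0)]

area_difference = polygon_area(model_lumen) - polygon_area(expert_lumen)
border_distance = hausdorff(model_lumen, expert_lumen)
print(f"area difference: {area_difference:.2f} mm^2")
print(f"Hausdorff distance: {border_distance:.2f} mm")
```

In a validation setting these two quantities would be computed per frame and then summarised as a mean and standard deviation across frames.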
Dotinga, M.; Dijk, J.D. van; Vendel, B.N.; Slump, C.H.; Portman, A.T.; Dalen, J.A. van 2021
Purpose: Our aim was to develop and validate a machine learning (ML)-based approach for interpretation of I-123 FP-CIT SPECT scans to discriminate Parkinson's disease (PD) from non-PD and to determine its generalizability and clinical value in two centers. Methods: We retrospectively included 210 consecutive patients who underwent I-123 FP-CIT SPECT imaging and had a clinically confirmed diagnosis. Linear support vector machine (SVM) was used to build a classification model to discriminate PD from non-PD based on I-123-FP-CIT striatal uptake ratios, age and gender of 90 patients. The model was validated on unseen data from the same center where the model was developed (n = 40) and consecutively on data from a different center (n = 80). Prediction performance was assessed and compared to the scan interpretation by expert physicians. Results: Testing the derived SVM model on the unseen dataset (n = 40) from the same center resulted in an accuracy of 95.0%, sensitivity of 96.0% and specificity of 93.3%. This was identical to the classification accuracy of nuclear medicine physicians. The model was generalizable towards the other center as prediction performance did not differ thereby obtaining an accuracy of 82.5%, sensitivity of 88.5% and specificity of 71.4% (p = NS). This was comparable to that of nuclear medicine physicians (p = NS). Conclusion: ML-based interpretation of I-123-FP-CIT scans results in accurate discrimination of PD from non-PD similar to visual assessment in both centers. The derived SVM model is therefore generalizable towards centers using comparable acquisition and image processing methods and implementation as diagnostic aid in clinical practice is encouraged.
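As a brief illustration of how accuracy, sensitivity and specificity relate to one another, the sketch below derives all three from confusion-matrix counts. The counts are an assumption for illustration only: the abstract does not report a confusion matrix, and these values were chosen because they are arithmetically consistent with the percentages quoted for the n = 40 test set. The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: PD patients classified as PD / as non-PD.
    tn/fp: non-PD patients classified as non-PD / as PD.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of PD cases detected
        "specificity": tn / (tn + fp),  # share of non-PD cases correctly excluded
    }


# Illustrative counts only (not taken from the study): 25 PD and 15 non-PD patients.
internal = classification_metrics(tp=24, fn=1, tn=14, fp=1)
for name, value in internal.items():
    print(f"{name}: {value:.1%}")
```

Because accuracy is a weighted average of sensitivity and specificity, it depends on the case mix of each test set, which is one reason to report all three when comparing centers.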
Andras, I.; Mazzone, E.; Leeuwen, F.W.B. van; Naeyer, G. de; Oosterom, M.N. van; Beato, S.; ... ; Mottrie, A. 2019
Artificial intelligence (AI) has transformed key aspects of human life. Machine learning (ML), which is a subset of AI wherein machines autonomously acquire information by extracting patterns from large databases, has been increasingly used within the medical community, and specifically within the domain of cardiovascular diseases. In this review, we present a brief overview of ML methodologies that are used for the construction of inferential and predictive data-driven models. We highlight several domains of ML application such as echocardiography, electrocardiography, and recently developed non-invasive imaging modalities such as coronary artery calcium scoring and coronary computed tomography angiography. We conclude by reviewing the limitations associated with contemporary application of ML algorithms within the cardiovascular disease field.
Background: Psychiatric disorders are highly heterogeneous, defined based on symptoms with little connection to potential underlying biological mechanisms. A possible approach to dissect biological heterogeneity is to look for biologically meaningful subtypes. A recent study by Drysdale et al. (2017) showed promising results along this line by simultaneously using resting state fMRI and clinical data and identified four distinct subtypes of depression with different clinical profiles and abnormal resting state fMRI connectivity. These subtypes were predictive of treatment response to transcranial magnetic stimulation therapy. Objective: Here, we attempted to replicate the procedure followed in the Drysdale et al. study and their findings in a different clinical population and a more heterogeneous sample of 187 participants with depression and anxiety. We aimed to answer the following questions: 1) Using the same procedure, can we find a statistically significant and reliable relationship between brain connectivity and clinical symptoms? 2) Is the observed relationship similar to the one found in the original study? 3) Can we identify distinct and reliable subtypes? 4) Do they have similar clinical profiles as the subtypes identified in the original study? Methods: We followed the original procedure as closely as possible, including a canonical correlation analysis to find a low dimensional representation of clinically relevant resting state fMRI features, followed by hierarchical clustering to identify subtypes. We extended the original procedure using additional statistical tests, to test the statistical significance of the relationship between resting state fMRI and clinical data, and the existence of distinct subtypes. Furthermore, we examined the stability of the whole procedure using resampling. Results and conclusion: As in the original study, we found extremely high canonical correlations between functional connectivity and clinical symptoms, and an optimal three-cluster solution. However, neither canonical correlations nor clusters were statistically significant. On the basis of our extensive evaluations of the analysis methodology used and within the limits of comparison of our sample relative to the sample used in Drysdale et al., we argue that the evidence for the existence of the distinct resting state connectivity-based subtypes of depression should be interpreted with caution.
Part of a series of digital guest lectures from Leiden University scholars for use in secondary school education. For more information, see: https://www.universiteitleiden.nl/gastlessen/cursussen/digitale-gastlessen/artificial-intelligence
This dissertation mainly focuses on interdisciplinary approaches for biomedical knowledge discovery. This required special efforts in developing systematic strategies to integrate various data sources and techniques, leading to improved discovery of mechanistic insights on human diseases. Chapter one looks at how combining various bioinformatics-based strategies can significantly improve the characterization of the OPMD mouse model. We discuss how this approach to knowledge discovery, on the basis of our extensive analysis, helped us to shed some light on how this model system relates to OPMD pathophysiology in humans. In Chapter two, we expand on this combinatory approach by conducting a cross-species data analysis. In this study, we looked for common patterns that emerge by assessing the transcriptome data from three OPMD model systems and patients. This strategy led to unravelling the most prominent molecular pathway involved in OPMD pathology. The third chapter pursues a similar goal: identifying shared molecular and pathophysiological features between OPMD and the common process of skeletal muscle ageing. Engaging in a study that focused on the universality of biological processes, in the light of evolutionary mechanisms and common functional features, led to novel discoveries. This work helped us uncover remarkable insights on molecular mechanisms of ageing muscles and protein aggregation. Chapters four and five take a different route by tackling the field of computational biology. These chapters aim to extend network inference by providing novel strategies for the exploitation and integration of multiple data sources. We show that these developments allow more robust regulatory mechanisms to be inferred while translations and predictions are made across very different datasets, platforms, and organisms. Finally, the dissertation is concluded by providing an outlook on ways the field of systems biology can evolve in order to offer enhanced, diversified and robust strategies for knowledge discovery.