Bacteriophages, or phages for short, are the most abundant biological entities in nature. They shape bacterial communities and are a major driving force in bacterial evolution. Their ubiquitous nature and their potential use in medical and industrial applications make them attractive targets for fundamental and applied scientific studies. Understanding their structure and function at the molecular level is essential for understanding phage life cycles. In this thesis, I applied different cryo-EM techniques combined with advanced image processing and artificial intelligence methods to gain insight into the structure and function of two bacteriophages. Both phages contain flexible elements that are essential for the infection process. While biologically highly interesting, these flexible components are especially challenging for structural studies. With the advances in computer technology and electron microscopy, researchers can now use various methods to study different proteins and the structure and function of biological macromolecular machines. The studies presented in this thesis provide valuable insights into phages with flexible components, and provide a useful workflow for researchers with similar research topics.
Transport inspectorates are looking for novel methods to identify dangerous behavior, ultimately to reduce the risks associated with the movement of people and goods. We explore a data-driven approach to arrive at smart inspections of vehicles. Inspections are smart when they are (1) accurate, (2) automated, (3) fair, and (4) interpretable. We leverage tools from the network science and machine learning domains to encode the behavior of vehicles. Tools used in this thesis include community detection, link prediction, and assortativity. We explore their applicability and provide technical methods. In the final chapter, we also discuss the matter of fairness in machine learning.
Sewer pipes are an essential infrastructure in modern society and their proper operation is important for public health. To keep sewer pipes operational as much as possible, periodic inspections for defects are performed. Instead of repairing sewer pipes when a problem becomes critical, such inspections allow municipalities to plan maintenance. Sewer pipe inspections are an attractive target for automation. Automation generally promises improved assessment quality and processing efficiency; in this case it would also decrease the variability, which is a current problem. Besides the reasons for automating, the methods for automating are also attractive: a lot of (visual) data has been gathered over the past decades which may be used to train algorithms. This thesis compiles the results of five years of research into the possible automation of sewer pipe inspections with the tools of machine learning and computer vision. It describes three distinct, yet complementary approaches to automating sewer pipe inspections: (1) image-based unsupervised anomaly detection, (2) convolutional neural network classification, and (3) stereovision and geometry reconstruction.
Novel entities may pose risks to humans and the environment. The small particle size and relatively large surface area of micro- and nanoparticles (MNPs) make them capable of adsorbing other novel entities, leading to the formation of aggregated contamination. In this dissertation, we utilized advanced computational methods, such as molecular simulation, data mining, machine learning, and quantitative structure-activity relationship modeling. These methods were used to investigate the mechanisms of interaction between MNPs and other novel entities, the joint toxic action of MNPs and other novel entities, the factors affecting their joint toxicity to ecological species, as well as to quantitatively predict the interaction forces between MNPs and other novel entities, and the toxicity of their mixtures. The results indicate that understanding the mechanisms of interactions between novel entities and their modes of joint toxic action can provide an important theoretical basis for establishing effective risk assessment procedures to mitigate the effects of novel entities on ecosystems and human health. Furthermore, this dissertation provides important technical support and a practical basis for the quantitative prediction of the environmental behavior and toxicological effects of novel entities and their mixtures by applying various advanced in silico methods individually or in combination.
This thesis looks at Artificial Intelligence (AI) and its potential to revolutionise the healthcare sector. The first part of this thesis focuses on the responsible development and validation of AI-based clinical prediction algorithms, exploring the prime considerations in this process. The second part of this thesis addresses the opportunities for classical statistics and machine learning techniques for developing prediction algorithms. It also examines the performance, potential, and challenges of AI prediction algorithms for clinical practice. The conclusion states that cross-discipline collaboration, exchangeability of knowledge and results, and validation of AI for healthcare practice are essential for realising the potential of AI in healthcare.
In this thesis, we examine various systems through the lens of several numerical methods. We delve into questions concerning thermalization in closed unitary systems, lattice gauge theories, and the intriguing properties of deep neural network phase spaces. Leveraging modern advancements in both software and hardware, we scrutinize these systems in greater detail, accessing previously unreachable regimes.
This dissertation investigates the early recognition of persistent somatic symptoms (PSS) in primary care. A stepwise approach was used to map the optimal methods for re-using primary care records for predictive modeling of PSS. This is important since up to 10% of the general population experiences PSS. Moreover, general practitioners (GPs) often encounter difficulties in recognizing PSS, which may delay adequate intervention, subsequently resulting in an unnecessarily high burden on the patient and the health care system. The findings from this dissertation show that a complex interplay between factors from all biopsychosocial domains contributes to PSS onset. Survey results show that GPs differ in their methods of PSS registration. Many GPs indicate missing an unambiguous classification scheme and report needing more support, tools, and/or education for PSS-related consultations. Predictive modeling of different PSS syndromes shows both overlapping and syndrome-specific predictors. Early predictive modeling of the broad spectrum of PSS shows moderate predictive accuracy based on seven approaches for candidate-predictor selection, including theory-driven and temporal and non-temporal data-driven approaches. In conclusion, this dissertation provides comprehensive evidence of the complexity of identification of PSS. Furthermore, it indicates that simple data-driven approaches could support PSS classification in primary care, although this should be combined with a multidisciplinary care approach.
Radiography is an important technique to inspect objects, with applications in airports and hospitals. X-ray imaging is also essential in industry, for instance in food safety checks for the presence of foreign objects. Computed tomography (CT) enables more accurate visualizations of an object in 3D, but requires more computation time. Spectral X-ray imaging is an important recent development to optimize these conflicting goals of speed and accuracy. This technique enables separation of detected X-ray photons in terms of energy. More information can be extracted from spectral images, which allows for better separation of materials. Deep learning is another important recent technique, enabling machines to quickly carry out processing tasks after being trained on large volumes of data for these specific tasks. In this dissertation we present new processing methods that use spectral imaging and machine learning, with a special focus on industrial processes. We design a workflow using CT to efficiently generate large volumes of machine learning training data. In addition, we develop a compression method for efficient processing of large volumes of spectral data and two new spectral CT methods to produce more accurate reconstructions. The presented methods are designed for effective use in industry.
The focus of this thesis is on the technical methods which help promote the movement towards Trustworthy AI, specifically within the Inspectorate of the Netherlands. The goal is to develop and assess the technical methods which are required to shift the actions of the Inspectorate to a data-driven paradigm, concretely under a supervised classification framework of machine learning. The aspect of reliability is addressed as a data quality concern, viz. missingness and noise. The aspect of fairness is addressed as a counter to bias in the selection process of inspections. The conclusion is that, whilst no complete solution has yet been suggested, it is possible to address the concerns related to data quality and data bias, culminating in well-performing classification models which are reliable and fair.
Stroke is one of the leading causes of disability and death worldwide. Prevention of stroke is therefore essential. Effective prevention should be tailored to, among other things, the clinical characteristics, lifestyle, and environment of the individual. This is also known as precision prevention. An important example illustrating the need for precision prevention is the existence of sex differences in stroke occurrence. In practice, for predicting stroke risk, only traditional risk factors (such as smoking and hypertension) are included, and women-specific risk factors are not yet routinely included. As a result, women with an increased risk of stroke may be missed, which also prevents timely initiation of preventive treatments. In this thesis, I tried to lay the foundation for precision prevention of stroke in women. Part I discussed the pathophysiology underlying women-specific risk factors for stroke, and gender differences in the clinical presentation of stroke. I found that the mechanisms underlying the relationship between women-specific risk factors and stroke, in particular the relationship between migraine and cerebral infarctions, seem to be particularly significant in the childbearing phase of life. In Part II, I described how health data from the EHR can be used to develop prediction models for the risk of myocardial infarction or stroke specifically for women under 50 years of age, and found that women-specific risk factors can add value in the predictions. However, there is still a long way to go before these models can actually be implemented in practice, which includes testing them on new datasets and complying with current laws and regulations for safe application.
Learning software design is known to be a difficult and challenging task for students. This dissertation studies different didactic approaches for learning software design to improve the way we teach students software design. The research in the dissertation questions whether we can assess software design skills, what guidance is needed to improve students’ understanding of software design, and how to motivate and engage students to learn software design. The research explores the following: an instrument for measuring software design skills based on design principles, the gamification of learning software design, revealing students’ software design strategies, the use of peer-reflection for uncovering the difficulties students have during software design tasks, the use of teaching assistants as a bridge between the lecturer and the students, the automation of grading software designs with machine learning, guiding feedback by a pedagogical agent, and a workshop for engaging students in the process of software development. The research contributes to the future education of software design.
Inverse problems are problems where we want to estimate the values of certain parameters of a system given observations of the system. Such problems occur in several areas of science and engineering. Inverse problems are often ill-posed, which means that the observations of the system do not uniquely define the parameters we seek to estimate, or that the solution is highly sensitive to small changes in the observation. In order to solve such problems, therefore, we need to make use of additional knowledge about the system at hand. One such form of prior information is given by the notion of sparsity. Sparsity refers to the knowledge that the solution to the inverse problem can be expressed as a combination of a few terms. The sparsity of a solution can be controlled explicitly or implicitly. An explicit way to induce sparsity is to minimize the number of non-zero terms in the solution. Implicit use of sparsity can be made, for example, by making adjustments to the algorithm used to arrive at the solution. In this thesis we studied various inverse problems that arise in different application areas, such as tomographic imaging and equation learning for biology, and showed how ideas of sparsity can be used in each case to design effective algorithms to solve such problems.
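As an illustration of the sparsity idea in the abstract above, here is a minimal sketch that is not taken from the thesis. Minimizing the number of non-zero terms directly is combinatorial, so the sketch swaps in a common convex surrogate: an L1 penalty solved by iterative soft-thresholding (ISTA). The function names, the toy system `A`, `b`, and the values of `lam` and `step` are all assumptions chosen for the example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within [-threshold, threshold] becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(A, b, lam, step, iterations=500):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 by iterative soft-thresholding.

    A is a list of rows, b a list of observations. The step size is assumed
    to be at most 1 / (largest eigenvalue of A^T A) so the iteration converges.
    """
    rows, cols = len(A), len(A[0])
    x = [0.0] * cols
    for _ in range(iterations):
        # Residual of the data-fit term: A x - b
        residual = [sum(A[i][j] * x[j] for j in range(cols)) - b[i] for i in range(rows)]
        # Gradient of 0.5 * ||A x - b||^2, i.e. A^T (A x - b)
        gradient = [sum(A[i][j] * residual[i] for i in range(rows)) for j in range(cols)]
        # Gradient step followed by the shrinkage that promotes sparsity
        x = [soft_threshold(x[j] - step * gradient[j], step * lam) for j in range(cols)]
    return x


# Hypothetical ill-posed toy problem: two observations, three unknowns,
# so the observations alone do not determine x uniquely.
A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
b = [1.0, 1.0]
solution = ista(A, b, lam=0.1, step=0.3)
print([round(v, 3) for v in solution])
```

Among the infinitely many vectors that fit both observations, the penalty selects one with a single non-zero entry. That entry is slightly shrunk below 1, which is the usual price of L1 regularization.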
The societal burden of spinal conditions is vast and continues to grow with the increasing prevalence of patients with spinal degenerative disease, spinal metastases, and spinal infections. Recent applications of artificial intelligence in healthcare have shown great promise and similar extensions in spine surgery may improve decision-making. The purpose of this thesis was to examine the utility of predictive analytics and natural language processing in spine surgery.
The aim of this thesis is to determine the diagnostic performance of machine learning in differentiating between atypical cartilaginous tumor (ACT) and high-grade chondrosarcoma (CS) based on radiomic features derived from magnetic resonance imaging (MRI) and computed tomography (CT). In chapter 2, the concept of radiomics of musculoskeletal sarcomas is introduced and a systematic review on radiomic feature reproducibility and validation strategies is conducted. In chapter 3, a preliminary study is performed to investigate the performance of MRI radiomics-based machine learning in discriminating ACT from high-grade CS, using a single-center cohort, in comparison with an expert radiologist. In chapter 4, the influence of interobserver segmentation variability on the reproducibility of CT and MRI radiomic features of cartilaginous bone tumors is assessed. In chapter 5, the performance of CT radiomics-based machine learning in discriminating ACT from high-grade CS of long bones is determined and validated using independent data from a multicenter cohort, in comparison with an expert radiologist. In chapter 6, the performance of MRI radiomics-based machine learning in differentiating between ACT and grade II CS of long bones is determined and validated using independent data from a multicenter cohort, in comparison with an expert radiologist. Finally, in chapter 7, the main results and implications of this thesis are summarized and discussed.
Despite improved surgical and adjuvant treatment options, malignant brain tumors remain non-curable to date. The thin line between treatment effectiveness and patient harm underpins the importance of tailoring clinical management to the individual brain tumor patient. Over the past decades, the volume and complexity of clinically derived patient data (e.g., imaging, genomics, free text) have increased exponentially. Machine learning provides a vast range of algorithms that can learn from these data and guide clinical decision-making by providing accurate patient-level predictions. The current thesis describes several studies along the continuum of the machine learning spectrum as it applies to neurosurgical oncology. Part I investigates postoperative complications and risk factors in patients operated on for a primary malignant brain tumor. Part II describes the development of a model for the prediction of individual-patient survival in glioblastoma patients. Part III encompasses the development of a natural language processing framework for automated medical text analysis. Machine learning algorithms should be considered as an extension to statistical approaches and exist along a continuum determined by how much is specified by humans and how much is learnt by the machine. Although machine learning algorithms can produce highly accurate predictions based on high-dimensional data, clinicians and researchers should interpret the clinical implications of these predictions on a case-by-case basis.
Image registration is the process of aligning images by finding the spatial relation between the images. Given two images, called the fixed and moving images, taken at different times, at different spatial locations, or via different imaging techniques, the aim of image registration is to find an optimal transformation that aligns the fixed and the moving images. Performing automatic, fast image registration with less manual fine-tuning can speed up numerous medical image processing procedures. In addition, an automatic quality assessment of registration can speed up this time-consuming task. In this thesis, we developed a fast learning-based image registration technique called RegNet. Predicting registration error can be useful for evaluation of registration procedures, which is important for the adoption of registration techniques in the clinic. In addition, quantitative error prediction can be helpful in improving the registration quality. In this thesis, we proposed two quality assessment mechanisms using random forests (RF) and convolutional long short-term memory (ConvLSTM), of which the latter is faster and more accurate.
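The search for an optimal transformation described above is conventionally written as an optimization problem. The formulation below is a generic sketch, not the specific objective used in the thesis: the symbols $I_F$ (fixed image), $I_M$ (moving image), $T$ (transformation), $\mathcal{C}$ (dissimilarity measure), $\mathcal{R}$ (regularizer) and the weight $\lambda$ are assumptions introduced for illustration. A learning-based method such as RegNet predicts the transformation directly with a trained network instead of solving this problem iteratively for each image pair.

```latex
\hat{T} = \operatorname*{arg\,min}_{T} \;
          \mathcal{C}\bigl(I_F,\; I_M \circ T\bigr) + \lambda\, \mathcal{R}(T)
```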
In this work, we attempt to answer the question: "How to learn robust and interpretable rule-based models from data for machine learning and data mining, and define their optimality?" Rules provide a simple form of storing and sharing information about the world. As humans, we use rules every day, such as a physician who diagnoses someone with the flu using a rule of the form "if a person has either a fever or a sore throat (among others), then she has the flu". Even though an individual rule can only describe simple events, several aggregated rules can represent more complex scenarios, such as the complete set of diagnostic rules employed by a physician. The use of rules spans many fields in computer science, and in this dissertation, we focus on rule-based models for machine learning and data mining. Machine learning focuses on learning the model that best predicts future (previously unseen) events from historical data. Data mining aims to find interesting patterns in the available data. To answer our question, we use the Minimum Description Length (MDL) principle, which allows us to define the statistical optimality of rule-based models. Furthermore, we empirically show that this formulation is highly competitive for real-world problems.
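For readers unfamiliar with the MDL principle mentioned above, its simplest (two-part) form selects the model that gives the shortest total description of the data. This is a textbook sketch; the thesis may rely on a refined variant, and the symbols $D$ (data), $\mathcal{M}$ (set of candidate rule-based models), $L(M)$ (code length of the model, in bits) and $L(D \mid M)$ (code length of the data encoded with the model) are assumptions introduced here.

```latex
M^{*} = \operatorname*{arg\,min}_{M \in \mathcal{M}} \;
        \bigl[\, L(M) + L(D \mid M) \,\bigr]
```

The first term penalizes complex rule sets and the second penalizes poor fit, so the minimizer trades model complexity against fit without a separately tuned regularization weight.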
Particles are omnipresent in biopharmaceutical products. In protein-based therapeutics such particles are generally associated with impurities, either derived from the drug product itself (e.g., protein aggregates) or from extrinsic contamination (e.g., cellulose fibers). These impurities can affect product stability, as well as cause adverse effects once introduced into the human body. Particulate impurities are present over a wide range of sizes (from nanometers to millimeters), making them difficult to characterize using a single method. Novel drug products may also contain particles that act as the active pharmaceutical ingredient (API; e.g., living cells) or as a drug delivery vehicle (e.g., lipid nanoparticles). Unwanted immunotoxicity and inconsistent in vivo functionality can result from particle instability and aggregate formation. Therefore, the efficacy and safety of these therapeutics depend on the particle composition, quantity and size distribution. Consequently, well-established methods are required to quantify and characterize particles in the submicron and micron size ranges. In this thesis, we developed new approaches which allow for comprehensive characterization of the particle populations present in biopharmaceutical products, either as impurities or as API. Furthermore, the performed work focused on comparing different particle characterization techniques to allow a better understanding of the limitations and strengths of each method applied.
The ongoing increase in antimicrobial resistance combined with the low discovery rate of novel antibiotics is a serious threat to our health care. Genome mining has given new potential to the field of natural product discovery, as thousands of biosynthetic gene clusters (BGCs) are discovered for which the natural product is not known. Ribosomally synthesized and post-translationally modified peptides (RiPPs) represent a highly diverse class of natural products. The large number of different modifications that can be applied to a RiPP results in a large variety of chemical structures, but also stems from a large genetic variety in BGCs. As a result, no single method can effectively mine for all RiPP BGCs, making them an interesting source of new molecules. In this thesis, new methods are explored to mine genomes for the BGCs of novel RiPP variants, with a focus on discovering RiPPs that have new modifications. RRE-Finder is a new tool for the detection of RiPP Recognition Elements, domains that are often found in RiPP BGCs. DecRiPPter is another tool that employs machine learning models to discover new RiPP precursor genes encoded in the genomes. Both tools can be used to prioritize novel RiPP BGCs. Two candidate BGCs are characterized, one of which could be shown to specify a new RiPP, validating the approach.
Inflammatory Bowel Diseases (IBD) such as Crohn’s disease (CD) and ulcerative colitis (UC) are chronic immunological digestive diseases with a progressive character and are associated with significant healthcare costs. Different solutions have been proposed, such as innovation in care monitoring or implementation of electronic health (eHealth). IBD is one of many chronic diseases that could benefit from eHealth: adding smartphone applications to the toolbox for care management has the potential to improve disease understanding, enhance medication adherence, improve patient-physician communication, and enable earlier interventions by medical professionals when problems arise. Furthermore, the accessibility of Big Data and increased computational resources have paved the way for Artificial Intelligence (AI) to provide potential solutions for the management of prototypical complex diseases with advanced heterogeneity and alternating disease states, like IBD. In this thesis we assessed the current economic and psychosocial impact of IBD by assessing its effect on indirect costs, productivity and caregiving. Furthermore, we investigated whether we can proactively identify IBD patients’ needs using eHealth and Artificial Intelligence. Lastly, we analyzed the impact of monitoring IBD patients using eHealth interventions in order to facilitate the delivery of high-value care.