Purpose: Machine learning (ML) algorithms represent an interesting alternative to maximum a posteriori Bayesian estimators (MAP-BE) for tacrolimus AUC estimation, but it is not known whether training an ML model on a smaller number of full pharmacokinetic (PK) profiles (i.e., the "true" reference AUC) provides better performance than training on a larger dataset of less accurate AUC estimates. The objectives of this study were to develop and benchmark ML algorithms trained using full PK profiles to estimate MeltDose®-tacrolimus individual AUCs using 2 or 3 blood concentrations, and to compare their performance to that of MAP-BE. Methods: Data from liver (n = 113) and kidney (n = 97) transplant recipients involved in MeltDose-tacrolimus PK studies were used for the training and evaluation of ML algorithms. The "true" AUC0-24 h was calculated for each patient using the trapezoidal rule on the full PK profile. ML algorithms were trained to estimate the true tacrolimus AUC using 2 or 3 blood concentrations. Performance was evaluated in 2 external sets of 16 (renal) and 48 (liver) transplant patients. Results: The best estimation performance was obtained with the MARS algorithm and the following limited sampling strategies (LSS): predose (0), 8, and 12 h post-dose (rMPE = -1.28%, rRMSE = 7.57%), or 0 and 12 h (rMPE = -1.9%, rRMSE = 10.06%). In the external datasets, the performance of the final ML algorithms based on two samples in kidney (rMPE = -3.1%, rRMSE = 11.1%) or liver transplant recipients (rMPE = -3.4%, rRMSE = 9.86%) was as good as or better than that of MAP-BEs based on three time points. Conclusion: The MARS ML models developed using "true" MeltDose®-tacrolimus AUCs yielded accurate individual estimations using only two blood concentrations.
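The quantities named in this abstract can be illustrated with a minimal Python sketch: an AUC computed with the linear trapezoidal rule, and the relative error metrics rMPE and rRMSE. The function names and the metric definitions used here (mean and root mean square of the relative error, expressed in percent) are assumptions made for illustration; the abstract does not state the exact formulas or any code used in the study.

```python
import math


def trapezoid_auc(times, concs):
    """Area under the concentration-time curve by the linear trapezoidal rule."""
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration points")
    auc = 0.0
    for i in range(1, len(times)):
        # Area of the trapezoid between two consecutive sampling times.
        auc += (times[i] - times[i - 1]) * (concs[i] + concs[i - 1]) / 2.0
    return auc


def _relative_errors(estimated, reference):
    # Relative error of each estimate against its reference value.
    return [(e - r) / r for e, r in zip(estimated, reference)]


def rmpe(estimated, reference):
    """Relative mean prediction error in percent (assumed definition)."""
    errs = _relative_errors(estimated, reference)
    return 100.0 * sum(errs) / len(errs)


def rrmse(estimated, reference):
    """Relative root mean square error in percent (assumed definition)."""
    errs = _relative_errors(estimated, reference)
    return 100.0 * math.sqrt(sum(e * e for e in errs) / len(errs))
```

Under these assumed definitions, rMPE captures bias (errors of opposite sign cancel), while rRMSE captures imprecision, which is why the two are usually reported together.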
Tannemaat, M.R.; Kefalas, M.; Geraedts, V.J.; Remijn-Nelissen, L.; Verschuuren, A.J.M.; Koch, M.; ... ; Bäck, T.H.W. 2022
Objective: Distinguishing normal, neuropathic and myopathic electromyography (EMG) traces can be challenging. We aimed to create an automated time series classification algorithm. Methods: EMGs of healthy controls (HC, n = 25), patients with amyotrophic lateral sclerosis (ALS, n = 20) and inclusion body myositis (IBM, n = 20), were retrospectively selected based on longitudinal clinical follow-up data (ALS and HC) or muscle biopsy (IBM). A machine learning pipeline was applied based on 5-second EMG fragments of each muscle. Diagnostic yield expressed as area under the curve (AUC) of a receiver-operator characteristics curve, accuracy, sensitivity, and specificity were determined per muscle (muscle-level) and per patient (patient-level). Results: Diagnostic yield of the classification ALS vs. HC was: AUC 0.834 ± 0.014 at muscle-level and 0.856 ± 0.009 at patient-level. For the classification HC vs. IBM, AUC was 0.744 ± 0.043 at muscle-level and 0.735 ± 0.029 at patient-level. For the classification ALS vs. IBM, AUC was 0.569 ± 0.024 at muscle-level and 0.689 ± 0.035 at patient-level. Conclusions: An automated time series classification algorithm can distinguish EMGs from healthy individuals from those of patients with ALS with a high diagnostic yield. Using longer EMG fragments with different levels of muscle activation may improve performance.
The learning of software design is known to be a difficult and challenging task for students. This dissertation studies different didactic approaches for learning software design to improve the way we teach students software design. The research in the dissertation questions whether we can assess software design skills, what guidance is needed for the improvement of students’ understanding of software design and how to motivate and engage students for learning software design. The research explores the following: an instrument for measuring software design skills based on design principles, the gamification of learning software design, revealing students’ software design strategies, the use of peer-reflection for uncovering the difficulties students have during software design tasks, the use of teaching assistants as a bridge between the lecturer and the students, the automation of grading software designs with machine learning, guiding feedback by a pedagogical agent and a workshop for engaging students in the process of software development. The research contributes to the future education of software design.
Inverse problems are problems where we want to estimate the values of certain parameters of a system given observations of the system. Such problems occur in several areas of science and engineering. Inverse problems are often ill-posed, which means that the observations of the system do not uniquely define the parameters we seek to estimate, or that the solution is highly sensitive to small changes in the observation. In order to solve such problems, therefore, we need to make use of additional knowledge about the system at hand. One such piece of prior information is given by the notion of sparsity. Sparsity refers to the knowledge that the solution to the inverse problem can be expressed as a combination of a few terms. The sparsity of a solution can be controlled explicitly or implicitly. An explicit way to induce sparsity is to minimize the number of non-zero terms in the solution. Implicit use of sparsity can be made, for example, by making adjustments to the algorithm used to arrive at the solution. In this thesis we studied various inverse problems that arise in different application areas, such as tomographic imaging and equation learning for biology, and showed how ideas of sparsity can be used in each case to design effective algorithms to solve such problems.
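To make the idea of sparsity concrete, the following Python sketch shows two standard thresholding operators used in sparse estimation. Hard thresholding keeps only a fixed number of non-zero terms, which corresponds to limiting the number of non-zero terms directly; soft thresholding is the proximal operator of the L1 norm, a common convex surrogate for that count. Both are textbook building blocks chosen here for illustration, and the function names are hypothetical; the abstract does not say which algorithms the thesis uses.

```python
import math


def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the rest to zero."""
    if k <= 0:
        return [0.0] * len(x)
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [float(v) if i in keep else 0.0 for i, v in enumerate(x)]


def soft_threshold(x, lam):
    """Shrink every entry of x toward zero by lam (proximal operator of lam * L1 norm)."""
    # Entries with magnitude at most lam become zero; the others keep their sign.
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in x]
```

For example, `hard_threshold([3, -1, 0.5, -4], 2)` retains only the entries 3 and -4, while `soft_threshold` with `lam = 1.0` zeroes every entry whose magnitude is at most 1 and reduces the magnitude of the others by 1.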
The societal burden of spinal conditions is vast and continues to grow with the increasing prevalence of patients with spinal degenerative disease, spinal metastases, and spinal infections. Recent applications of artificial intelligence in healthcare have shown great promise, and similar extensions in spine surgery may improve decision-making. The purpose of this thesis was to examine the utility of predictive analytics and natural language processing in spine surgery.
Background: Timely identification of deteriorating COVID-19 patients is needed to guide changes in clinical management and admission to intensive care units (ICUs). There is significant concern that widely used early warning scores (EWSs) underestimate illness severity in COVID-19 patients; we therefore developed an early warning model specifically for COVID-19 patients. Methods: We retrospectively collected electronic medical record data to extract predictors and used these to fit a random forest model. To simulate the situation in which the model would have been developed after the first and implemented during the second COVID-19 'wave' in the Netherlands, we performed a temporal validation by splitting all included patients into groups admitted before and after August 1, 2020. Furthermore, we propose a method for dynamic model updating to retain model performance over time. We evaluated model discrimination and calibration, performed a decision curve analysis, and quantified the importance of predictors using SHapley Additive exPlanations values. Results: We included 3514 COVID-19 patient admissions from six Dutch hospitals between February 2020 and May 2021, and included a total of 18 predictors for model fitting. The model showed a higher discriminative performance in terms of partial area under the receiver operating characteristic curve (0.82 [0.80-0.84]) compared to the National early warning score (0.72 [0.69-0.74]) and the Modified early warning score (0.67 [0.65-0.69]), a greater net benefit over a range of clinically relevant model thresholds, and relatively good calibration (intercept = 0.03 [-0.09 to 0.14], slope = 0.79 [0.73-0.86]).
Conclusions: This study shows the potential benefit of moving from early warning models for the general inpatient population to models for specific patient groups. Further (independent) validation of the model is needed.
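The temporal validation described in this abstract can be sketched in a few lines of Python: admissions before a cutoff date form the development set, and later admissions form the validation set. The record layout (a dictionary with an `admitted` date), the function name, and the choice to place admissions on the cutoff day itself in the validation set are assumptions for illustration; the abstract gives only the cutoff date of August 1, 2020.

```python
from datetime import date

# Cutoff date taken from the abstract.
CUTOFF = date(2020, 8, 1)


def temporal_split(admissions, cutoff=CUTOFF):
    """Split admission records into development (before cutoff) and validation sets."""
    # Hypothetical record layout: each admission is a dict with an "admitted" date.
    development = [a for a in admissions if a["admitted"] < cutoff]
    # Assumption: admissions on the cutoff day itself go to the validation set.
    validation = [a for a in admissions if a["admitted"] >= cutoff]
    return development, validation
```

Splitting by calendar time rather than at random mimics deployment: the model is evaluated only on patients admitted after the period it was fitted on.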
Background: While the Glasgow coma scale (GCS) is one of the strongest outcome predictors, the current classification of traumatic brain injury (TBI) as 'mild', 'moderate' or 'severe' based on it fails to capture enormous heterogeneity in pathophysiology and treatment response. We hypothesized that data-driven characterization of TBI could identify distinct endotypes and give mechanistic insights. Methods: We developed an unsupervised statistical clustering model based on a mixture of probabilistic graphs for presentation (<24 h) demographic, clinical, physiological, laboratory and imaging data to identify subgroups of TBI patients admitted to the intensive care unit in the CENTER-TBI dataset (N = 1,728). A cluster similarity index was used for robust determination of optimal cluster number. Mutual information was used to quantify feature importance and for cluster interpretation. Results: Six stable endotypes were identified with distinct GCS and composite systemic metabolic stress profiles, distinguished by GCS, blood lactate, oxygen saturation, serum creatinine, glucose, base excess, pH, arterial partial pressure of carbon dioxide, and body temperature. Notably, a cluster with 'moderate' TBI (by traditional classification) and deranged metabolic profile had a worse outcome than a cluster with 'severe' GCS and a normal metabolic profile. Addition of cluster labels significantly improved the prognostic precision of the IMPACT (International Mission for Prognosis and Analysis of Clinical trials in TBI) extended model, for prediction of both unfavourable outcome and mortality (both p < 0.001). Conclusions: Six stable and clinically distinct TBI endotypes were identified by probabilistic unsupervised clustering.
In addition to presenting neurology, a profile of biochemical derangement was found to be an important distinguishing feature that was both biologically plausible and associated with outcome. Our work motivates refining current TBI classifications with factors describing metabolic stress. Such data-driven clusters suggest TBI endotypes that merit investigation to identify bespoke treatment strategies to improve care.
Aims: Pulmonary arterial hypertension (PAH) is a rare but serious disease associated with high mortality if left untreated. This study aims to assess the prognostic cardiac magnetic resonance (CMR) features in PAH using machine learning. Methods and results: Seven hundred and twenty-three consecutive treatment-naive PAH patients were identified from the ASPIRE registry; 516 were included in the training, and 207 in the validation cohort. A multilinear principal component analysis (MPCA)-based machine learning approach was used to extract mortality and survival features throughout the cardiac cycle. The features were overlaid on the original imaging using thresholding and clustering of high- and low-risk of mortality prediction values. The 1-year mortality rate in the validation cohort was 10%. Univariable Cox regression analysis of the combined short-axis and four-chamber MPCA-based predictions was statistically significant (hazard ratios: 2.1, 95% CI: 1.3, 3.4, c-index = 0.70, P = 0.002). The MPCA features improved the 1-year mortality prediction of REVEAL from c-index = 0.71 to 0.76 (P ≤ 0.001). Abnormalities in the end-systolic interventricular septum and end-diastolic left ventricle indicated the highest risk of mortality. Conclusion: The MPCA-based machine learning is an explainable time-resolved approach that allows visualization of prognostic cardiac features throughout the cardiac cycle at the population level, making this approach transparent and clinically interpretable. In addition, the added prognostic value over the REVEAL risk score and CMR volumetric measurements allows for a more accurate prediction of 1-year mortality risk in PAH.
Large and complex data sets are increasingly available for research in critical care. To analyze these data, researchers use techniques commonly referred to as statistical learning or machine learning (ML). The latter is known for large successes in the field of diagnostics, for example, by identification of radiological anomalies. In other research areas, such as clustering and prediction studies, there is more discussion regarding the benefit and efficiency of ML techniques compared with statistical learning. In this viewpoint, we aim to explain commonly used statistical learning and ML techniques and provide guidance for responsible use in the case of clustering and prediction questions in critical care. Clustering studies have been increasingly popular in critical care research, aiming to inform how patients can be characterized, classified, or treated differently. An important challenge for clustering studies is to ensure and assess generalizability. This limits the application of findings in these studies toward individual patients. In the case of predictive questions, there is much discussion as to what algorithm should be used to most accurately predict outcome. Aspects that determine usefulness of ML, compared with statistical techniques, include the volume of the data, the dimensionality of the preferred model, and the extent of missing data. There are areas in which modern ML methods may be preferred. However, efforts should be made to implement statistical frameworks (e.g., for dealing with missing data or measurement error, both omnipresent in clinical data) in ML methods. To conclude, there are important opportunities but also pitfalls to consider when performing clustering or predictive studies with ML techniques.
We advocate careful valuation of new data-driven findings. More interaction is needed between the engineer mindset of experts in ML methods, the insight in bias of epidemiologists, and the probabilistic thinking of statisticians to extract as much information and knowledge from data as possible, while avoiding harm.
Wall, H.E.C. van der; Hassing, G.J.; Doll, R.J.; Westen, G.J.P. van; Cohen, A.F.; Selder, J.L.; ... ; Gal, P. 2022
Objective: The aim of the present study was to develop a neural network to characterize the effect of aging on the ECG in healthy volunteers. Moreover, the impact of the various ECG features on aging was evaluated. Methods & results: A total of 6228 healthy subjects without structural heart disease were included in this study. A neural network regression model was created to predict age of the subjects based on their ECG; 577 parameters derived from a 12‑lead ECG of each subject were used to develop and validate the neural network. A tenfold cross-validation was performed, using 118 subjects for validation each fold. Using SHapley Additive exPlanations values, the impact of the individual features on the prediction of age was determined. Of 6228 subjects tested, 1808 (29%) were females and mean age was 34 years, range 18–75 years. Physiologic age was estimated as a continuous variable with an average error of 6.9 ± 5.6 years (R² = 0.72 ± 0.04). The correlation was slightly stronger for men (R² = 0.74) than for women (R² = 0.66). The most important features for the prediction of physiologic age were T wave morphology indices in leads V4 and V5, and P wave amplitude in leads AVR and II. Conclusion: The application of machine learning to the ECG using a neural network regression model allows accurate estimation of physiologic cardiac age. This technique could be used to pick up subtle age-related cardiac changes, but also to estimate the reversing of these age-associated effects by administered treatments.
Background and aims: Accurate classification of plaque composition is essential for treatment planning. Intravascular ultrasound (IVUS) has limited efficacy in assessing tissue types, while near-infrared spectroscopy (NIRS) provides complementary information to IVUS but lacks depth information. The aim of this study is to train and assess the efficacy of a machine learning classifier for plaque component classification that relies on IVUS echogenicity and NIRS-signal, using histology as reference standard. Methods: Matched NIRS-IVUS and histology images from 15 cadaveric human coronary arteries were analyzed (10 vessels were used for training and 5 for testing). Fibrous/pathological intimal thickening (F-PIT), early necrotic core (ENC), late necrotic core (LNC), and calcific tissue regions-of-interest were detected on histology and superimposed onto IVUS frames. The pixel intensities of these tissue types from the training set were used to train a J48 classifier for plaque characterization (ECHO-classification). To aid differentiation of F-PIT from necrotic cores, the NIRS-signal was used to classify non-calcific pixels outside yellow-spot regions as F-PIT (ECHO-NIRS classification). The performance of ECHO and ECHO-NIRS classifications was validated against histology. Results: 262 matched frames were included in the analysis (162 constituted the training set and 100 the test set). The pixel intensities of F-PIT and ENC were similar and thus these two tissues could not be differentiated by echogenicity. With ENC and LNC as a single class, ECHO-classification showed good agreement with histology for detecting calcific and F-PIT tissues but had poor efficacy for necrotic cores (recall 0.59 and precision 0.29).
Similar results were found when F-PIT and ENC were treated as a single class (recall and precision for LNC 0.78 and 0.33, respectively). ECHO-NIRS classification improved necrotic core and LNC detection, resulting in an increase of the overall accuracy of both models, from 81.4% to 91.8%, and from 87.9% to 94.7%, respectively. Comparable performance of the two models was seen in the test set where the overall accuracy of ECHO-NIRS classification was 95.0% and 95.5%, respectively. Conclusions: The combination of echogenicity with NIRS-signal appears capable of overcoming limitations of echogenicity, enabling more accurate characterization of plaque components.
The aim of this thesis is to determine diagnostic performance of machine learning in differentiating between atypical cartilaginous tumor (ACT) and high-grade chondrosarcoma (CS) based on radiomic features derived from magnetic resonance imaging (MRI) and computed tomography (CT). In chapter 2, the concept of radiomics of musculoskeletal sarcomas is introduced and a systematic review on radiomic feature reproducibility and validation strategies is conducted. In chapter 3, a preliminary study is performed to investigate the performance of MRI radiomics-based machine learning in discriminating ACT from high-grade CS, using a single-center cohort, in comparison with an expert radiologist. In chapter 4, the influence of interobserver segmentation variability on the reproducibility of CT and MRI radiomic features of cartilaginous bone tumors is assessed. In chapter 5, the performance of CT radiomics-based machine learning in discriminating ACT from high-grade CS of long bones is determined and validated using independent data from a multicenter cohort, compared to an expert radiologist. In chapter 6, the performance of MRI radiomics-based machine learning in differentiating between ACT and grade II CS of long bones is determined and validated using independent data from a multicenter cohort, in comparison with an expert radiologist. Finally, in chapter 7, the main results and implications of this thesis are summarized and discussed.
Despite improved surgical and adjuvant treatment options, malignant brain tumors remain non-curable to date. The thin line between treatment effectiveness and patient harm underpins the importance of tailoring clinical management to the individual brain tumor patient. Over the past decades, the volume and complexity of clinically derived patient data (i.e., imaging, genomics, free-text etc.) have increased exponentially. Machine learning provides a vast range of algorithms that can learn from this data and guide clinical decision-making by providing accurate patient-level predictions. The current thesis describes several studies along the continuum of the machine learning spectrum as it applies to neurosurgical oncology. Part I investigates postoperative complications and risk factors in patients operated for a primary malignant brain tumor. Part II describes the development of a model for the prediction of individual-patient survival in glioblastoma patients. Part III encompasses the development of a natural language processing framework for automated medical text analysis. Machine learning algorithms should be considered as an extension to statistical approaches and exist along a continuum determined by how much is specified by humans and how much is learnt by the machine. Although machine learning algorithms can produce highly accurate predictions based on high-dimensional data, clinicians and researchers should interpret the clinical implications of these predictions on a case-by-case basis.