Objectives To identify highly ranked features related to clinicians' diagnosis of clinically relevant knee OA. Methods General practitioners (GPs) and secondary care physicians (SPs) were recruited to evaluate 5-10 years follow-up clinical and radiographic data of knees from the CHECK cohort for the presence of clinically relevant OA. GPs and SPs were gathered in pairs; each pair consisted of one GP and one SP, and the paired clinicians independently evaluated the same subset of knees. A diagnosis was made for each knee by the GP and SP before and after viewing radiographic data. Nested 5-fold cross-validation enhanced random forest models were built to identify the top 10 features related to the diagnosis. Results Seventeen clinician pairs evaluated 1106 knees with 139 clinical and 36 radiographic features. GPs diagnosed clinically relevant OA in 42% and 43% knees, before and after viewing radiographic data, respectively. SPs diagnosed in 43% and 51% knees, respectively. Models containing top 10 features had good performance for explaining clinicians' diagnosis with area under the curve ranging from 0.76-0.83. Before viewing radiographic data, quantitative symptomatic features (i.e. WOMAC scores) were the most important ones related to the diagnosis of both GPs and SPs; after viewing radiographic data, radiographic features appeared in the top lists for both, but seemed to be more important for SPs than GPs. Conclusions Random forest models presented good performance in explaining clinicians' diagnosis, which helped to reveal typical features of patients recognized as clinically relevant knee OA by clinicians from two different care settings.
Objectives: Around 30% of patients with RA have an inadequate response to MTX. We aimed to use routine clinical and biological data to build machine learning models predicting EULAR inadequate response to MTX and to identify simple predictive biomarkers. Methods: Models were trained on RA patients fulfilling the 2010 ACR/EULAR criteria from the ESPOIR and Leiden EAC cohorts to predict the EULAR response at 9 months (+/- 6 months). Several models were compared on the training set using the AUROC. The best model was evaluated on an external validation cohort (tREACH). The model's predictions were explained using Shapley values to extract a biomarker of inadequate response. Results: We included 493 therapeutic sequences from ESPOIR, 239 from EAC and 138 from tREACH. The model selected DAS28, lymphocytes, creatininemia, leucocytes, AST, ALT, swollen joint count and corticosteroid co-treatment as predictors. The model reached an AUROC of 0.72 [95% CI (0.63, 0.80)] on the external validation set, where 70% of patients were responders to MTX. Patients predicted as inadequate responders had only a 38% [95% CI (20%, 58%)] chance to respond, and using the algorithm to decide to initiate MTX would decrease the inadequate-response rate from 30% to 23% [95% CI (17%, 29%)]. A biomarker was identified in patients with moderate or high activity (DAS28 > 3.2): patients with a lymphocyte count above 2000 cells/mm³ are significantly less likely to respond. Conclusion: Our study highlights the usefulness of machine learning in unveiling subgroups of inadequate responders to MTX to guide new therapeutic strategies. Further work is needed to validate this approach.
Bogaards, F.A.; Gehrmann, T.; Beekman, M.; Akker, E. ben van den; Rest, O. van de; Hangelbroek, R.W.J.; ... ; Slagboom, P.E. 2022
The response to lifestyle intervention studies is often heterogeneous, especially in older adults. Subtle responses that may represent a health gain for individuals are not always detected by classical health variables, stressing the need for novel biomarkers that detect intermediate changes in metabolic, inflammatory, and immunity-related health. Here, our aim was to develop and validate a molecular multivariate biomarker maximally sensitive to the individual effect of a lifestyle intervention: the Personalized Lifestyle Intervention Status (PLIS). We used ¹H-NMR fasting blood metabolite measurements from before and after the 13-week combined physical and nutritional Growing Old TOgether (GOTO) lifestyle intervention study in combination with a fivefold cross-validation and a bootstrapping method to train a separate PLIS score for men and women. The PLIS scores consisted of 14 and four metabolites for females and males, respectively. Performance of the PLIS score in tracking health gain was illustrated by association of the sex-specific PLIS scores with several classical metabolic health markers, such as BMI, trunk fat%, fasting HDL cholesterol, and fasting insulin, the primary outcome of the GOTO study. We also showed that the baseline PLIS score indicated which participants respond positively to the intervention. Finally, we explored PLIS in an independent physical activity lifestyle intervention study, showing similar, albeit remarkably weaker, associations of PLIS with classical metabolic health markers. To conclude, we found that the sex-specific PLIS score was able to track the individual short-term metabolic health gain of the GOTO lifestyle intervention study.
The methodology used to train the PLIS score potentially provides a useful instrument to track personal responses and predict the participant's health benefit in lifestyle interventions similar to the GOTO study.
Maleki, G.; Zhuparris, A.; Koopmans, I.; Doll, R.J.; Voet, N.; Cohen, A.; ... ; Maeyer, J. de 2022
Background: Facioscapulohumeral dystrophy (FSHD) is a progressive muscle dystrophy disorder leading to significant disability. Currently, FSHD symptom severity is assessed by clinical assessments such as the FSHD clinical score and the Timed Up-and-Go test. These assessments are limited in their ability to capture changes continuously and the full impact of the disease on patients' quality of life. Real-world data related to physical activity, sleep, and social behavior could potentially provide additional insight into the impact of the disease and might be useful in assessing treatment effects on aspects that are important contributors to the functioning and well-being of patients with FSHD. Objective: This study investigated the feasibility of using smartphones and wearables to capture symptoms related to FSHD based on a continuous collection of multiple features, such as the number of steps, sleep, and app use. We also identified features that can be used to differentiate between patients with FSHD and non-FSHD controls. Methods: In this exploratory noninterventional study, 58 participants (n=38, 66%, patients with FSHD and n=20, 34%, non-FSHD controls) were monitored using a smartphone monitoring app for 6 weeks. On the first and last day of the study period, clinicians assessed the participants' FSHD clinical score and Timed Up-and-Go test time. Participants installed the app on their Android smartphones, were given a smartwatch, and were instructed to measure their weight and blood pressure on a weekly basis using a scale and blood pressure monitor. The user experience and perceived burden of the app on participants' smartphones were assessed at 6 weeks using a questionnaire.
With the data collected, we sought to identify the behavioral features that were most salient in distinguishing the 2 groups (patients with FSHD and non-FSHD controls) and the optimal time window to perform the classification. Results: Overall, the participants stated that the app was well tolerated, but 67% (39/58) noticed a difference in battery life. Using all 6 weeks of data, we classified patients with FSHD and non-FSHD controls with 93% accuracy, 100% sensitivity, and 80% specificity. We found that the optimal time window for the classification is the first day of data collection and the first week of data collection, which yielded an accuracy, sensitivity, and specificity of 95.8%, 100%, and 94.4%, respectively. Features relating to smartphone acceleration, app use, location, physical activity, sleep, and call behavior were the most salient features for the classification. Conclusions: Remotely monitored data collection allowed for the collection of daily activity data in patients with FSHD and non-FSHD controls for 6 weeks. We demonstrated the initial ability to detect differences in features in patients with FSHD and non-FSHD controls using smartphones and wearables, mainly based on data related to physical and social activity.
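The 6-week figures in the abstract above can be checked with a short calculation. The sketch below is illustrative only: the confusion-matrix cell counts are not reported in the abstract and are inferred here from the group sizes (38 patients, 20 controls) and the stated 100% sensitivity and 80% specificity. The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # share of patients correctly flagged
    specificity = tn / (tn + fp)          # share of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Assumed counts, inferred from the abstract rather than reported in it:
# 100% sensitivity on 38 patients -> 38 true positives, 0 false negatives;
# 80% specificity on 20 controls  -> 16 true negatives, 4 false positives.
accuracy, sensitivity, specificity = classification_metrics(tp=38, fn=0, tn=16, fp=4)

print(f"accuracy={accuracy:.1%}, sensitivity={sensitivity:.0%}, specificity={specificity:.0%}")
```

Under these assumed counts the accuracy is 54/58, which rounds to the 93% stated in the abstract.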
Assadi, H.; Alabed, S.; Maiter, A.; Salehi, M.; Li, R.; Ripley, D.P.; ... ; Garg, P. 2022
Background and Objectives: Interest in artificial intelligence (AI) for outcome prediction has grown substantially in recent years. However, the prognostic role of AI using advanced cardiac magnetic resonance imaging (CMR) remains unclear. This systematic review assesses the existing literature on AI in CMR to predict outcomes in patients with cardiovascular disease. Materials and Methods: Medline and Embase were searched for studies published up to November 2021. Any study assessing outcome prediction using AI in CMR in patients with cardiovascular disease was eligible for inclusion. All studies were assessed for compliance with the Checklist for Artificial Intelligence in Medical Imaging (CLAIM). Results: A total of 5 studies were included, with a total of 3679 patients, with 225 deaths and 265 major adverse cardiovascular events. Three methods demonstrated high prognostic accuracy: (1) three-dimensional motion assessment model in pulmonary hypertension (hazard ratio (HR) 2.74, 95%CI 1.73-4.34, p < 0.001), (2) automated perfusion quantification in patients with coronary artery disease (HR 2.14, 95%CI 1.58-2.90, p < 0.001), and (3) automated volumetric, functional, and area assessment in patients with myocardial infarction (HR 0.94, 95%CI 0.92-0.96, p < 0.001). Conclusion: There is emerging evidence of the prognostic role of AI in predicting outcomes for three-dimensional motion assessment in pulmonary hypertension, ischaemia assessment by automated perfusion quantification, and automated functional assessment in myocardial infarction.
Background: There has been a rapid increase in the number of Artificial Intelligence (AI) studies of cardiac MRI (CMR) segmentation aiming to automate image analysis. However, advancement and clinical translation in this field depend on researchers presenting their work in a transparent and reproducible manner. This systematic review aimed to evaluate the quality of reporting in AI studies involving CMR segmentation. Methods: MEDLINE and EMBASE were searched for AI CMR segmentation studies in April 2022. Any fully automated AI method for segmentation of cardiac chambers, myocardium or scar on CMR was considered for inclusion. For each study, compliance with the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) was assessed. The CLAIM criteria were grouped into study, dataset, model and performance description domains. Results: 209 studies published between 2012 and 2022 were included in the analysis. Studies were mainly published in technical journals (58%), with the majority (57%) published since 2019. Studies were from 37 different countries, with most from China (26%), the United States (18%) and the United Kingdom (11%). Short axis CMR images were most frequently used (70%), with the left ventricle the most commonly segmented cardiac structure (49%). Median compliance of studies with CLAIM was 67% (IQR 59-73%). Median compliance was highest for the model description domain (100%, IQR 80-100%) and lower for the study (71%, IQR 63-86%), dataset (63%, IQR 50-67%) and performance (60%, IQR 50-70%) description domains. Conclusion: This systematic review highlights important gaps in the literature of CMR studies using AI.
We identified key missing items (most strikingly, poor description of patients included in the training and validation of AI models and inadequate model failure analysis) that limit the transparency, reproducibility and hence validity of published AI studies. This review may support closer adherence to established frameworks for reporting standards and presents recommendations for improving the quality of reporting in this field.
Schultes, E.; Roos, M.; Santos, L.O.B.D.; Guizzardi, G.; Bouwman, J.; Hankemeier, T.; ... ; Mons, B. 2022
Although all the technical components supporting fully orchestrated Digital Twins (DT) currently exist, what remains missing is a conceptual clarification and analysis of a more generalized concept of a DT that is made FAIR, that is, universally machine actionable. This methodological overview is a first step toward this clarification. We present a review of previously developed semantic artifacts and how they may be used to compose a higher-order data model referred to here as a FAIR Digital Twin (FDT). We propose an architectural design to compose, store and reuse FDTs supporting data intensive research, with emphasis on privacy by design and their use in GDPR compliant open science.
Background and Objectives: With the current advanced data-driven approach to health care, machine learning is gaining more interest. The current study investigates the added value of machine learning to linear regression in predicting anastomotic leakage and pulmonary complications after upper gastrointestinal cancer surgery. Methods: All patients in the Dutch Upper Gastrointestinal Cancer Audit undergoing curatively intended esophageal or gastric cancer surgeries from 2011 to 2017 were included. Anastomotic leakage was defined as any clinically or radiologically proven anastomotic leakage. Pulmonary complications entailed: pneumonia, pleural effusion, respiratory failure, pneumothorax, and/or acute respiratory distress syndrome. Different machine learning models were tested. Nomograms were constructed using Least Absolute Shrinkage and Selection Operator. Results: Between 2011 and 2017, 4228 patients underwent surgical resection for esophageal cancer, of which 18% developed anastomotic leakage and 30% a pulmonary complication. Of the 2199 patients with surgical resection for gastric cancer, 7% developed anastomotic leakage and 15% a pulmonary complication. In all cases, linear regression had the highest predictive value with the area under the curves varying between 61.9 and 68.0, but the difference with machine learning models did not reach statistical significance. Conclusion: Machine learning models can predict postoperative complications in upper gastrointestinal cancer surgery, but they do not outperform the current gold standard, linear regression.
Background: There is increasing attention on machine learning (ML)-based clinical decision support systems (CDSS), but their added value and pitfalls are very rarely evaluated in clinical practice. We implemented a CDSS to aid general practitioners (GPs) in treating patients with urinary tract infections (UTIs), which are a significant health burden worldwide. Objective: This study aims to prospectively assess the impact of this CDSS on treatment success and change in antibiotic prescription behavior of the physician. In doing so, we hope to identify drivers and obstacles that positively impact the quality of health care practice with ML. Methods: The CDSS was developed by Pacmed, Nivel, and Leiden University Medical Center (LUMC). The CDSS presents the expected outcomes of treatments, using interpretable decision trees as ML classifiers. Treatment success was defined as a subsequent period of 28 days during which no new antibiotic treatment for UTI was needed. In this prospective observational study, 36 primary care practices used the software for 4 months. Furthermore, 29 control practices were identified using propensity score-matching. All analyses were performed using electronic health records from the Nivel Primary Care Database. Patients for whom the software was used were identified in the Nivel database by sequential matching using CDSS use data. We compared the proportion of successful treatments before and during the study within the treatment arm. The same analysis was performed for the control practices and the patient subgroup the software was definitely used for. All analyses, including that of physicians' prescription behavior, were statistically tested using 2-sided z tests with an alpha level of .05.
Results: In the treatment practices, 4998 observations were included before and 3422 observations (of 2423 unique patients) were included during the implementation period. In the control practices, 5044 observations were included before and 3360 observations were included during the implementation period. The proportion of successful treatments increased significantly from 75% to 80% in treatment practices (z=5.47, P<.001). No significant difference was detected in control practices (76% before and 76% during the pilot, z=0.02; P=.98). Of the 2423 patients, we identified 734 (30.29%) in the CDSS use database in the Nivel database. For these patients, the proportion of successful treatments during the study was 83%, a statistically significant difference compared with the 75% of successful treatments before the study in the treatment practices (z=4.95; P<.001). Conclusions: The introduction of the CDSS as an intervention in the 36 treatment practices was associated with a statistically significant improvement in treatment success. We excluded temporal effects and validated the results with the subgroup analysis in patients for whom we were certain that the software was used. This study shows important strengths and points of attention for the development and implementation of an ML-based CDSS in clinical practice. Trial Registration: ClinicalTrials.gov NCT04408976; https://clinicaltrials.gov/ct2/show/NCT04408976
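The before/during comparison in the treatment practices can be illustrated with a two-proportion z test. This is a minimal sketch, assuming the pooled-variance form of the test; the abstract does not state which variant was used, and the function name is hypothetical. Because the inputs are the rounded percentages from the abstract (75% and 80%), the resulting z is only approximately the reported 5.47.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Two-sided z test for the difference between two independent proportions.

    Uses the pooled proportion under the null hypothesis p1 == p2.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Rounded inputs taken from the abstract: 75% of 4998 observations before,
# 80% of 3422 observations during the implementation period.
z, p_value = two_proportion_z(p1=0.75, n1=4998, p2=0.80, n2=3422)
print(f"z={z:.2f}, p={p_value:.2g}")
```

With an alpha level of .05, as in the abstract, a two-sided test rejects the null hypothesis when |z| exceeds roughly 1.96.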
Fairness and bias are crucial concepts in artificial intelligence, yet they are relatively ignored in machine learning applications in clinical psychiatry. We computed fairness metrics and present bias mitigation strategies using a model trained on clinical mental health data. We collected structured data related to the admission, diagnosis, and treatment of patients in the psychiatry department of the University Medical Center Utrecht. We trained a machine learning model to predict future administrations of benzodiazepines on the basis of past data. We found that gender plays an unexpected role in the predictions; this constitutes bias. Using the AI Fairness 360 package, we implemented reweighing and discrimination-aware regularization as bias mitigation strategies, and we explored their implications for model performance. This is the first application of bias exploration and mitigation in a machine learning model trained on real clinical psychiatry data.
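As an illustration of the reweighing idea mentioned in the abstract above, the sketch below computes per-sample weights so that group membership and outcome become statistically independent in the weighted data. It is a plain-Python rendering of the commonly used formula w(g, y) = P(g) P(y) / P(g, y), not the AI Fairness 360 API, and the toy groups and labels are hypothetical, not study data.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Map each (group, label) pair to the weight P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for (g, y) in joint_counts
    }


# Hypothetical toy data: the positive label is more common in group "F" than in "M".
groups = ["F"] * 6 + ["M"] * 4
labels = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
for key in sorted(weights):
    print(key, round(weights[key], 3))


def weighted_positive_rate(group):
    """Weighted share of positive labels within one group."""
    pairs = [(g, y) for g, y in zip(groups, labels) if g == group]
    total = sum(weights[p] for p in pairs)
    positive = sum(weights[p] for p in pairs if p[1] == 1)
    return positive / total


rate_f = weighted_positive_rate("F")
rate_m = weighted_positive_rate("M")
```

In this toy example, over-represented combinations receive weights below 1 and under-represented ones receive weights above 1, so the weighted positive rate is the same in both groups. A downstream classifier would then be trained with these values as sample weights.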
Multi-view data refers to a setting where features are divided into feature sets, for example because they correspond to different sources. Stacked penalized logistic regression (StaPLR) is a recently introduced method that can be used for classification and automatically selecting the views that are most important for prediction. We introduce an extension of this method to a setting where the data has a hierarchical multi-view structure. We also introduce a new view importance measure for StaPLR, which allows us to compare the importance of views at any level of the hierarchy. We apply our extended StaPLR algorithm to Alzheimer's disease classification where different MRI measures have been calculated from three scan types: structural MRI, diffusion-weighted MRI, and resting-state fMRI. StaPLR can identify which scan types and which derived MRI measures are most important for classification, and it outperforms elastic net regression in classification performance.
Objective: To perform a scoping review of imaging-based machine-learning models to predict clinical outcomes and identify biomarkers in patients with PDAC. Summary of Background Data: Patients with PDAC could benefit from better selection for systemic and surgical therapy. Imaging-based machine-learning models may improve treatment selection. Methods: A scoping review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses-scoping review guidelines in the PubMed and Embase databases (inception-October 2020). The review protocol was prospectively registered (open science framework registration: m4cyx). Included were studies on imaging-based machine-learning models for predicting clinical outcomes and identifying biomarkers for PDAC. The primary outcome was model performance. An area under the curve (AUC) of >= 0.75, or a P-value of <= 0.05, was considered adequate model performance. Methodological study quality was assessed using the modified radiomics quality score. Results: After screening 1619 studies, 25 studies with 2305 patients fulfilled the eligibility criteria. All but 1 study were published in 2019 and 2020. Overall, 23/25 studies created models using radiomics features, 1 study quantified vascular invasion on computed tomography, and 1 used histopathological data. Nine models predicted clinical outcomes with AUC measures of 0.78-0.95, and C-indices of 0.65-0.76. Seventeen models identified biomarkers with AUC measures of 0.68-0.95. Adequate model performance was reported in 23/25 studies.
The methodological quality of the included studies was suboptimal, with a median modified radiomics quality score of 7/36. Conclusions: The use of imaging-based machine-learning models to predict clinical outcomes and identify biomarkers in patients with PDAC is increasing rapidly. Although these models mostly have good performance scores, their methodological quality should be improved.
Bokma, W.A.; Zhutovsky, P.; Giltay, E.J.; Schoevers, R.A.; Penninx, B.W.J.H.; Balkom, A.L.J.M. van; ... ; Wingen, G.A. van 2022
Background Disease trajectories of patients with anxiety disorders are highly diverse and approximately 60% remain chronically ill. The ability to predict disease course in individual patients... Show moreBackground Disease trajectories of patients with anxiety disorders are highly diverse and approximately 60% remain chronically ill. The ability to predict disease course in individual patients would enable personalized management of these patients. This study aimed to predict recovery from anxiety disorders within 2 years applying a machine learning approach. Methods In total, 887 patients with anxiety disorders (panic disorder, generalized anxiety disorder, agoraphobia, or social phobia) were selected from a naturalistic cohort study. A wide array of baseline predictors (N = 569) from five domains (clinical, psychological, sociodemographic, biological, lifestyle) were used to predict recovery from anxiety disorders and recovery from all common mental disorders (CMDs: anxiety disorders, major depressive disorder, dysthymia, or alcohol dependency) at 2-year follow-up using random forest classifiers (RFCs). Results At follow-up, 484 patients (54.6%) had recovered from anxiety disorders. RFCs achieved a cross-validated area-under-the-receiving-operator-characteristic-curve (AUC) of 0.67 when using the combination of all predictor domains (sensitivity: 62.0%, specificity 62.8%) for predicting recovery from anxiety disorders. Classification of recovery from CMDs yielded an AUC of 0.70 (sensitivity: 64.6%, specificity: 62.3%) when using all domains. In both cases, the clinical domain alone provided comparable performances. Feature analysis showed that prediction of recovery from anxiety disorders was primarily driven by anxiety features, whereas recovery from CMDs was primarily driven by depression features. 
Conclusions The current study showed moderate performance in predicting recovery from anxiety disorders over a 2-year follow-up for individual patients and indicates that anxiety features are most indicative of anxiety improvement and depression features of improvement in general.
Gitto, S.; Cuocolo, R.; Langevelde, K. van; Sande, M.A.J. van de; Parafioriti, A.; Luzzati, A.; ... ; Bloem, J.L. 2022
Background: Atypical cartilaginous tumour (ACT) and grade II chondrosarcoma (CS2) of long bones are respectively managed with watchful waiting or curettage and wide resection. Preoperatively, imaging diagnosis can be challenging due to interobserver variability and biopsy suffers from sample errors. The aim of this study is to determine diagnostic performance of MRI radiomics-based machine learning in differentiating ACT from CS2 of long bones. Methods: One-hundred-fifty-eight patients with surgically treated and histology-proven cartilaginous bone tumours were retrospectively included at two tertiary bone tumour centres. The training cohort consisted of 93 MRI scans from centre 1 (n=74 ACT; n=19 CS2). The external test cohort consisted of 65 MRI scans from centre 2 (n=45 ACT; n=20 CS2). Bidimensional segmentation was performed on T1-weighted MRI. Radiomic features were extracted. After dimensionality reduction and class balancing in centre 1, a machine-learning classifier (Extra Trees Classifier) was tuned on the training cohort using 10-fold cross-validation and tested on the external test cohort. In centre 2, its performance was compared with an experienced musculoskeletal oncology radiologist using McNemar's test. Findings: After tuning on the training cohort (AUC=0.88), the machine-learning classifier had 92% accuracy (60/65, AUC=0.94) in identifying the lesions in the external test cohort. Its accuracies in correctly classifying ACT and CS2 were 98% (44/45) and 80% (16/20), respectively. The radiologist had 98% accuracy (64/65) with no difference compared to the classifier (p=0.134). Interpretation: Machine learning showed high accuracy in classifying ACT and CS2 of long bones based on MRI radiomic features. Copyright (C) 2021 The Authors. Published by Elsevier B.V.
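As a rough illustration of the figures in the abstract above, the following Python sketch recomputes the reported accuracies from the stated counts and shows how a continuity-corrected McNemar's test could be applied to paired classifier/radiologist readings. The abstract does not report the discordant-pair counts or which variant of McNemar's test was used, so the values of `b` and `c` below are hypothetical, chosen only to be compatible with the reported totals (60/65 vs 64/65 correct); the function names are likewise illustrative.

```python
import math


def accuracy(correct, total):
    """Fraction of correctly classified cases."""
    return correct / total


def mcnemar_cc(b, c):
    """Continuity-corrected McNemar chi-square test on discordant pairs.

    b: cases one reader got right and the other got wrong
    c: cases with the opposite pattern
    Returns (statistic, p_value) using a chi-square with 1 degree of freedom.
    """
    if b + c == 0:
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Counts as reported in the abstract (external test cohort, centre 2)
act_acc = accuracy(44, 45)
cs2_acc = accuracy(16, 20)
overall_acc = accuracy(44 + 16, 45 + 20)
radiologist_acc = accuracy(64, 65)

# HYPOTHETICAL discordant counts (not reported in the abstract):
# b = radiologist correct / classifier wrong, c = the reverse.
stat, p_value = mcnemar_cc(b=4, c=0)

print(f"ACT {act_acc:.1%}, CS2 {cs2_acc:.1%}, overall {overall_acc:.1%}")
print(f"radiologist {radiologist_acc:.1%}, McNemar p = {p_value:.3f}")
```

With these assumed discordant counts the test gives a p-value above 0.05, in line with the reported absence of a significant difference; other splits of the discordant pairs would give different values.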
Purpose: Narcolepsy type-1 (NT1) is a rare chronic neurological sleep disorder with excessive daytime sleepiness (EDS) as the usual first symptom and cataplexy as the pathognomonic symptom. Shortening the NT1 diagnostic delay is the key to reduce disease burden and related low quality of life. Here we investigated the changes of diagnostic delay over the diagnostic years (1990-2018) and the factors associated with the delay in Europe. Patients and Methods: We analyzed 580 NT1 patients (male: 325, female: 255) from 12 European countries using the European Narcolepsy Network database. We combined machine learning and linear mixed-effect regression to identify factors associated with the delay. Results: The mean age at EDS onset and diagnosis of our patients was 20.9 +/- 11.8 (mean +/- standard deviation) and 30.5 +/- 14.9 years old, respectively. Their mean and median diagnostic delay was 9.7 +/- 11.5 years and 5.3 years (interquartile range: 1.7-13.2 years), respectively. We did not find significant differences in the diagnostic delay over years in either the whole dataset or in individual countries, although the delay showed significant differences in various countries. The number of patients with short (<= 2-year) and long (>= 13-year) diagnostic delay equally increased over decades, suggesting that subgroups of NT1 patients with variable disease progression may co-exist. Younger age at cataplexy onset, longer interval between EDS and cataplexy onsets, lower cataplexy frequency, shorter duration of irresistible daytime sleep, lower daytime REM sleep propensity, and being female are associated with longer diagnostic delay.
Conclusion: Our findings contrast with the results of previous studies reporting shorter delay over time, which is confounded by calendar year because they characterized the changes in diagnostic delay over the symptom onset year. Our study indicates that new strategies such as increasing media attention/awareness and developing new biomarkers are needed to better detect EDS, cataplexy, and changes of nocturnal sleep in narcolepsy, in order to shorten the diagnostic interval.