This thesis looks at Artificial Intelligence (AI) and its potential to revolutionise the healthcare sector. The first part of this thesis focuses on the responsible development and validation of AI-based clinical prediction algorithms, exploring the prime considerations in this process. The second part of this thesis addresses the opportunities for classical statistics and machine learning techniques for developing prediction algorithms. It also examines the performance, potential, and challenges of AI prediction algorithms for clinical practice. The conclusion states that cross-discipline collaboration, exchangeability of knowledge and results, and validation of AI for healthcare practice are essential for realising the potential of AI in healthcare.
Hond, A.A.H. de; Shah, V.B.; Kant, I.M.J.; Calster, B. van; Steyerberg, E.W.; Hernandez-Boussard, T. 2023
The generalizability of predictive algorithms is of key relevance to application in clinical practice. We provide an overview of three types of generalizability, based on existing literature: temporal, geographical, and domain generalizability. These generalizability types are linked to their associated goals, methodology, and stakeholders.
OBJECTIVES: Many machine learning (ML) models have been developed for application in the ICU, but few models have been subjected to external validation. The performance of these models in new settings therefore remains unknown. The objective of this study was to assess the performance of an existing decision support tool based on a ML model predicting readmission or death within 7 days after ICU discharge before, during, and after retraining and recalibration. DESIGN: A gradient boosted ML model was developed and validated on electronic health record data from 2004 to 2021. We performed an independent validation of this model on electronic health record data from 2011 to 2019 from a different tertiary care center. SETTING: Two ICUs in tertiary care centers in The Netherlands. PATIENTS: Adult patients who were admitted to the ICU and stayed for longer than 12 hours. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: We assessed discrimination by area under the receiver operating characteristic curve (AUC) and calibration (slope and intercept). We retrained and recalibrated the original model and assessed performance via a temporal validation design. The final retrained model was cross-validated on all data from the new site. Readmission or death within 7 days after ICU discharge occurred in 577 of 10,052 ICU admissions (5.7%) at the new site. External validation revealed moderate discrimination with an AUC of 0.72 (95% CI 0.67–0.76). Retrained models showed improved discrimination with AUC 0.79 (95% CI 0.75–0.82) for the final validation model.
Calibration was poor initially and good after recalibration via isotonic regression. CONCLUSIONS: In this era of expanding availability of ML models, external validation and retraining are key steps to consider before applying ML models to new settings. Clinicians and decision-makers should take this into account when considering applying new ML models to their local settings.
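As a minimal sketch of the discrimination measure named in the abstract above, the AUC can be read as the probability that a randomly chosen admission with the outcome receives a higher predicted risk than a randomly chosen admission without it. The function name `auc` and the toy risks and outcomes are illustrative assumptions, not the study's code or data.

```python
def auc(risks, outcomes):
    """Concordance probability over all (event, non-event) pairs.

    A pair counts as 1 when the event has the higher predicted risk
    and as 0.5 when the two risks are tied.
    """
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and outcomes (1 = readmission or death).
toy_risks = [0.02, 0.10, 0.05, 0.30, 0.08, 0.20]
toy_outcomes = [0, 0, 0, 1, 1, 0]
toy_auc = auc(toy_risks, toy_outcomes)
```

The pairwise loop is quadratic in the number of admissions, so it only suits small illustrations; a rank-based formulation would be the usual choice for a cohort of this size.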
Meijden, S.L. van der; Hond, A.A.H. de; Thoral, P.J.; Steyerberg, E.W.; Kant, I.M.J.; Cinà, G.; Arbous, M.S. 2023
Background: Artificial intelligence–based clinical decision support (AI-CDS) tools have great potential to benefit intensive care unit (ICU) patients and physicians. There is a gap between the development and implementation of these tools. Objective: We aimed to investigate physicians’ perspectives and their current decision-making behavior before implementing a discharge AI-CDS tool for predicting readmission and mortality risk after ICU discharge. Methods: We conducted a survey of physicians involved in decision-making on discharge of patients at two Dutch academic ICUs between July and November 2021. Questions were divided into four domains: (1) physicians’ current decision-making behavior with respect to discharging ICU patients, (2) perspectives on the use of AI-CDS tools in general, (3) willingness to incorporate a discharge AI-CDS tool into daily clinical practice, and (4) preferences for using a discharge AI-CDS tool in daily workflows. Results: Most of the 64 respondents (of 93 contacted, 69%) were familiar with AI (62/64, 97%) and had positive expectations of AI, with 55 of 64 (86%) believing that AI could support them in their work as a physician. The respondents disagreed on whether the decision to discharge a patient was complex (23/64, 36% agreed and 22/64, 34% disagreed); nonetheless, most (59/64, 92%) agreed that a discharge AI-CDS tool could be of value.
Significant differences were observed between physicians from the 2 academic sites, which may be related to different levels of involvement in the development of the discharge AI-CDS tool. Conclusions: ICU physicians showed a favorable attitude toward the integration of AI-CDS tools into the ICU setting in general, and in particular toward a tool to predict a patient’s risk of readmission and mortality within 7 days after discharge. The findings of this questionnaire will be used to improve the implementation process and training of end users.
Early detection of severe asthma exacerbations through home monitoring data in patients with stable mild-to-moderate chronic asthma could help to adjust medication in a timely manner. We evaluated the potential of machine learning methods compared to a clinical rule and logistic regression to predict severe exacerbations. We used daily home monitoring data from two studies in asthma patients (development: n = 165 and validation: n = 101 patients). Two ML models (XGBoost, one-class SVM) and a logistic regression model provided predictions based on peak expiratory flow and asthma symptoms. These models were compared with an asthma action plan rule. Severe exacerbations occurred in 0.2% of all daily measurements in the development (154/92,787 days) and validation cohorts (94/40,185 days). In the validation cohort, the AUC was 0.85 (0.82-0.87) for the best-performing XGBoost model and 0.88 (0.86-0.90) for logistic regression. The XGBoost model provided overly extreme risk estimates, whereas the logistic regression underestimated predicted risks. Sensitivity and specificity were better overall for XGBoost and logistic regression compared to one-class SVM and the clinical rule. We conclude that ML models did not beat logistic regression in predicting short-term severe asthma exacerbations based on home monitoring data. Clinical application remains challenging in settings with low event incidence and high false alarm rates with high sensitivity.
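The closing remark about false alarms at low event incidence follows from Bayes' rule, which a short worked calculation can make concrete. In the sketch below, the prevalence is the validation-cohort figure quoted in the abstract (94 of 40,185 days). The sensitivity and specificity of 0.90 are hypothetical values chosen only for illustration, and `positive_predictive_value` is an illustrative name rather than anything from the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of alarms that are true events, by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no alarms are raised with these inputs")
    return true_pos / (true_pos + false_pos)


# Event rate per monitored day in the validation cohort (from the abstract).
prevalence = 94 / 40185

# Hypothetical operating point: 90% sensitivity and 90% specificity.
ppv = positive_predictive_value(0.90, 0.90, prevalence)

# Number of false alarms expected for each true alarm at that point.
false_alarms_per_true_alarm = (1.0 - ppv) / ppv
```

Under these assumed values only about 2% of alarms correspond to a true exacerbation, which is roughly 47 false alarms per true alarm, even though both sensitivity and specificity look high in isolation.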
While the opportunities of ML and AI in healthcare are promising, the growth of complex data-driven prediction models requires careful quality and applicability assessment before they are applied and disseminated in daily practice. This scoping review aimed to identify actionable guidance for those closely involved in AI-based prediction model (AIPM) development, evaluation and implementation including software engineers, data scientists, and healthcare professionals and to identify potential gaps in this guidance. We performed a scoping review of the relevant literature providing guidance or quality criteria regarding the development, evaluation, and implementation of AIPMs using a comprehensive multi-stage screening strategy. PubMed, Web of Science, and the ACM Digital Library were searched, and AI experts were consulted. Topics were extracted from the identified literature and summarized across the six phases at the core of this review: (1) data preparation, (2) AIPM development, (3) AIPM validation, (4) software development, (5) AIPM impact assessment, and (6) AIPM implementation into daily healthcare practice. From 2683 unique hits, 72 relevant guidance documents were identified. Substantial guidance was found for data preparation, AIPM development and AIPM validation (phases 1-3), while later phases clearly have received less attention (software development, impact assessment and implementation) in the scientific literature. The six phases of the AIPM development, evaluation and implementation cycle provide a framework for responsible introduction of AI-based prediction models in healthcare. Additional domain and technology specific research may be necessary and more practical experience with implementing AIPMs is needed to support further guidance.
Wong, A.K.I.; Charpignon, M.; Kim, H.; Josef, C.; Hond, A.A.H. de; Fojas, J.J.; ... ; Celi, L.A. 2021
IMPORTANCE Discrepancies in oxygen saturation measured by pulse oximetry (Spo2), when compared with arterial oxygen saturation (Sao2) measured by arterial blood gas (ABG), may differentially affect patients according to race and ethnicity. However, the association of these disparities with health outcomes is unknown. OBJECTIVE To examine racial and ethnic discrepancies between Sao2 and Spo2 measures and their associations with clinical outcomes. DESIGN, SETTING, AND PARTICIPANTS This multicenter, retrospective, cross-sectional study included 3 publicly available electronic health record (EHR) databases (ie, the Electronic Intensive Care Unit-Clinical Research Database and Medical Information Mart for Intensive Care III and IV) as well as Emory Healthcare (2014-2021) and Grady Memorial (2014-2020) databases, spanning 215 hospitals and 382 ICUs.
From 141 600 hospital encounters with recorded ABG measurements, 87 971 participants with first ABG measurements and an Spo2 of at least 88% within 5 minutes before the ABG test were included. EXPOSURES Patients with hidden hypoxemia (ie, Spo2 ≥88% but Sao2 <88%). MAIN OUTCOMES AND MEASURES Outcomes, stratified by race and ethnicity, were Sao2 for each Spo2, hidden hypoxemia prevalence, initial demographic characteristics (age, sex), clinical outcomes (in-hospital mortality, length of stay), organ dysfunction by scores (Sequential Organ Failure Assessment [SOFA]), and laboratory values (lactate and creatinine levels) before and 24 hours after the ABG measurement. RESULTS The first Spo2-Sao2 pairs from 87 971 patient encounters (27 713 [42.9%] women; mean [SE] age, 62.2 [17.0] years; 1919 [2.3%] Asian patients; 26 032 [29.6%] Black patients; 2397 [2.7%] Hispanic patients, and 57 632 [65.5%] White patients) were analyzed, with 4859 (5.5%) having hidden hypoxemia. Hidden hypoxemia was observed in all subgroups with varying incidence (Black: 1785 [6.8%]; Hispanic: 160 [6.0%]; Asian: 92 [4.8%]; White: 2822 [4.9%]) and was associated with greater organ dysfunction 24 hours after the ABG measurement, as evidenced by higher mean (SE) SOFA scores (7.2 [0.1] vs 6.29 [0.02]) and higher in-hospital mortality (eg, among Black patients: 369 [21.1%] vs 3557 [15.0%]; P < .001). Furthermore, patients with hidden hypoxemia had higher mean (SE) lactate levels before (3.15 [0.09] mg/dL vs 2.66 [0.02] mg/dL) and 24 hours after (2.83 [0.14] mg/dL vs 2.27 [0.02] mg/dL) the ABG test, with less lactate clearance (-0.54 [0.12] mg/dL vs -0.79 [0.03] mg/dL). CONCLUSIONS AND RELEVANCE In this study, there was greater variability in oxygen saturation levels for a given Spo2 level in patients who self-identified as Black, followed by Hispanic, Asian, and White.
Patients with and without hidden hypoxemia were demographically and clinically similar at baseline ABG measurement by SOFA scores, but those with hidden hypoxemia subsequently experienced higher organ dysfunction scores and higher in-hospital mortality.
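The exposure definition in the abstract above (pulse oximetry saturation of at least 88% while the arterial blood gas saturation is below 88%) can be written as a simple rule. The sketch below is illustrative only: the function name `has_hidden_hypoxemia`, the `threshold` parameter, and the paired readings are assumptions, not the study's code or data.

```python
def has_hidden_hypoxemia(pulse_ox_percent, arterial_percent, threshold=88.0):
    """True when pulse oximetry reads at or above the threshold
    while the arterial blood gas saturation is below it."""
    return pulse_ox_percent >= threshold and arterial_percent < threshold


# Hypothetical (pulse oximetry, arterial) saturation pairs in percent.
pairs = [(92.0, 85.0), (95.0, 94.0), (87.0, 84.0), (88.0, 87.9)]
flags = [has_hidden_hypoxemia(p, a) for p, a in pairs]
```

The third pair is not flagged because the pulse oximetry reading is itself below 88%, so the low arterial saturation is not hidden from the clinician.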