Introduction: Cardiac magnetic resonance (CMR) is of diagnostic and prognostic value in a range of cardiopulmonary conditions. Current methods for evaluating CMR studies are laborious and time-consuming, contributing to delays for patients. As the demand for CMR increases, there is a growing need to automate this process. The application of artificial intelligence (AI) to CMR is promising, but the evaluation of these tools in clinical practice has been limited. This study assessed the clinical viability of an automatic tool for measuring cardiac volumes on CMR. Methods: Consecutive patients who underwent CMR for any indication between January 2022 and October 2022 at a single tertiary centre were included prospectively. For each case, short-axis CMR images were segmented by the AI tool and manually to yield volume, mass and ejection fraction measurements for both ventricles. Automated and manual measurements were compared for agreement and the quality of the automated contours was assessed visually by cardiac radiologists. Results: 462 CMR studies were included. No statistically significant difference was demonstrated between any automated and manual measurements (p > 0.05; independent t-test). Intraclass correlation coefficient and Bland-Altman analysis showed excellent agreement across all metrics (ICC > 0.85). The automated contours were evaluated visually in 251 cases, with agreement or minor disagreement in 229 cases (91.2%) and failed segmentation in only a single case (0.4%). The AI tool was able to provide automated contours in under 90 s. Conclusions: Automated segmentation of both ventricles on CMR by an automatic tool shows excellent agreement with manual segmentation performed by CMR experts in a retrospective real-world clinical cohort. Implementation of the tool could improve the efficiency of CMR reporting and reduce delays between imaging and diagnosis.
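The Bland-Altman analysis mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the conventional definition (mean difference as bias, limits of agreement at bias ± 1.96 standard deviations of the paired differences); the function name `bland_altman` and the sample volumes are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(auto, manual):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(auto) != len(manual) or len(auto) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    # Paired differences: automated minus manual measurement.
    diffs = [a - m for a, m in zip(auto, manual)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical end-diastolic volumes in mL (illustrative only, not study data).
auto_ml = [152.0, 140.0, 171.0, 128.0, 160.0]
manual_ml = [150.0, 143.0, 168.0, 130.0, 158.0]

bias, lower, upper = bland_altman(auto_ml, manual_ml)
print(f"bias={bias:.2f} mL, limits of agreement=({lower:.2f}, {upper:.2f}) mL")
```

A bias close to zero with narrow limits of agreement is what "excellent agreement" would look like in such an analysis; how narrow is acceptable is a clinical judgement.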
Background: Digital triage tools for sexually transmitted infection (STI) testing can potentially be used as a substitute for the triage that general practitioners (GPs) perform to lower their work pressure. The studied tool is based on medical guidelines. The same guidelines support GPs' decision-making process. However, research has shown that GPs make decisions from a holistic perspective and, therefore, do not always adhere to those guidelines. To have a high-quality digital triage tool that results in an efficient care process, it is important to learn more about GPs' decision-making process. Objective: The first objective was to identify whether the advice of the studied digital triage tool aligned with GPs' daily medical practice. The second objective was to learn which factors influence GPs' decisions regarding referral for diagnostic testing. In addition, this study provides insights into GPs' decision-making process. Methods: A qualitative vignette-based study using semistructured interviews was conducted. In total, 6 vignettes representing patient cases were discussed with the participants (GPs). The participants were asked to think aloud about whether they would advise an STI test for the patient and why. A thematic analysis was conducted on the transcripts of the interviews. The vignette patient cases were also passed through the digital triage tool, resulting in advice to test or not for an STI. A comparison was made between the advice of the tool and that of the participants. Results: In total, 10 interviews were conducted. Participants (GPs) had a mean age of 48.30 (SD 11.88) years. For 3 vignettes, the advice of the digital triage tool and of all participants was the same. In those vignettes, the patients' risk factors were sufficiently clear for the participants to advise the same as the digital tool. For 3 vignettes, the advice of the digital tool differed from that of the participants. Patient-related factors that influenced the participants' decision-making process were the patient's anxiety, young age, and willingness to be tested. Participants would test at a lower threshold than the triage tool because of those factors. Sometimes, participants wanted more information than was provided in the vignette or would have liked to conduct a physical examination. These elements were not part of the digital triage tool. Conclusions: The advice to conduct a diagnostic STI test differed between a digital triage tool and GPs. The digital triage tool considered only medical guidelines, whereas GPs were open to discussion, reasoning from a holistic perspective. The GPs' decision-making process was influenced by patients' anxiety, willingness to be tested, and age. On the basis of these results, we believe that the digital triage tool for STI testing could support GPs and even replace consultations in the future. Further research must substantiate how this can be done safely.
To what extent can the model of explanatory dialogue elucidate the interaction between the Doctor and diagnostic AI, i.e., Explainable Artificial intelligence (XAI)? We argue that explanatory dialogue models do not make an optimal framework for analyzing and evaluating the Doctor-XAI interaction. This is because this interaction, typically, does not aim at transferring understanding, a main goal of explanatory dialogue.
Hesselmans, S.; Meiland, F.J.M.; Adam, E.; Cruijs, E. van de; Vonk, A.; Oost, F. van; ... ; Meinders, E.R. 2023
Purpose: People with intellectual disabilities often show challenging behaviour, which can manifest itself in self-harm or aggression towards others. Real-time monitoring of stress in clients with challenging behaviour can help caregivers to promptly deploy interventions to prevent escalations, ultimately to improve the quality of life of client and caregiver. This study aimed to assess the impact of real-time stress monitoring with HUME, and the subsequent interventions deployed by the care team, on stress levels and quality of life. Materials and methods: Real-time stress monitoring was used in 41 clients with intellectual disabilities in a long-term care setting over a period of six months. Stress levels were determined at the start and during the deployment of the stress monitoring system. The quality of life of the client and caregiver was measured with the Outcome Rating Scale at the start and at three months of use. Results: The results showed that the HUME-based interventions resulted in a stress reduction. The perceived quality of life was higher after three months for both the clients and caregivers. Furthermore, interventions to provide proximity were found to be most effective in reducing stress and increasing the client's quality of life. Conclusions: The study demonstrates that real-time stress monitoring with the HUME and the following interventions were effective. There was less stress in clients with an intellectual disability and an increase in the perceived quality of life. Future larger and randomized controlled studies are needed to confirm these findings.
The Banff Digital Pathology Working Group (DPWG) was formed to establish a digital pathology repository; develop, validate, and share models for image analysis; and foster collaborations using regular videoconferencing. During the calls, a variety of artificial intelligence (AI)-based support systems for transplantation pathology were presented. Potential collaborations in a competition/trial on AI applied to kidney transplant specimens, including the DIAGGRAFT challenge (staining of biopsies at multiple institutions, pathologists' visual assessment, and development and validation of new and pre-existing Banff scoring algorithms), were also discussed. To determine the next steps, a survey was conducted, primarily focusing on the feasibility of establishing a digital pathology repository and identifying potential hosts. Sixteen of the 35 respondents (46%) had access to a server hosting a digital pathology repository, and 2 respondents could serve as a potential host at no cost to the DPWG. The 16 digital pathology repositories collected specimens from various organs, with the largest constituent being kidney (n = 12,870 specimens). A DPWG pilot digital pathology repository was established, and there are plans for a competition/trial with the DIAGGRAFT project. Utilizing existing resources and previously established models, the Banff DPWG is establishing new resources for the Banff community.
Vries, S. de; Oost, F. van; Smaling, H.; Knegt, N. de; Cluitmans, P.; Smits, R.; Meinders, E. 2023
People with severe intellectual disabilities (ID) could have difficulty expressing their stress which may complicate timely responses from caregivers. The present study proposes an automatic stress detection system that can work in real-time. The system uses wearable sensors that record physiological signals in combination with machine learning to detect physiological changes related to stress. Four experiments were conducted to assess if the system could detect stress in people with and without ID. Three experiments were conducted with people without ID (n = 14, n = 18, and n = 48), and one observational study was done with people with ID (n = 12). To analyze if the system could detect stress, the performance of random, general, and personalized models was evaluated. The mixed ANOVA found a significant effect for model type, F(2, 134) = 116.50, p < .001. Additionally, the post-hoc t-tests found that the personalized model for the group with ID performed better than the random model, t(11) = 9.05, p < .001. The findings suggest that the personalized model can detect stress in people with and without ID. A larger-scale study is required to validate the system for people with ID.
Objective: Validation of automated 2-dimensional (2D) diameter measurements of vestibular schwannomas on magnetic resonance imaging (MRI). Study Design: Retrospective validation study using 2 data sets containing MRIs of vestibular schwannoma patients. Setting: University Hospital in The Netherlands. Methods: Two data sets were used, 1 containing 1 scan per patient (n = 134) and the other containing at least 3 consecutive MRIs of 51 patients, all with contrast-enhanced T1 or high-resolution T2 sequences. 2D measurements of the maximal extrameatal diameters in the axial plane were automatically derived from a 3D-convolutional neural network and compared with manual measurements by 2 human observers. Intra- and interobserver variabilities were calculated using the intraclass correlation coefficient (ICC), and agreement on tumor progression using Cohen's κ. Results: The human intra- and interobserver variability showed a high correlation (ICC: 0.98-0.99) and limits of agreement of 1.7 to 2.1 mm. Comparing the automated to human measurements resulted in ICC of 0.98 (95% confidence interval [CI]: 0.974; 0.987) and 0.97 (95% CI: 0.968; 0.984), with limits of agreement of 2.2 and 2.1 mm for diameters parallel and perpendicular to the posterior side of the temporal bone, respectively. There was satisfactory agreement on tumor progression between automated measurements and human observers (Cohen's κ = 0.77), better than the agreement between the human observers (Cohen's κ = 0.74). Conclusion: Automated 2D diameter measurements and growth detection of vestibular schwannomas are at least as accurate as human 2D measurements. In clinical practice, measurements of the maximal extrameatal tumor (2D) diameters of vestibular schwannomas provide important complementary information to total tumor volume (3D) measurements. Combining both in an automated measurement algorithm facilitates clinical adoption.
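Cohen's kappa, used in the abstract above to quantify agreement on tumor progression, can be sketched as follows. This is a minimal illustration of the standard formula (observed agreement corrected for chance agreement), not the authors' analysis code; the function name `cohens_kappa` and the example ratings are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical progression calls (1 = progression, 0 = stable), not study data.
algorithm = [1, 1, 0, 0, 1, 0, 1, 0]
observer = [1, 1, 0, 0, 1, 0, 0, 1]

kappa = cohens_kappa(algorithm, observer)
print(f"kappa={kappa:.2f}")
```

In this toy example the raters agree on 6 of 8 scans, while chance agreement is 0.5, so kappa lands well below the raw agreement rate.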
Objectives. Acute myocardial ischemia in the setting of acute coronary syndrome (ACS) may lead to myocardial infarction. Therefore, timely decisions, already in the pre-hospital phase, are crucial to preserving cardiac function as much as possible. Serial electrocardiography, a comparison of the acute electrocardiogram with a previously recorded (reference) ECG of the same patient, aids in identifying ischemia-induced electrocardiographic changes by correcting for interindividual ECG variability. Recently, the combination of deep learning and serial electrocardiography provided promising results in detecting emerging cardiac diseases; thus, the aim of our current study is the application of our novel Advanced Repeated Structuring and Learning Procedure (AdvRS&LP), specifically designed for acute myocardial ischemia detection in the pre-hospital phase by using serial ECG features. Approach. Data belong to the SUBTRACT study, which includes 1425 ECG pairs, 194 (14%) ACS patients, and 1035 (73%) controls. Each ECG pair was characterized by 28 serial features that, with sex and age, constituted the inputs of the AdvRS&LP, an automatic constructive procedure for creating supervised neural networks (NN). We created 100 NNs to compensate for statistical fluctuations due to random data divisions of a limited dataset. We compared the performance of the obtained NNs to a logistic regression (LR) procedure and the Glasgow program (Uni-G) in terms of area-under-the-curve (AUC) of the receiver-operating-characteristic curve, sensitivity (SE), and specificity (SP). Main Results. NNs (median AUC = 83%, median SE = 77%, and median SP = 89%) presented a statistically (P value lower than 0.05) higher testing performance than those presented by LR (median AUC = 80%, median SE = 67%, and median SP = 81%) and by the Uni-G algorithm (median SE = 72% and median SP = 82%). Significance. In conclusion, the positive results underscore the value of serial ECG comparison in ischemia detection, and NNs created by AdvRS&LP seem to be reliable tools in terms of generalization and clinical applicability.
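The three performance measures reported above (AUC, sensitivity, specificity) can be computed as in the following sketch. It assumes binary labels (1 = ACS, 0 = control), a fixed decision threshold of 0.5, and the rank-based (Mann-Whitney) definition of AUC; the function names and the scores are hypothetical and unrelated to the SUBTRACT data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true-positive rate) and specificity (true-negative rate)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Count pairwise wins; ties contribute half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs (illustrative only).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]

se, sp = sensitivity_specificity(labels, predictions)
area = auc(labels, scores)
print(f"SE={se:.2f}, SP={sp:.2f}, AUC={area:.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why the Uni-G algorithm (a fixed decision rule) is reported without an AUC.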
The introduction of the tethered DROP-IN gamma probe has enabled targeted robot-assisted radioguided prostate cancer (PCa) resection of pelvic sentinel lymph nodes (SLNs) and prostate-specific membrane antigen (PSMA)-positive lesions. While both procedures use Tc-99m-isotopes, the two vary in signal and background intensity. To understand how the different levels of image guidance impact surgical decision-making, computer-vision algorithms are used to extract the DROP-IN probe kinematics from clinical videos. 44 PCa patients undergo SLN (25) and PSMA-targeted (19) resections. PSMA-PET/CT and SPECT/CT create preoperative roadmaps, and intraoperative probe signal intensities are recorded. Using neural network-based software, probe trajectories are extracted from videos to derive multiparametric kinematics and generate decision-making and dexterity scores. PSMA-targeted resections yield significantly lower nodal signal intensities in preoperative SPECT/CT scans (three-fold; p = 0.01), intraoperative probe readouts (eight-fold; p < 0.001), and signal-to-background ratios (SBR; two-fold; p < 0.001). Kinematic assessment reveals that the challenges encountered during PSMA-targeted procedures translate into longer target identification times and an increase in probe pick-ups (both five-fold; p < 0.001). This results in a fourfold reduction in the decision-making score (p < 0.001). Reduced signal intensities and intraoperative SBR values negatively affect the impact that image-guided surgery strategies have on the surgical decision-making process.
Bournez, C.; Riool, M.; Boer, L. de; Cordfunke, R.A.; Best, L. de; Leeuwen, R. van; ... ; Westen, G.J.P. van 2023
To combat infection by microorganisms, host organisms possess a primary arsenal via the innate immune system. Part of this arsenal consists of defense peptides with the ability to target a wide range of pathogenic organisms, including bacteria, viruses, parasites, and fungi. Here, we present the development of a novel machine learning model capable of predicting the activity of antimicrobial peptides (AMPs), CalcAMP. AMPs, in particular short ones (<35 amino acids), can become an effective solution to face the multi-drug resistance issue arising worldwide. Whereas finding potent AMPs through classical wet-lab techniques is still a long and expensive process, a machine learning model can help researchers to rapidly identify whether peptides present potential or not. Our prediction model is based on a new data set constructed from the available public data on AMPs and experimental antimicrobial activities. CalcAMP can predict activity against both Gram-positive and Gram-negative bacteria. Different features either concerning general physicochemical properties or sequence composition have been assessed to achieve higher prediction accuracy. CalcAMP can be used as a promising prediction asset to identify short AMPs among given peptide sequences.
In recent years machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goals to evaluate and explore machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following this sample-to-data roadmap helped identify knowledge gaps and define needs. Being able to generate bespoke and realistic synthetic data has legitimate and important uses in system suitability, method development, and algorithm benchmarking, while also posing critical ethical questions. The interdisciplinary nature of the workshop informed discussions of what is currently possible and future opportunities and challenges. In the following perspective we summarize these discussions in the hope of conveying our excitement about the potential of machine learning in proteomics and to inspire future research.
Wal, I. van der; Meijer, F.; Fuica, R.; Silman, Z.; Boon, M.; Martini, C.; ... ; Gozal, Y. 2023
In this pooled analysis of two randomized clinical trials, intraoperative opioid dosing based on the nociception level-index produced less pain compared to standard care with a difference in pain scores in the post-anesthesia care unit of 1.5 (95% CI 0.8-2.2) points on an 11-point scale. The proportion of patients with severe pain was lower by 70%. Severe postoperative pain remains a significant problem and is associated with several adverse outcomes. Here, we determined whether the application of a monitor that detects intraoperative nociceptive events, based on machine learning technology, and treatment of such events reduces pain scores in the post-anesthesia care unit (PACU). To that end, we performed a pooled analysis of two trials in adult patients undergoing elective major abdominal surgery on the effect of intraoperative nociception level monitor (NOL)-guided fentanyl dosing on PACU pain. Patients received NOL-guided fentanyl dosing or standard care (fentanyl dosing based on hemodynamic parameters). The goal of the intervention was to keep NOL at values that indicated absence of nociception. The primary endpoint of the study was the median pain score obtained in the first 90 min in the PACU. Pain scores were collected at 15 min intervals on an 11-point Likert scale. Data from 125 patients (55 men, 70 women, age range 21-86 years) were analyzed. Sixty-one patients received NOL-guided fentanyl dosing and 64 standard care. Median PACU pain score was 1.5 points (0.8-2.2) lower in the NOL group compared to the standard care; the proportion of patients with severe pain was 70% lower in the NOL group (p = 0.045). The only significant factor associated with increased odds for severe pain was the standard of care compared to NOL treatment (OR 6.0, 95% CI 1.4-25.9, p = 0.017). The use of a machine learning-based technology to guide opioid dosing during major abdominal surgery resulted in reduced PACU pain scores with fewer patients in severe pain.
Meijden, S.L. van der; Hond, A.A.H. de; Thoral, P.J.; Steyerberg, E.W.; Kant, I.M.J.; Cinà, G.; Arbous, M.S. 2023
Background: Artificial intelligence–based clinical decision support (AI-CDS) tools have great potential to benefit intensive care unit (ICU) patients and physicians. There is a gap between the development and implementation of these tools. Objective: We aimed to investigate physicians’ perspectives and their current decision-making behavior before implementing a discharge AI-CDS tool for predicting readmission and mortality risk after ICU discharge. Methods: We conducted a survey of physicians involved in decision-making on discharge of patients at two Dutch academic ICUs between July and November 2021. Questions were divided into four domains: (1) physicians’ current decision-making behavior with respect to discharging ICU patients, (2) perspectives on the use of AI-CDS tools in general, (3) willingness to incorporate a discharge AI-CDS tool into daily clinical practice, and (4) preferences for using a discharge AI-CDS tool in daily workflows. Results: Most of the 64 respondents (of 93 contacted, 69%) were familiar with AI (62/64, 97%) and had positive expectations of AI, with 55 of 64 (86%) believing that AI could support them in their work as a physician. The respondents disagreed on whether the decision to discharge a patient was complex (23/64, 36% agreed and 22/64, 34% disagreed); nonetheless, most (59/64, 92%) agreed that a discharge AI-CDS tool could be of value. Significant differences were observed between physicians from the 2 academic sites, which may be related to different levels of involvement in the development of the discharge AI-CDS tool. Conclusions: ICU physicians showed a favorable attitude toward the integration of AI-CDS tools into the ICU setting in general, and in particular toward a tool to predict a patient’s risk of readmission and mortality within 7 days after discharge. The findings of this questionnaire will be used to improve the implementation process and training of end users.
Background and Objectives: Interest in artificial intelligence (AI) for outcome prediction has grown substantially in recent years. However, the prognostic role of AI using advanced cardiac magnetic resonance imaging (CMR) remains unclear. This systematic review assesses the existing literature on AI in CMR to predict outcomes in patients with cardiovascular disease. Materials and Methods: Medline and Embase were searched for studies published up to November 2021. Any study assessing outcome prediction using AI in CMR in patients with cardiovascular disease was eligible for inclusion. All studies were assessed for compliance with the Checklist for Artificial Intelligence in Medical Imaging (CLAIM). Results: A total of 5 studies were included, with a total of 3679 patients, with 225 deaths and 265 major adverse cardiovascular events. Three methods demonstrated high prognostic accuracy: (1) three-dimensional motion assessment model in pulmonary hypertension (hazard ratio (HR) 2.74, 95%CI 1.73-4.34, p < 0.001), (2) automated perfusion quantification in patients with coronary artery disease (HR 2.14, 95%CI 1.58-2.90, p < 0.001), and (3) automated volumetric, functional, and area assessment in patients with myocardial infarction (HR 0.94, 95%CI 0.92-0.96, p < 0.001). Conclusion: There is emerging evidence of the prognostic role of AI in predicting outcomes for three-dimensional motion assessment in pulmonary hypertension, ischaemia assessment by automated perfusion quantification, and automated functional assessment in myocardial infarction.
Background: There has been a rapid increase in the number of Artificial Intelligence (AI) studies of cardiac MRI (CMR) segmentation aiming to automate image analysis. However, advancement and clinical translation in this field depend on researchers presenting their work in a transparent and reproducible manner. This systematic review aimed to evaluate the quality of reporting in AI studies involving CMR segmentation. Methods: MEDLINE and EMBASE were searched for AI CMR segmentation studies in April 2022. Any fully automated AI method for segmentation of cardiac chambers, myocardium or scar on CMR was considered for inclusion. For each study, compliance with the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) was assessed. The CLAIM criteria were grouped into study, dataset, model and performance description domains. Results: 209 studies published between 2012 and 2022 were included in the analysis. Studies were mainly published in technical journals (58%), with the majority (57%) published since 2019. Studies were from 37 different countries, with most from China (26%), the United States (18%) and the United Kingdom (11%). Short axis CMR images were most frequently used (70%), with the left ventricle the most commonly segmented cardiac structure (49%). Median compliance of studies with CLAIM was 67% (IQR 59-73%). Median compliance was highest for the model description domain (100%, IQR 80-100%) and lower for the study (71%, IQR 63-86%), dataset (63%, IQR 50-67%) and performance (60%, IQR 50-70%) description domains. Conclusion: This systematic review highlights important gaps in the literature of CMR studies using AI. We identified key missing items (most strikingly, poor description of patients included in the training and validation of AI models and inadequate model failure analysis) that limit the transparency, reproducibility and hence validity of published AI studies. This review may support closer adherence to established frameworks for reporting standards and presents recommendations for improving the quality of reporting in this field.
For many parasitic diseases, the microscopic examination of clinical samples such as urine and stool still serves as the diagnostic reference standard, primarily because microscopes are accessible and cost-effective. However, conventional microscopy is laborious, requires highly skilled personnel, and is highly subjective. The requirement for skilled operators, coupled with the cost and maintenance needs of the microscopes (maintenance is rarely carried out in endemic countries), severely limits access to the diagnosis of parasitic diseases in resource-limited settings. The urgent requirement for the management of tropical diseases such as schistosomiasis, which is now focused on elimination, has underscored the critical need for access to easy-to-use diagnostics for case detection, community mapping, and surveillance. In this paper, we present a low-cost automated digital microscope, the Schistoscope, which is capable of automatic focusing and scanning regions of interest in prepared microscope slides, and automatic detection of Schistosoma haematobium eggs in captured images. The device was developed using widely accessible distributed manufacturing methods and off-the-shelf components to enable local manufacturability and ease of maintenance. For proof of principle, we created a Schistosoma haematobium egg dataset of over 5000 images captured from spiked and clinical urine samples from field settings and demonstrated the automatic detection of Schistosoma haematobium eggs using a trained deep neural network model. The experiments and results presented in this paper collectively illustrate the robustness, stability, and optical performance of the device, making it suitable for use in the monitoring and evaluation of schistosomiasis control programs in endemic settings.
Fairness and bias are crucial concepts in artificial intelligence, yet they are relatively ignored in machine learning applications in clinical psychiatry. We computed fairness metrics and present bias mitigation strategies using a model trained on clinical mental health data. We collected structured data related to the admission, diagnosis, and treatment of patients in the psychiatry department of the University Medical Center Utrecht. We trained a machine learning model to predict future administrations of benzodiazepines on the basis of past data. We found that gender plays an unexpected role in the predictions; this constitutes bias. Using the AI Fairness 360 package, we implemented reweighing and discrimination-aware regularization as bias mitigation strategies, and we explored their implications for model performance. This is the first application of bias exploration and mitigation in a machine learning model trained on real clinical psychiatry data.
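The reweighing strategy named in the abstract above can be illustrated in plain Python. The sketch assumes the commonly described formulation in which each (group, label) combination receives the weight P(group) × P(label) / P(group, label), so that group and label become statistically independent in the weighted data. It does not call the AI Fairness 360 package, and the function name `reweighing_weights` and the toy records are hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Map each observed (group, label) pair to its reweighing weight."""
    n = len(groups)
    if n == 0 or n != len(labels):
        raise ValueError("groups and labels must be non-empty and equally long")
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    # weight = P(group) * P(label) / P(group, label), written with counts.
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * joint)
        for (g, y), joint in joint_counts.items()
    }


# Hypothetical records: gender and whether a drug was administered (1) or not (0).
gender = ["F", "F", "F", "M", "M", "M", "M", "M"]
administered = [1, 1, 0, 1, 0, 0, 0, 0]

weights = reweighing_weights(gender, administered)
sample_weights = [weights[(g, y)] for g, y in zip(gender, administered)]
for pair, weight in sorted(weights.items()):
    print(pair, round(weight, 4))
```

Over-represented combinations (here, women with a positive label) receive weights below 1 and under-represented ones weights above 1. The per-sample weights could then be passed to any learner that accepts sample weights.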
For many parasitic diseases, the microscopic examination of clinical samples such as urine and stool still serves as the diagnostic reference standard, primarily because microscopes are accessible... Show moreFor many parasitic diseases, the microscopic examination of clinical samples such as urine and stool still serves as the diagnostic reference standard, primarily because microscopes are accessible and cost-effective. However, conventional microscopy is laborious, requires highly skilled personnel, and is highly subjective. Requirements for skilled operators, coupled with the cost and maintenance needs of the microscopes, which is hardly done in endemic countries, presents grossly limited access to the diagnosis of parasitic diseases in resource-limited settings. The urgent requirement for the management of tropical diseases such as schistosomiasis, which is now focused on elimination, has underscored the critical need for the creation of access to easy-to-use diagnosis for case detection, community mapping, and surveillance. In this paper, we present a low-cost automated digital microscope-the Schistoscope-which is capable of automatic focusing and scanning regions of interest in prepared microscope slides, and automatic detection of Schistosoma haematobium eggs in captured images. The device was developed using widely accessible distributed manufacturing methods and off-the-shelf components to enable local manufacturability and ease of maintenance. For proof of principle, we created a Schistosoma haematobium egg dataset of over 5000 images captured from spiked and clinical urine samples from field settings and demonstrated the automatic detection of Schistosoma haematobium eggs using a trained deep neural network model. 
The experiments and results presented in this paper collectively illustrate the robustness, stability, and optical performance of the device, making it suitable for use in the monitoring and evaluation of schistosomiasis control programs in endemic settings.
Background: There is increasing attention on machine learning (ML)-based clinical decision support systems (CDSS), but their added value and pitfalls are very rarely evaluated in clinical practice. We implemented a CDSS to aid general practitioners (GPs) in treating patients with urinary tract infections (UTIs), which are a significant health burden worldwide. Objective: This study aims to prospectively assess the impact of this CDSS on treatment success and change in antibiotic prescription behavior of the physician. In doing so, we hope to identify drivers and obstacles that affect the quality of health care practice with ML. Methods: The CDSS was developed by Pacmed, Nivel, and Leiden University Medical Center (LUMC). The CDSS presents the expected outcomes of treatments, using interpretable decision trees as ML classifiers. Treatment success was defined as a subsequent period of 28 days during which no new antibiotic treatment for UTI was needed. In this prospective observational study, 36 primary care practices used the software for 4 months. Furthermore, 29 control practices were identified using propensity score-matching. All analyses were performed using electronic health records from the Nivel Primary Care Database. Patients for whom the software was used were identified in the Nivel database by sequential matching using CDSS use data. We compared the proportion of successful treatments before and during the study within the treatment arm. The same analysis was performed for the control practices and for the patient subgroup for whom the software was definitely used. All analyses, including that of physicians' prescription behavior, were statistically tested using 2-sided z tests with an alpha level of .05.
Results: In the treatment practices, 4998 observations were included before and 3422 observations (of 2423 unique patients) were included during the implementation period. In the control practices, 5044 observations were included before and 3360 observations were included during the implementation period. The proportion of successful treatments increased significantly from 75% to 80% in treatment practices (z=5.47, P<.001). No significant difference was detected in control practices (76% before and 76% during the pilot, z=0.02; P=.98). Of the 2423 patients, we identified 734 (30.29%) in the CDSS use database in the Nivel database. For these patients, the proportion of successful treatments during the study was 83%, a statistically significant difference compared with the 75% of successful treatments before the study in the treatment practices (z=4.95; P<.001). Conclusions: The introduction of the CDSS as an intervention in the 36 treatment practices was associated with a statistically significant improvement in treatment success. We excluded temporal effects and validated the results with the subgroup analysis in patients for whom we were certain that the software was used. This study shows important strengths and points of attention for the development and implementation of an ML-based CDSS in clinical practice. Trial Registration: ClinicalTrials.gov NCT04408976; https://clinicaltrials.gov/ct2/show/NCT04408976
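The before-versus-during comparisons in this abstract rest on 2-sided z tests of two proportions. The sketch below shows one common form of that test, using a pooled standard error. The abstract does not state which variant the authors used, so the pooling is an assumption, and the counts in the example are hypothetical round numbers rather than study data (the abstract reports only rounded percentages).

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z test for a difference between two independent proportions.

    Uses the pooled proportion for the standard error (an assumption here).
    Returns (z, p_value), where z > 0 means group B has the higher proportion.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 750 of 1000 successes before, 800 of 1000 during.
z, p_value = two_proportion_z_test(750, 1000, 800, 1000)
significant = p_value < 0.05  # alpha level of .05, as in the abstract
```

With the study's exact success counts in place of the hypothetical ones, the same function would yield the kind of z and P values quoted in the Results.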