The objective is to assess the performance of seven semiautomatic and two fully automatic segmentation methods on [F-18]FDG PET/CT lymphoma images and evaluate their influence on tumor quantification. All lymphoma lesions identified in 65 whole-body [F-18]FDG PET/CT staging images were segmented by two experienced observers using manual and semiautomatic methods. Semiautomatic segmentation using absolute and relative thresholds, k-means and Bayesian clustering, and a self-adaptive configuration (SAC) of k-means and Bayesian was applied. Three state-of-the-art deep learning-based segmentation methods using a 3D U-Net architecture were also applied. One was semiautomatic and two were fully automatic, of which one is publicly available. Dice coefficient (DC) measured segmentation overlap, considering manual segmentation the ground truth. Lymphoma lesions were characterized by 31 features. Intraclass correlation coefficient (ICC) assessed feature agreement between different segmentation methods. Nine hundred twenty [F-18]FDG-avid lesions were identified. The SAC Bayesian method achieved the highest median intra-observer DC (0.87). Inter-observer DC was higher for SAC Bayesian than manual segmentation (0.94 vs 0.84, p < 0.001). Semiautomatic deep learning-based median DC was promising (0.83 (Obs1), 0.79 (Obs2)). Threshold-based methods and publicly available 3D U-Net gave poorer results (0.56 <= DC <= 0.68). Maximum, mean, and peak standardized uptake values, metabolic tumor volume, and total lesion glycolysis showed excellent agreement (ICC >= 0.92) between manual and SAC Bayesian segmentation methods.
The SAC Bayesian classifier is more reproducible than manual segmentation and produces similar lesion features, giving the most concordant results of all methods. Deep learning-based segmentation can achieve good overall segmentation results but failed in a few patients, affecting those patients' clinical evaluation.
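The Dice coefficient used above as the overlap measure can be illustrated with a minimal sketch. It assumes each segmentation is available as a flat sequence of 0/1 voxel labels of equal length; the function name and the toy masks are hypothetical and are not taken from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary masks given as equal-length flat sequences of 0/1."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labeled as lesion in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / (size_a + size_b)


# Hypothetical toy masks: "manual" plays the role of the ground truth.
manual = [0, 1, 1, 1, 0, 0]
candidate = [0, 0, 1, 1, 1, 0]
dc = dice_coefficient(manual, candidate)
print(f"DC = {dc:.2f}")
```

A DC of 1 means identical masks and 0 means no shared voxels, so the reported medians (for example 0.87 for SAC Bayesian) sit toward the high-overlap end of that scale.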
Wolff, L.; Su, J.H.; Loon, D. van; Es, A. van; Doormaal, P.J. van; Majoie, C.; ... ; MR CLEAN Investigators 2022
Purpose Outcome of endovascular treatment in acute ischemic stroke patients depends on the collateral circulation maintaining blood flow to the ischemic territory. We evaluated the inter-rater reliability and accuracy of raters and an automated algorithm for assessing the collateral score (CS, range: 0-3) in acute ischemic stroke patients. Methods Baseline CTA scans with an intracranial anterior occlusion from the MR CLEAN study (n=500) were used. For each core lab CS, ten CTA scans with sufficient quality were randomly selected. After a training session in collateral scoring, all selected CTA scans were individually evaluated for a visual CS by three groups: 7 radiologists, 13 junior and 9 senior radiology residents. Two additional radiologists scored CS to be used as reference, with a third providing a CS to produce a 2 out of 3 consensus CS in case of disagreement. An automated algorithm was also used to compute CS. Inter-rater agreement was reported with intraclass correlation coefficient (ICC). Accuracy of visual and automated CS was calculated. Results 39 CTA scans were assessed (1 corrupt CTA scan excluded). All groups showed a moderate ICC (0.689-0.780) in comparison to the reference standard. Overall human accuracy was 65 +/- 7% and increased to 88 +/- 5% for dichotomized CS (0-1, 2-3). Automated CS accuracy was 62%, and 90% for dichotomized CS. No significant difference in accuracy was found between groups with different levels of expertise. Conclusion After training, inter-rater reliability in collateral scoring was not influenced by experience. Automated CS performs similarly to residents and radiologists in determining a collateral score.
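The gap between four-point and dichotomized accuracy reported above can be illustrated with a small sketch. It assumes accuracy means the fraction of scans on which a rater matches the reference CS, and it uses the stated grouping (0-1 versus 2-3); the function names and example scores are hypothetical, not study data.

```python
def accuracy(scores, reference):
    """Fraction of scans on which the rater's score equals the reference score."""
    if len(scores) != len(reference):
        raise ValueError("scores and reference must have the same length")
    matches = sum(1 for s, r in zip(scores, reference) if s == r)
    return matches / len(reference)


def dichotomize(score):
    """Map a collateral score of 0-1 to group 0 and 2-3 to group 1."""
    if score not in (0, 1, 2, 3):
        raise ValueError("collateral score must be an integer from 0 to 3")
    return 0 if score <= 1 else 1


# Hypothetical rater and reference scores for six scans.
rater = [0, 1, 2, 3, 1, 2]
reference = [0, 2, 2, 3, 0, 3]

four_point = accuracy(rater, reference)
two_group = accuracy([dichotomize(s) for s in rater],
                     [dichotomize(r) for r in reference])
print(f"four-point accuracy: {four_point:.2f}, dichotomized accuracy: {two_group:.2f}")
```

Disagreements within a group (for example 2 versus 3) stop counting as errors after dichotomization, which is why the dichotomized accuracy can only stay the same or rise.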
Background: There are several methods to quantify mitral regurgitation (MR) by cardiovascular magnetic resonance (CMR). The interoperability of these methods and their reproducibility remain undetermined. Objective: To determine the agreement and reproducibility of different MR quantification methods by CMR across all aetiologies. Methods: Thirty-five patients with MR were recruited (primary MR = 12, secondary MR = 10 and MVR = 13). Patients underwent CMR, including cines and four-dimensional flow (4D flow). Four methods were evaluated: MRStandard (left ventricular stroke volume - aortic forward flow by phase contrast), MRLVRV (left ventricular stroke volume - right ventricular stroke volume), MRJet (direct jet quantification by 4D flow) and MRMVAV (mitral forward flow by 4D flow - aortic forward flow by 4D flow). For all cases and MR types, 520 MR volumes were recorded by these 4 methods for intra-/inter-observer tests. Results: In primary MR, MRMVAV and MRLVRV were comparable to MRStandard (P > 0.05). MRJet resulted in significantly higher MR volumes when compared to MRStandard (P < 0.05). In secondary MR and MVR cases, all methods were comparable. In intra-observer tests, MRMVAV demonstrated least bias with best limits of agreement (bias = -0.1 ml, -8 ml to 7.8 ml, P = 0.9) and best concordance correlation coefficient (CCC = 0.96, P < 0.01). In inter-observer tests, for primary MR and MVR, least bias and highest CCC were observed for MRMVAV. For secondary MR, bias was lowest for MRJet (-0.1 ml, P=NS). Conclusion: CMR methods of MR quantification demonstrate agreement in secondary MR and MVR. In primary MR, this was not observed. Across all types of MR, MRMVAV quantification demonstrated the highest reproducibility and consistency. (C) 2021 The Author(s). Published by Elsevier B.V.
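The three subtraction-based methods defined in the abstract above can be written out as a short sketch. The function and parameter names are hypothetical, all volumes are assumed to be in ml per beat, and the example values are invented for illustration only. MRJet is omitted because it is measured directly from the 4D flow jet rather than derived by subtraction.

```python
def mr_standard(lv_stroke_volume, aortic_forward_flow_pc):
    """MRStandard: LV stroke volume minus aortic forward flow by phase contrast."""
    return lv_stroke_volume - aortic_forward_flow_pc


def mr_lvrv(lv_stroke_volume, rv_stroke_volume):
    """MRLVRV: LV stroke volume minus RV stroke volume."""
    return lv_stroke_volume - rv_stroke_volume


def mr_mvav(mitral_forward_flow_4d, aortic_forward_flow_4d):
    """MRMVAV: mitral forward flow minus aortic forward flow, both by 4D flow."""
    return mitral_forward_flow_4d - aortic_forward_flow_4d


# Hypothetical measurements for a single patient (ml per beat).
estimates = {
    "MRStandard": mr_standard(lv_stroke_volume=95.0, aortic_forward_flow_pc=62.0),
    "MRLVRV": mr_lvrv(lv_stroke_volume=95.0, rv_stroke_volume=65.0),
    "MRMVAV": mr_mvav(mitral_forward_flow_4d=94.0, aortic_forward_flow_4d=63.0),
}
for method, volume in estimates.items():
    print(f"{method}: {volume:.1f} ml")
```

Because each estimate draws on a different pair of measurements, the methods need not agree for a given patient, which is the question the agreement analysis addresses.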
Objectives To compare lesion features extracted from F-18-FDG PET/CT images acquired on analog and digital scanners, on consecutive imaging data from the same subjects. Methods Whole-body F-18-FDG PET/CT images from 55 oncological patients were acquired twice after a single F-18-FDG injection, with a digital and an analog PET/CT scanner, alternately. Twenty-nine subjects were examined first on the digital, and 26 first on the analog equipment. Image reconstruction was performed using manufacturer standard clinical protocols and protocols that fulfilled EARL1 specifications. Twenty-five features based on lesion standardized uptake value (SUV) and geometry were assessed. To compare these features, intraclass correlation coefficient (ICC), relative difference (RD), absolute value of RD (|RD|), and repeatability coefficient (RC) were used. Results In total, 323 F-18-FDG avid lesions were identified. High agreement (ICC > 0.75) was obtained for most of the lesion features extracted from both scanners' imaging data, especially when reconstruction protocols fulfilled EARL1 specifications. For EARL1 reconstruction images, the features frequently used in clinics, SUVmax, SUVpeak, SUVmean, metabolic tumor volume, and total lesion glycolysis, reached an ICC of 0.92, 0.95, 0.87, 0.98, and 0.98, and a median RD (digital-analog) of 3%, 5%, 4%, -3% and 1%, respectively. Using standard reconstruction protocols, the ICCs were 0.84, 0.93, 0.80, 0.98, and 0.98, and the RDs were 20%, 11%, 13%, -7%, and 7%, respectively. Conclusion Under controlled acquisition and reconstruction parameters, most of the features studied can be used for research and clinical work. This is especially important for multicenter studies and patient follow-ups.
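The median RD and |RD| figures above can be illustrated with a minimal sketch. The abstract gives only the sign convention (digital minus analog), so normalizing by the analog value is an assumption made here, not the authors' stated definition; the function name and the SUV values are hypothetical.

```python
from statistics import median


def relative_difference(digital, analog):
    """Percent difference, digital minus analog, relative to the analog value (assumed normalization)."""
    if analog == 0:
        raise ValueError("analog value must be non-zero")
    return 100.0 * (digital - analog) / analog


# Hypothetical SUVmax values for three lesions measured on both scanners.
digital_suvmax = [10.0, 12.0, 8.0]
analog_suvmax = [8.0, 12.0, 10.0]

rd = [relative_difference(d, a) for d, a in zip(digital_suvmax, analog_suvmax)]
median_rd = median(rd)
median_abs_rd = median(abs(value) for value in rd)
print(f"median RD = {median_rd:.1f}%, median |RD| = {median_abs_rd:.1f}%")
```

Reporting both quantities is useful because positive and negative differences cancel in the signed RD, while |RD| shows the typical size of the disagreement regardless of direction.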