Background: Magnetic resonance acquisition is a time-consuming process, making it susceptible to patient motion during scanning. Even motion on the order of a millimeter can introduce severe blurring and ghosting artifacts, potentially necessitating re-acquisition. Magnetic resonance imaging (MRI) can be accelerated by acquiring only a fraction of k-space, combined with advanced reconstruction techniques leveraging coil sensitivity profiles and prior knowledge. Artificial intelligence (AI)-based reconstruction techniques have recently been popularized, but generally assume an ideal setting without intra-scan motion. Purpose: To retrospectively detect and quantify the severity of motion artifacts in undersampled MRI data. This may prove valuable as a safety mechanism for AI-based approaches, provide useful information to the reconstruction method, or prompt re-acquisition while the patient is still in the scanner. Methods: We developed a deep learning approach that detects and quantifies motion artifacts in undersampled brain MRI. We demonstrate that synthetically motion-corrupted data can be leveraged to train the convolutional neural network (CNN)-based motion artifact estimator, which generalizes well to real-world data. Additionally, we use the motion artifact estimator as a selector, choosing a motion-robust reconstruction model when a considerable amount of motion is detected and a high data-consistency model otherwise. Results: Training and validation were performed on 4387 and 1304 synthetically motion-corrupted images and their uncorrupted counterparts, respectively.
Testing was performed on undersampled in vivo motion-corrupted data from 28 volunteers, where our model distinguished head motion from motion-free scans with 91% and 96% accuracy when trained on synthetic and on real data, respectively. It predicted a manually defined quality label ('Good', 'Medium', or 'Bad') correctly 76% and 85% of the time when trained on synthetic and real data, respectively. When used as a selector, it chose the appropriate reconstruction network 93% of the time, achieving near-optimal SSIM values. Conclusions: The proposed method quantified motion artifact severity in undersampled MRI data with high accuracy, enabling real-time motion artifact detection that can help improve the safety and quality of AI-based reconstructions.
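The model-selection step described above amounts to thresholding the predicted artifact severity. A minimal sketch, in which the threshold value and model names are illustrative assumptions rather than values from the study:

```python
def select_reconstruction(artifact_score, threshold=0.5):
    """Route an undersampled scan to a reconstruction model based on the
    CNN-predicted motion artifact severity (0 = clean, 1 = severe).
    The 0.5 threshold is a hypothetical value for illustration."""
    if artifact_score >= threshold:
        return "motion_robust_model"        # tolerant of motion-corrupted k-space
    return "high_data_consistency_model"    # enforces fidelity to measured data

# A scan with little detected motion keeps the high data-consistency model:
print(select_reconstruction(0.12))  # -> high_data_consistency_model
print(select_reconstruction(0.81))  # -> motion_robust_model
```

In practice the threshold would be tuned on a validation set against the downstream SSIM of both reconstruction networks.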
Purpose: Automated diagnosis of urogenital schistosomiasis using digital microscopy images of urine slides is an essential step toward the elimination of schistosomiasis as a disease of public health concern in Sub-Saharan African countries. We create a robust image dataset of urine samples obtained from field settings and develop a two-stage diagnosis framework for urogenital schistosomiasis. Approach: Urine samples obtained from field settings were captured using the Schistoscope device, and S. haematobium eggs present in the images were manually annotated by experts to create the SH dataset. Next, we developed a two-stage diagnosis framework, which consists of semantic segmentation of S. haematobium eggs using the DeepLabv3-MobileNetV3 deep convolutional neural network, followed by a refined segmentation step that uses an ellipse-fitting approach to approximate the eggs with an automatically determined number of ellipses. We defined two linear inequality constraints as functions of the overlap coefficient and area of the fitted ellipses; false-positive diagnoses resulting from over-segmentation were further minimized using these constraints.
We evaluated the performance of our framework on 7605 images from 65 independent urine samples collected from field settings in Nigeria, by deploying our algorithm on an edge AI system consisting of a Raspberry Pi with a Coral USB accelerator. Results: The SH dataset contains 12,051 images from 103 independent urine samples, and the developed urogenital schistosomiasis diagnosis framework achieved clinical sensitivity, specificity, and precision of 93.8%, 93.9%, and 93.8%, respectively, using results from an experienced microscopist as reference. Conclusion: Our detection framework is a promising tool for the diagnosis of urogenital schistosomiasis, as our results meet the World Health Organization target product profile requirements for monitoring and evaluation of schistosomiasis control programs. (c) The Authors. Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
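The over-segmentation filter in the Approach section can be sketched as a pair of inequality checks on each fitted ellipse. The threshold values below are hypothetical stand-ins; the paper's actual constraint coefficients are not reproduced here:

```python
def keep_ellipse(overlap_coeff, area,
                 min_overlap=0.6, min_area=200.0, max_area=5000.0):
    """Accept a fitted ellipse only if it satisfies two linear inequality
    constraints: one on its overlap coefficient with the segmentation mask
    and one on its area in pixels. All thresholds here are hypothetical;
    they illustrate the form of the constraints, not the published values."""
    return overlap_coeff >= min_overlap and min_area <= area <= max_area

# An over-segmented fragment (low overlap, tiny area) is rejected,
# while a plausible egg-sized, well-overlapping ellipse is kept:
print(keep_ellipse(0.35, 80.0))    # -> False
print(keep_ellipse(0.85, 1200.0))  # -> True
```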
Rationale: Deep learning (DL) has demonstrated remarkable performance in diagnostic imaging for various diseases and modalities and therefore has high potential to be used as a clinical tool. However, current practice shows low deployment of these algorithms in clinical settings, because DL algorithms lack transparency and trust due to their underlying black-box mechanism. For successful deployment, explainable artificial intelligence (XAI) could be introduced to close the gap between medical professionals and DL algorithms. In this literature review, XAI methods available for magnetic resonance (MR), computed tomography (CT), and positron emission tomography (PET) imaging are discussed and future suggestions are made. Methods: PubMed and Clarivate Analytics/Web of Science Core Collection were screened. Articles were considered eligible for inclusion if XAI was used (and well described) to explain the behavior of a DL model used in MR, CT, or PET imaging. Results: A total of 75 articles were included, of which 54 and 17 described post hoc and ad hoc XAI methods, respectively, and 4 described both. Major variations in performance are seen between the methods. Overall, post hoc XAI lacks the ability to provide class-discriminative and target-specific explanations. Ad hoc XAI seems to tackle this because of its intrinsic ability to explain. However, quality control of XAI methods is rarely applied, making systematic comparison between the methods difficult. Conclusion: There is currently no clear consensus on how XAI should be deployed to close the gap between medical professionals and DL algorithms for clinical implementation. We advocate for systematic technical and clinical quality assessment of XAI methods.
Also, to ensure end-to-end unbiased and safe integration of XAI in the clinical workflow, (anatomical) data minimization and quality control methods should be included.
In recent years, machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goal of evaluating and exploring machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following this sample-to-data roadmap helped identify knowledge gaps and define needs. Being able to generate bespoke and realistic synthetic data has legitimate and important uses in system suitability, method development, and algorithm benchmarking, while also posing critical ethical questions. The interdisciplinary nature of the workshop informed discussions of what is currently possible as well as future opportunities and challenges. In the following perspective, we summarize these discussions in the hope of conveying our excitement about the potential of machine learning in proteomics and of inspiring future research.
Dataset acquisition and curation are often the most difficult and time-consuming parts of a machine learning endeavor. This is especially true for proteomics-based liquid chromatography (LC) coupled to mass spectrometry (MS) datasets, due to the high levels of data reduction that occur between raw data and machine learning-ready data. Since predictive proteomics is an emerging field, when predicting peptide behavior in LC-MS setups, each lab often uses unique and complex data processing pipelines to maximize performance, at the cost of accessibility and reproducibility. For this reason, we introduce ProteomicsML, an online resource for proteomics-based datasets and tutorials across most of the currently explored physicochemical peptide properties. This community-driven resource makes it simple to access data in easy-to-process formats and contains easy-to-follow tutorials that allow new users to interact with even the most advanced algorithms in the field. ProteomicsML provides datasets that are useful for comparing state-of-the-art machine learning algorithms, as well as introductory material for teachers and newcomers to the field alike. The platform is freely available at https://www.proteomicsml.org/, and we welcome the entire proteomics community to contribute to the project at https://github.com/ProteomicsML/ProteomicsML.
Pulmonary function tests (PFTs) play an important role in screening for and following up pulmonary involvement in systemic sclerosis (SSc). However, some patients are not able to perform PFTs due to contraindications. In addition, it is unclear how lung function is affected by changes in lung structure in SSc. Therefore, this study aims to explore the potential of automatically estimating PFT results from chest CT scans of SSc patients, and how different regions influence the estimation of PFTs. Deep regression networks were developed with transfer learning to estimate PFTs from 316 SSc patients. Segmented lungs and vessels were used to mask the CT images to train the network with different inputs: the entire CT scan, lungs only, or vessels only. The network trained on entire CT scans with transfer learning achieved ICCs of 0.71, 0.76, 0.80, and 0.81 for the estimation of DLCO, FEV1, FVC, and TLC, respectively. The performance of the networks gradually decreased when trained on data from lungs only and vessels only. Regression attention maps showed that regions close to large vessels were highlighted more than other regions, and occasionally regions outside the lungs were highlighted. These experiments show that, apart from the lungs and large vessels, other regions contribute to PFT estimation. In addition, adding manually designed biomarkers increased the correlation (R) from 0.75, 0.74, 0.82, and 0.83 to 0.81, 0.83, 0.88, and 0.90, respectively. This suggests that manually designed imaging biomarkers can still contribute to explaining the relation between lung function and structure.
MR scans of low-gamma X-nuclei, low-concentration metabolites, or standard imaging at very low field entail a challenging tradeoff between resolution, signal-to-noise, and acquisition duration. Deep learning (DL) techniques, such as UNets, can potentially be used to improve such "low-quality" (LQ) images. We investigate three UNets for upscaling LQ MRI: dense (DUNet), robust (RUNet), and anisotropic (AUNet). These were evaluated for two acquisition scenarios. In the same-subject High-Quality Complementary Priors (HQCP) scenario, an LQ and a high-quality (HQ) image are collected and both are inputs to the UNets. In the No Complementary Priors (NoCP) scenario, only the LQ images are collected and used as the sole input to the UNets. To address the lack of same-subject LQ and HQ images, we added data from the OASIS-1 database. The UNets were tested in upscaling 1/8, 1/4, and 1/2 undersampled images for both scenarios. As manifested by statistically non-significant differences in metrics, and supported by subjective observation, the three UNets upscaled images equally well; this was in contrast to mixed-effects statistics, which clearly illustrated significant differences. These observations suggest that the detailed architecture of these UNets may not play a critical role. As expected, HQCP substantially improves upscaling with any of the UNets. The outcomes support the notion that DL methods may have merit as an integral part of holistic approaches to advancing special MRI acquisitions; however, primary attention should be paid to the foundational step of such approaches, i.e., the actual data collected.
Thrombus volume in posterior circulation stroke (PCS) has been associated with outcome, through recanalization. Manual thrombus segmentation is impractical for large-scale analysis of image characteristics. Hence, in this study we develop the first automatic method for thrombus localization and segmentation on CT in patients with PCS. In this multi-center retrospective study, 187 patients with PCS from the MR CLEAN Registry were included. We developed a convolutional neural network (CNN) that segments thrombi and restricts the volume-of-interest (VOI) to the brainstem (Polar-UNet). Furthermore, we reduced false-positive localization by removing small-volume objects, referred to as volume-based removal (VBR). Polar-UNet is benchmarked against a CNN that does not restrict the VOI (BL-UNet). Performance metrics included the intra-class correlation coefficient (ICC) between automated and manually segmented thrombus volumes, the thrombus localization precision and recall, and the Dice coefficient. The majority of the thrombi were localized. Without VBR, Polar-UNet achieved a thrombus localization recall of 0.82, versus 0.78 for BL-UNet. This high recall was accompanied by low precisions of 0.14 and 0.09, respectively. VBR improved precision to 0.65 and 0.56 for Polar-UNet and BL-UNet, respectively, with a small reduction in recall to 0.75 and 0.69. The Dice coefficient achieved by Polar-UNet was 0.44, versus 0.38 for BL-UNet with VBR. Both methods achieved ICCs of 0.41 (95% CI: 0.27-0.54). Restricting the VOI to the brainstem improved thrombus localization precision, recall, and segmentation overlap compared to the benchmark. VBR improved thrombus localization precision but lowered recall.
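The evaluation metric and the VBR step above can be sketched as follows; masks are represented as sets of voxel coordinates, and the 10-voxel cutoff is an illustrative assumption, not the study's threshold:

```python
def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks, each given as a set of
    voxel coordinates: 2|A intersect B| / (|A| + |B|)."""
    if not mask_a and not mask_b:
        return 1.0  # two empty masks agree perfectly by convention
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))

def volume_based_removal(components, min_voxels=10):
    """Volume-based removal (VBR): discard candidate thrombi (connected
    components) smaller than a volume threshold to cut false positives.
    The 10-voxel default is hypothetical."""
    return [c for c in components if len(c) >= min_voxels]

# Toy 3D masks sharing 2 of 3 voxels: Dice = 2*2 / (3+3)
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
ref = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
overlap = dice(pred, ref)
```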
Ramos, L.A.; Os, H. van; Hilbert, A.; Olabarriaga, S.D.; Lugt, A. van der; Roos, Y.B.W.E.M.; ... ; Marquering, H.A. 2022
Background: Accurate prediction of clinical outcome is of utmost importance for choices regarding the endovascular treatment (EVT) of acute stroke. Recent studies on prediction modeling for stroke have focused mostly on clinical characteristics and radiological scores available at baseline. Radiological images are composed of millions of voxels, and much information can be lost when this is represented by a single value. Therefore, in this study we aimed to develop prediction models that take into account the whole imaging data combined with the clinical data available at baseline. Methods: We included 3,279 patients from the MR CLEAN Registry, a prospective, observational, multicenter registry of patients with ischemic stroke treated with EVT. We developed two approaches to combine the imaging data with the clinical data. The first approach was based on radiomics features, extracted from 70 atlas regions and combined with the clinical data to train machine learning models. For the second approach, we trained 3D deep learning models using the whole images and the clinical data. Models trained on the clinical data only were compared with models trained on the combination of clinical and image data. Finally, we explored feature importance plots for the best models and identified many known variables and image features/brain regions that were relevant in the model decision process. Results: Of the 3,279 patients included, 1,241 (37%) had a good functional outcome [modified Rankin Scale (mRS) <= 2] and 1,954 (60%) had good reperfusion [extended Thrombolysis in Cerebral Infarction (eTICI) >= 2b].
There was no significant improvement from combining the image data with the clinical data for mRS prediction [mean area under the receiver operating characteristic (ROC) curve (AUC) of 0.81 vs. 0.80] over using the clinical data only, regardless of the approach used. For predicting reperfusion, there was a significant improvement when image and clinical features were combined (mean AUC of 0.54 vs. 0.61), with the highest AUC obtained by the deep learning approach. Conclusions: The combination of radiomics and deep learning image features with clinical data significantly improved the prediction of good reperfusion. The visualization of prediction feature importance showed both known and novel clinical and imaging features with predictive value.
Purpose: Parallel RF transmission (PTx) is one of the key technologies enabling high-quality imaging at ultra-high fields (>= 7T). Compliance with regulatory limits on the local specific absorption rate (SAR) typically involves over-conservative safety margins to account for intersubject variability, which negatively affect the utilization of ultra-high-field MR. In this work, we present a method to generate a subject-specific body model from a single T1-weighted dataset for personalized local SAR prediction in PTx neuroimaging at 7T. Methods: Multi-contrast data were acquired at 7T (N = 10) to establish ground-truth segmentations in eight tissue types. A 2.5D convolutional neural network was trained using the T1-weighted data as input in a leave-one-out cross-validation study. The segmentation accuracy was evaluated through local SAR simulations in a quadrature birdcage as well as a PTx coil model. Results: The network-generated segmentations reached Dice coefficients of 86.7% +/- 6.7% (mean +/- SD) and were shown to successfully address the severe intensity bias and contrast variations typical of 7T. Errors in the obtained peak local SAR were below 3.0% in the quadrature birdcage. Results obtained in the PTx configuration indicated that a safety margin of 6.3% ensures conservative local SAR estimates in 95% of random RF shims, compared to an average overestimation of 34% in the generic "one-size-fits-all" approach. Conclusion: A subject-specific body model can be automatically generated from a single T1-weighted dataset by means of deep learning, providing the necessary inputs for accurate and personalized local SAR predictions in PTx neuroimaging at 7T.
Yin, Z.; Geraedts, V.J.; Wang, Z.Q.; Contarino, M.F.; Dibeklioglu, H.; Gemert, J. van 2022
Parkinson's disease (PD) diagnosis is based on clinical criteria, i.e., bradykinesia, rest tremor, rigidity, etc. Assessment of the severity of PD symptoms with clinical rating scales, however, is subject to inter-rater variability. In this paper, we propose a deep learning-based automatic PD diagnosis method using videos to assist diagnosis in clinical practice. We deploy a 3D convolutional neural network (CNN) as the baseline approach for PD severity classification and show its effectiveness. Due to the lack of data in the clinical field, we explore the possibility of transfer learning from non-medical datasets and show that PD severity classification can benefit from it. To bridge the domain discrepancy between medical and non-medical datasets, we let the network focus more on subtle temporal visual cues, i.e., the frequency of tremors, by designing a Temporal Self-Attention (TSA) mechanism. Seven tasks from the Movement Disorders Society Unified PD Rating Scale (MDS-UPDRS) part III are investigated, which reveal the symptoms of bradykinesia and postural tremors. Furthermore, we propose a multi-domain learning method to predict patient-level PD severity through task assembling. We empirically show the effectiveness of the TSA and task-assembling methods on our PD video dataset, achieving the best MCC of 0.55 on binary task-level classification and 0.39 on three-class patient-level classification.
Spruit, M.; Verkleij, S.; Schepper, K. de; Scheepers, F. 2022
Diagnosing mental disorders is complex due to genetic, environmental, and psychological contributors and individual risk factors. Language markers for mental disorders can help to diagnose a person. Research thus far on language markers and the associated mental disorders has been done mainly with the Linguistic Inquiry and Word Count (LIWC) program. To improve on this research, we employed a range of Natural Language Processing (NLP) techniques using LIWC, spaCy, fastText, and RobBERT to analyse Dutch psychiatric interview transcriptions with both rule-based and vector-based approaches. Our primary objective was to predict whether a patient had been diagnosed with a mental disorder and, if so, the specific mental disorder type. The second goal of this research was to find out which words are language markers for which mental disorder. LIWC in combination with the random forest classification algorithm performed best in predicting whether a person had a mental disorder (accuracy: 0.952; Cohen's kappa: 0.889). spaCy in combination with random forest best predicted which particular mental disorder a patient had been diagnosed with (accuracy: 0.429; Cohen's kappa: 0.304).
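Both accuracy and Cohen's kappa are reported above; kappa corrects raw agreement for the agreement expected by chance, as in this generic sketch (not the authors' code):

```python
def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement p_o corrected for the chance
    agreement p_e implied by the two label distributions,
    kappa = (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_e = sum((y_true.count(l) / n) * (y_pred.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Toy example: 3 of 4 labels agree (p_o = 0.75), chance agreement p_e = 0.5
k = cohens_kappa([0, 0, 1, 1], [0, 1, 1, 1])  # -> 0.5
```

A kappa of 0.889 as reported for the LIWC + random forest model therefore indicates agreement far above chance, whereas 0.304 for the disorder-type task is only fair agreement.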
Purpose: To learn a preconditioner that accelerates parallel imaging (PI) and compressed sensing (CS) reconstructions. Methods: A convolutional neural network (CNN) with residual connections was used to train a preconditioning operator. Training and validation data were simulated using 50% brain images and 50% white Gaussian noise images. Each multichannel training example contains a simulated sampling mask, complex coil sensitivity maps, and two regularization parameter maps. The trained model was integrated in the preconditioned conjugate gradient (PCG) method as part of the split Bregman CS method. The acceleration performance was compared with that of a circulant PI-CS preconditioner for varying undersampling factors, numbers of coil elements, and anatomies. Results: The learned preconditioner reduces the number of PCG iterations by a factor of 4, yielding a similar acceleration to an efficient circulant preconditioner. The method generalizes well to different sampling schemes, coil configurations, and anatomies. Conclusion: It is possible to learn adaptable preconditioners for PI and CS reconstructions that meet the performance of state-of-the-art preconditioners. Further acceleration could be achieved by optimizing the network architecture and the training set. Such a preconditioner could also be integrated in fully learned reconstruction methods to accelerate the training process of unrolled networks.
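For context, the PCG iteration that the learned operator plugs into can be sketched as follows; here a simple Jacobi (diagonal) preconditioner stands in for the CNN-based operator, which is an assumption for illustration:

```python
import numpy as np

def pcg(A, b, M_inv, tol=1e-8, max_iter=500):
    """Preconditioned conjugate gradient for A x = b with A symmetric
    positive definite. M_inv applies the preconditioner (approximating
    A^{-1}); in the paper this role is played by a trained CNN."""
    x = np.zeros_like(b)
    r = b - A @ x            # residual
    z = M_inv(r)             # preconditioned residual
    p = z.copy()             # search direction
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Jacobi preconditioner as a stand-in for the learned operator:
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = pcg(A, b, lambda r: r / np.diag(A))
```

A better preconditioner reduces the iteration count without changing the fixed point, which is exactly the factor-of-4 effect reported above.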
Su, R.S.; Cornelissen, S.A.P.; Sluijs, M. van der; Es, A.C.G.M. van; Zwam, W.H. van; Dippel, D.W.J.; ... ; Walsum, T. van 2021
The Thrombolysis in Cerebral Infarction (TICI) score is an important metric for reperfusion therapy assessment in acute ischemic stroke. It is commonly used as a technical outcome measure after endovascular treatment (EVT). Existing TICI scores are defined in coarse ordinal grades based on visual inspection, leading to inter- and intra-observer variation. In this work, we present autoTICI, an automatic and quantitative TICI scoring method. First, each digital subtraction angiography (DSA) acquisition is separated into four phases (non-contrast, arterial, parenchymal, and venous) using a multi-path convolutional neural network (CNN), which exploits spatio-temporal features. The network also incorporates sequence-level label dependencies in the form of a state-transition matrix. Next, a minimum intensity projection (MINIP) is computed using the motion-corrected arterial and parenchymal frames. On the MINIP image, vessel, perfusion, and background pixels are segmented. Finally, we quantify the autoTICI score as the ratio of reperfused pixels after EVT. On a routinely acquired multi-center dataset, the proposed autoTICI shows good correlation with the extended TICI (eTICI) reference, with an average area under the curve (AUC) score of 0.81. The AUC score is 0.90 with respect to the dichotomized eTICI. In terms of clinical outcome prediction, we demonstrate that autoTICI is overall comparable to eTICI.
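The final quantification step, scoring reperfusion as a pixel ratio, can be sketched as follows; representing masks as sets of pixel coordinates is an illustrative assumption, not the paper's implementation:

```python
def reperfusion_ratio(reperfused_pixels, target_pixels):
    """Quantify reperfusion as the fraction of the target territory
    whose pixels are classified as reperfused after EVT. Masks are sets
    of (row, col) pixel coordinates, chosen here for illustration."""
    if not target_pixels:
        raise ValueError("target territory must be non-empty")
    return len(reperfused_pixels & target_pixels) / len(target_pixels)

# 3 of 4 target pixels reperfused -> ratio 0.75
target = {(0, 0), (0, 1), (1, 0), (1, 1)}
score = reperfusion_ratio({(0, 0), (0, 1), (1, 0), (5, 5)}, target)
```

A continuous ratio like this is what lets autoTICI avoid the coarse ordinal grades of visual TICI scoring.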
OBJECTIVES: This study designed and evaluated an end-to-end deep learning solution for cardiac segmentation and quantification. BACKGROUND: Segmentation of cardiac structures from coronary computed tomography angiography (CCTA) images is laborious. We designed an end-to-end deep learning solution. METHODS: Scans were obtained from multicenter registries of 166 patients who underwent clinically indicated CCTA. Left ventricular volume (LVV) and right ventricular volume (RVV), left atrial volume (LAV) and right atrial volume (RAV), and left ventricular myocardial mass (LVM) were manually annotated as ground truth. A U-Net-inspired deep learning model was trained, validated, and tested in a 70:20:10 split. RESULTS: Mean age was 61.1 +/- 8.4 years, and 49% were women. A combined overall median Dice score of 0.9246 (interquartile range [IQR]: 0.8870 to 0.9475) was achieved. The median Dice scores for LVV, RVV, LAV, RAV, and LVM were 0.938 (IQR: 0.887 to 0.958), 0.927 (IQR: 0.916 to 0.946), 0.934 (IQR: 0.899 to 0.950), 0.915 (IQR: 0.890 to 0.920), and 0.920 (IQR: 0.811 to 0.944), respectively. Model predictions correlated and agreed well with manual annotation for LVV (r = 0.98), RVV (r = 0.97), LAV (r = 0.78), RAV (r = 0.97), and LVM (r = 0.94) (p < 0.05 for all). Mean difference and limits of agreement for LVV, RVV, LAV, RAV, and LVM were 1.20 ml (95% CI: -7.12 to 9.51), -0.78 ml (95% CI: -10.08 to 8.52), -3.75 ml (95% CI: -21.53 to 14.03), 0.97 ml (95% CI: -6.14 to 8.09), and 6.41 g (95% CI: -8.71 to 21.52), respectively. CONCLUSIONS: A deep learning model rapidly segmented and quantified cardiac structures.
This was done with high accuracy at the pixel level, with good agreement with manual annotation, facilitating its expansion into areas of research and clinical import. (C) 2020 by the American College of Cardiology Foundation.
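The mean differences and limits of agreement reported above are conventionally computed Bland-Altman style, as in this generic sketch (not the authors' code):

```python
import statistics

def limits_of_agreement(auto_vals, manual_vals):
    """Bland-Altman analysis: mean paired difference and the 95% limits
    of agreement (mean +/- 1.96 * SD of the paired differences) between
    automatic and manual measurements."""
    diffs = [a - m for a, m in zip(auto_vals, manual_vals)]
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return mean, mean - 1.96 * sd, mean + 1.96 * sd

# Toy volumes (ml): model output vs. manual annotation
mean_diff, low, high = limits_of_agreement([10, 12, 11, 13], [9, 12, 10, 14])
```

Narrow limits around a near-zero mean difference, as seen for LVV and RAV above, indicate that the automatic and manual measurements are interchangeable in practice.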
Tao, Q.; Lelieveldt, B.P.F.; Geest, R.J. van der 2020
OBJECTIVE: The recent advancement of deep learning techniques has profoundly impacted research on quantitative cardiac MRI analysis. The purpose of this article is to introduce the concept of deep learning, review its current applications in quantitative cardiac MRI, and discuss its limitations and challenges. CONCLUSION: Deep learning has shown state-of-the-art performance on quantitative analysis of multiple cardiac MRI sequences and holds great promise for future use in clinical practice and scientific research.
Pezzotti, N.; Yousefi, S.; Elmahdy, M.S.; Gemert, J.H.F. van; Schuelke, C.; Doneva, M.; ... ; Staring, M. 2020
Adaptive intelligence aims at empowering machine learning techniques with the additional use of domain knowledge. In this work, we present the application of adaptive intelligence to accelerate MR acquisition. Starting from undersampled k-space data, an iterative learning-based reconstruction scheme inspired by compressed sensing theory is used to reconstruct the images. We developed a novel deep neural network to refine and correct prior reconstruction assumptions given the training data. The network was trained and tested on a knee MRI dataset from the 2019 fastMRI challenge organized by Facebook AI Research and NYU Langone Health. All submissions to the challenge were initially ranked based on similarity with a known ground truth, after which the top 4 submissions were evaluated radiologically. Our method was evaluated by the fastMRI organizers on an independent challenge dataset. It ranked #1 on the 8x accelerated multi-coil track, shared #1 on the 4x multi-coil track, and ranked #3 on the 4x single-coil track. This demonstrates the superior performance and wide applicability of the method.
Higher-dimensional data such as video and 3D are the leading edge of multimedia retrieval and computer vision research. In this survey, we give a comprehensive overview of, and key insights into, the state of the art of higher-dimensional features from deep learning as well as traditional approaches. Current approaches frequently use 3D information from the sensor or use 3D in modeling and understanding the 3D world. With the growth of prevalent application areas such as 3D games, self-driving automobiles, health monitoring, and sports activity training, a wide variety of new sensors have allowed researchers to develop feature description models beyond 2D. Although higher-dimensional data enhance the performance of methods on numerous tasks, they can also introduce new challenges and problems. The higher dimensionality of the data often leads to more complicated structures, which present additional problems both in extracting meaningful content and in adapting it to current machine learning algorithms. Due to the major importance of the evaluation process, we also present an overview of the current datasets and benchmarks. Moreover, based on more than 330 papers from this study, we present the major challenges and future directions.
In numerous multimedia and multi-modal tasks, from image and video retrieval to zero-shot recognition to multimedia question answering, bridging image and text representations plays an important and in some cases indispensable role. To narrow the modality gap between vision and language, prior approaches attempt to discover their correlated semantics in a common feature space. However, these approaches omit intra-modal semantic consistency when learning the inter-modal correlations. To address this problem, we propose cycle-consistent embeddings in a deep neural network for matching visual and textual representations. Our approach, named CycleMatch, can maintain both inter-modal correlations and intra-modal consistency by cascading dual mappings and reconstructed mappings in a cyclic fashion. Moreover, to achieve robust inference, we propose to employ two late-fusion approaches: average fusion and adaptive fusion. Both can effectively integrate the matching scores of different embedding features without increasing the network complexity or training time. In experiments on cross-modal retrieval, we present comprehensive results to verify the effectiveness of the proposed approach, which achieves state-of-the-art performance on two well-known multi-modal datasets, Flickr30K and MSCOCO.