-
Location-based Radiology Report-Guided Semi-supervised Learning for Prostate Cancer Detection
Authors:
Alex Chen,
Nathan Lay,
Stephanie Harmon,
Kutsev Ozyoruk,
Enis Yilmaz,
Brad J. Wood,
Peter A. Pinto,
Peter L. Choyke,
Baris Turkbey
Abstract:
Prostate cancer is one of the most prevalent malignancies in the world. While deep learning has the potential to further improve computer-aided prostate cancer detection on MRI, its efficacy hinges on the exhaustive curation of manually annotated images. We propose a novel methodology of semi-supervised learning (SSL) guided by automatically extracted clinical information, specifically the lesion locations in radiology reports, allowing for use of unannotated images to reduce the annotation burden. By leveraging lesion locations, we refined pseudo labels, which were then used to train our location-based SSL model. We show that our SSL method can improve prostate lesion detection by utilizing unannotated images, with more substantial impacts being observed when larger proportions of unannotated images are used.
Submitted 17 June, 2024;
originally announced June 2024.
-
VISTA3D: Versatile Imaging SegmenTation and Annotation model for 3D Computed Tomography
Authors:
Yufan He,
Pengfei Guo,
Yucheng Tang,
Andriy Myronenko,
Vishwesh Nath,
Ziyue Xu,
Dong Yang,
Can Zhao,
Benjamin Simon,
Mason Belue,
Stephanie Harmon,
Baris Turkbey,
Daguang Xu,
Wenqi Li
Abstract:
Medical image segmentation is a core component of precision medicine, and 3D computed tomography (CT) is one of the most important imaging techniques. A highly accurate and clinically applicable segmentation foundation model will greatly assist clinicians and researchers using CT images. Although existing foundation models have attracted great interest, none are adequate for 3D CT, either because they lack accurate automatic segmentation for large cohort analysis or because they cannot segment novel classes. An ideal segmentation solution should possess two features: accurate out-of-the-box performance covering major organ classes, and effective adaptation or zero-shot ability to novel structures. To achieve this goal, we introduce the Versatile Imaging SegmenTation and Annotation model (VISTA3D). VISTA3D is trained systematically on 11454 volumes and provides accurate out-of-the-box segmentation for 127 common types of human anatomical structures and various lesions. Additionally, VISTA3D supports 3D interactive segmentation, allowing convenient editing of automatic results and achieving state-of-the-art annotation results on unseen classes. The novel model design and training recipe represent a promising step toward developing a versatile medical image foundation model and will serve as a valuable foundation for CT image analysis. Code and model weights are available at https://github.com/Project-MONAI/VISTA.
Submitted 7 August, 2024; v1 submitted 7 June, 2024;
originally announced June 2024.
-
Large-Scale Multi-Center CT and MRI Segmentation of Pancreas with Deep Learning
Authors:
Zheyuan Zhang,
Elif Keles,
Gorkem Durak,
Yavuz Taktak,
Onkar Susladkar,
Vandan Gorade,
Debesh Jha,
Asli C. Ormeci,
Alpay Medetalibeyoglu,
Lanhong Yao,
Bin Wang,
Ilkin Sevgi Isler,
Linkai Peng,
Hongyi Pan,
Camila Lopes Vendrami,
Amir Bourhani,
Yury Velichko,
Boqing Gong,
Concetto Spampinato,
Ayis Pyrros,
Pallavi Tiwari,
Derk C. F. Klatte,
Megan Engels,
Sanne Hoogenboom,
Candice W. Bolan
, et al. (13 additional authors not shown)
Abstract:
Automated volumetric segmentation of the pancreas on cross-sectional imaging is needed for diagnosis and follow-up of pancreatic diseases. While CT-based pancreatic segmentation is more established, MRI-based segmentation methods are understudied, largely due to a lack of publicly available datasets, benchmarking research efforts, and domain-specific deep learning methods. In this retrospective study, we collected a large dataset (767 scans from 499 participants) of T1-weighted (T1W) and T2-weighted (T2W) abdominal MRI series from five centers between March 2004 and November 2022. We also collected CT scans of 1,350 patients from publicly available sources for benchmarking purposes. We developed a new pancreas segmentation method, called PanSegNet, combining the strengths of nnUNet and a Transformer network with a new linear attention module enabling volumetric computation. We tested PanSegNet's accuracy in cross-modality (a total of 2,117 scans) and cross-center settings with Dice and Hausdorff distance (HD95) evaluation metrics. We used Cohen's kappa statistics for intra- and inter-rater agreement evaluation and paired t-tests for volume and Dice comparisons. For segmentation accuracy, we achieved Dice coefficients of 88.3% (std: 7.2%, at case level) with CT, 85.0% (std: 7.9%) with T1W MRI, and 86.3% (std: 6.4%) with T2W MRI. There was a high correlation for pancreas volume prediction with R^2 of 0.91, 0.84, and 0.85 for CT, T1W, and T2W, respectively. We found moderate inter-observer (0.624 and 0.638 for T1W and T2W MRI, respectively) and high intra-observer agreement scores. All MRI data is made available at https://osf.io/kysnj/. Our source code is available at https://github.com/NUBagciLab/PaNSegNet.
Submitted 25 May, 2024; v1 submitted 20 May, 2024;
originally announced May 2024.
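The Dice coefficient reported in the PanSegNet abstract above measures overlap between a predicted mask and a reference mask. The following is a minimal sketch of the standard definition, not the authors' evaluation code; the function name `dice_coefficient`, the flat-list mask representation, and the empty-mask convention are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks (hypothetical data): 3 predicted voxels, 3 reference voxels, 2 shared.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 1, 0]
score = dice_coefficient(pred, truth)
print(f"Dice: {score:.3f}")
```

A volumetric mask would be flattened to one list (or handled with an array library) before calling the function; case-level statistics such as the reported mean and standard deviation would then be computed over the per-scan scores.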
-
Detection of Peri-Pancreatic Edema using Deep Learning and Radiomics Techniques
Authors:
Ziliang Hong,
Debesh Jha,
Koushik Biswas,
Zheyuan Zhang,
Yury Velichko,
Cemal Yazici,
Temel Tirkes,
Amir Borhani,
Baris Turkbey,
Alpay Medetalibeyoglu,
Gorkem Durak,
Ulas Bagci
Abstract:
Peri-pancreatic edema is a pivotal indicator of disease progression and prognosis, emphasizing the critical need for accurate detection and assessment in pancreatitis diagnosis and management. This study introduces a novel CT dataset sourced from 255 patients with pancreatic diseases, featuring annotated pancreas segmentation masks and corresponding diagnostic labels for peri-pancreatic edema condition. With the novel dataset, we first evaluate the efficacy of the LinTransUNet model, a linear Transformer-based segmentation algorithm, to segment the pancreas accurately from CT imaging data. Then, we use segmented pancreas regions with two distinctive machine learning classifiers to identify the existence of peri-pancreatic edema: deep learning-based models and a radiomics-based eXtreme Gradient Boosting (XGBoost). The LinTransUNet achieved promising results, with a Dice coefficient of 80.85% and mIoU of 68.73%. Among the nine benchmarked classification models for peri-pancreatic edema detection, the Swin-Tiny transformer model demonstrated the highest recall of 98.85 ± 0.42 and precision of 98.38 ± 0.17. Comparatively, the radiomics-based XGBoost model achieved an accuracy of 79.61 ± 4.04 and recall of 91.05 ± 3.28, showcasing its potential as a supplementary diagnostic tool given its rapid processing speed and reduced training time. Our code is available at https://github.com/NUBagciLab/Peri-Pancreatic-Edema-Detection.
Submitted 25 April, 2024;
originally announced April 2024.
-
A Probabilistic Hadamard U-Net for MRI Bias Field Correction
Authors:
Xin Zhu,
Hongyi Pan,
Yury Velichko,
Adam B. Murphy,
Ashley Ross,
Baris Turkbey,
Ahmet Enis Cetin,
Ulas Bagci
Abstract:
Magnetic field inhomogeneity correction remains a challenging task in MRI analysis. Most established techniques are designed for brain MRI by supposing that image intensities in the identical tissue follow a uniform distribution. Such an assumption cannot be easily applied to other organs, especially those that are small in size and heterogeneous in texture (large variations in intensity), such as the prostate. To address this problem, this paper proposes a probabilistic Hadamard U-Net (PHU-Net) for prostate MRI bias field correction. First, a novel Hadamard U-Net (HU-Net) is introduced to extract the low-frequency scalar field, which is multiplied by the original input to obtain the prototypical corrected image. HU-Net converts the input image from the time domain into the frequency domain via the Hadamard transform. In the frequency domain, high-frequency components are eliminated using the trainable filter (scaling layer), hard-thresholding layer, and sparsity penalty. Next, a conditional variational autoencoder is used to encode possible bias field-corrected variants into a low-dimensional latent space. Random samples drawn from the latent space are then combined with the prototypical corrected image to generate multiple plausible images. Experimental results demonstrate the effectiveness of PHU-Net in correcting the bias field in prostate MRI with a fast inference speed. It has also been shown that prostate MRI segmentation accuracy improves with the high-quality corrected images from PHU-Net. The code will be available in the final version of this manuscript.
Submitted 7 March, 2024;
originally announced March 2024.
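The PHU-Net abstract above describes moving data into the Hadamard domain, suppressing coefficients, and transforming back. The sketch below illustrates only that generic transform, threshold, inverse pattern on a 1D signal. It is not the authors' network: the trainable scaling layer and sparsity penalty are omitted, the threshold `tau` is a fixed hypothetical value rather than a learned one, and zeroing small-magnitude coefficients is an assumed reading of "hard-thresholding".

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (length must be a power of two)."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine entries that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a


def hard_threshold(coeffs, tau):
    """Keep coefficients whose magnitude exceeds tau; set the rest to zero."""
    return [c if abs(c) > tau else 0.0 for c in coeffs]


def hadamard_filter(signal, tau):
    """Transform, hard-threshold, and invert (the inverse is the same transform divided by n)."""
    n = len(signal)
    kept = hard_threshold(fwht(signal), tau)
    return [v / n for v in fwht(kept)]


# Hypothetical 1D signal: a constant level plus a small alternating ripple.
signal = [10.0, 10.5, 10.0, 10.5, 10.0, 10.5, 10.0, 10.5]
filtered = hadamard_filter(signal, tau=5.0)
print(filtered)
```

Because the Hadamard matrix satisfies H·H = n·I, applying the same transform twice and dividing by n recovers the input, which is why no separate inverse routine is needed.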
-
Using YOLO v7 to Detect Kidney in Magnetic Resonance Imaging
Authors:
Pouria Yazdian Anari,
Fiona Obiezu,
Nathan Lay,
Fatemeh Dehghani Firouzabadi,
Aditi Chaurasia,
Mahshid Golagha,
Shiva Singh,
Fatemeh Homayounieh,
Aryan Zahergivar,
Stephanie Harmon,
Evrim Turkbey,
Rabindra Gautam,
Kevin Ma,
Maria Merino,
Elizabeth C. Jones,
Mark W. Ball,
W. Marston Linehan,
Baris Turkbey,
Ashkan A. Malayeri
Abstract:
Introduction: This study explores the use of the latest You Only Look Once (YOLO V7) object detection method to enhance kidney detection in medical imaging by training and testing a modified YOLO V7 on medical image formats. Methods: The study includes 878 patients with various subtypes of renal cell carcinoma (RCC) and 206 patients with normal kidneys. A total of 5657 MRI scans for 1084 patients were retrieved. A total of 326 patients with 1034 tumors were recruited from a retrospectively maintained database, and bounding boxes were drawn around their tumors. A primary model was trained on 80% of annotated cases, with 20% saved for testing (primary test set). The best primary model was then used to identify tumors in the remaining 861 patients, and bounding box coordinates were generated on their scans using the model. Ten benchmark training sets were created with generated coordinates on non-segmented patients. The final model was used to predict the kidney in the primary test set. We reported the positive predictive value (PPV), sensitivity, and mean average precision (mAP). Results: The primary training set showed an average PPV of 0.94 +/- 0.01, sensitivity of 0.87 +/- 0.04, and mAP of 0.91 +/- 0.02. The best primary model yielded a PPV of 0.97, sensitivity of 0.92, and mAP of 0.95. The final model demonstrated an average PPV of 0.95 +/- 0.03, sensitivity of 0.98 +/- 0.004, and mAP of 0.95 +/- 0.01. Conclusion: Using a semi-supervised approach with a medical image library, we developed a high-performing model for kidney detection. Further external validation is required to assess the model's generalizability.
Submitted 12 February, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
GazeGNN: A Gaze-Guided Graph Neural Network for Chest X-ray Classification
Authors:
Bin Wang,
Hongyi Pan,
Armstrong Aboah,
Zheyuan Zhang,
Elif Keles,
Drew Torigian,
Baris Turkbey,
Elizabeth Krupinski,
Jayaram Udupa,
Ulas Bagci
Abstract:
Eye tracking research is important in computer vision because it can help us understand how humans interact with the visual world. Specifically for high-risk applications, such as medical imaging, eye tracking can help us to comprehend how radiologists and other medical professionals search, analyze, and interpret images for diagnostic and clinical purposes. Hence, the application of eye tracking techniques in disease classification has become increasingly popular in recent years. Contemporary works usually transform gaze information collected by eye tracking devices into visual attention maps (VAMs) to supervise the learning process. However, this is a time-consuming preprocessing step, which stops us from applying eye tracking to radiologists' daily work. To solve this problem, we propose a novel gaze-guided graph neural network (GNN), GazeGNN, to leverage raw eye-gaze data without converting it into VAMs. In GazeGNN, to directly integrate eye gaze into image classification, we create a unified representation graph that models both images and gaze pattern information. With this representation, we develop a real-time, real-world, end-to-end disease classification algorithm for the first time in the literature. This achievement demonstrates the practicality and feasibility of integrating real-time eye tracking techniques into the daily work of radiologists. To the best of our knowledge, GazeGNN is the first work that adopts a GNN to integrate image and eye-gaze data. Our experiments on the public chest X-ray dataset show that our proposed method exhibits the best classification performance compared to existing methods. The code is available at https://github.com/ukaukaaaa/GazeGNN.
Submitted 29 August, 2023; v1 submitted 29 May, 2023;
originally announced May 2023.
-
Domain Generalization with Adversarial Intensity Attack for Medical Image Segmentation
Authors:
Zheyuan Zhang,
Bin Wang,
Lanhong Yao,
Ugur Demir,
Debesh Jha,
Ismail Baris Turkbey,
Boqing Gong,
Ulas Bagci
Abstract:
Most statistical learning algorithms rely on an over-simplified assumption, that is, the train and test data are independent and identically distributed. In real-world scenarios, however, it is common for models to encounter data from new and different domains to which they were not exposed during training. This is often the case in medical imaging applications due to differences in acquisition devices, imaging protocols, and patient characteristics. To address this problem, domain generalization (DG) is a promising direction as it enables models to handle data from previously unseen domains by learning domain-invariant features robust to variations across different domains. To this end, we introduce a novel DG method called Adversarial Intensity Attack (AdverIN), which leverages adversarial training to generate training data with an infinite number of styles and increase data diversity while preserving essential content information. We conduct extensive evaluation experiments on various multi-domain segmentation datasets, including 2D retinal fundus optic disc/cup and 3D prostate MRI. Our results demonstrate that AdverIN significantly improves the generalization ability of the segmentation models on these challenging datasets. Code will be available upon publication.
Submitted 5 April, 2023;
originally announced April 2023.
-
Distance Map Supervised Landmark Localization for MR-TRUS Registration
Authors:
Xinrui Song,
Xuanang Xu,
Sheng Xu,
Baris Turkbey,
Bradford J. Wood,
Thomas Sanford,
Pingkun Yan
Abstract:
In this work, we propose to explicitly use the landmarks of the prostate to guide MR-TRUS image registration. We first train a deep neural network to automatically localize a set of meaningful landmarks, and then directly generate the affine registration matrix from the locations of these landmarks. For landmark localization, instead of directly training a network to predict the landmark coordinates, we propose to regress a full-resolution distance map of the landmark, which is shown to be effective in avoiding the statistical bias that leads to unsatisfactory performance, thus improving accuracy. We then use the predicted landmarks to generate the affine transformation matrix, which outperforms the clinicians' manual rigid registration by a significant margin in terms of target registration error (TRE).
Submitted 11 October, 2022;
originally announced October 2022.
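The abstract above replaces direct coordinate prediction with regression of a distance map. As a rough illustration of that target representation only, the sketch below builds a Euclidean distance map for one landmark on a small 2D grid and recovers the landmark as the location of the minimum. The 2D grid, the Euclidean metric, and the function names are assumptions; the paper works with image volumes and a trained network, neither of which is modeled here.

```python
import math


def distance_map(shape, landmark):
    """Euclidean distance from every grid cell to the landmark (row, col)."""
    rows, cols = shape
    lr, lc = landmark
    return [[math.hypot(r - lr, c - lc) for c in range(cols)] for r in range(rows)]


def landmark_from_map(dmap):
    """Recover the landmark as the grid cell with the smallest distance value."""
    _, row, col = min(
        (value, r, c)
        for r, line in enumerate(dmap)
        for c, value in enumerate(line)
    )
    return row, col


# Hypothetical 5 x 7 grid with a landmark at row 2, column 4.
target = distance_map((5, 7), (2, 4))
recovered = landmark_from_map(target)
print(recovered)
```

In a learning setup, a map like `target` would serve as the regression target for every pixel, so each output location contributes to the loss rather than a single coordinate pair.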
-
Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation
Authors:
Pengfei Guo,
Dong Yang,
Ali Hatamizadeh,
An Xu,
Ziyue Xu,
Wenqi Li,
Can Zhao,
Daguang Xu,
Stephanie Harmon,
Evrim Turkbey,
Baris Turkbey,
Bradford Wood,
Francesca Patella,
Elvira Stellato,
Gianpaolo Carrafiello,
Vishal M. Patel,
Holger R. Roth
Abstract:
Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing. The inherent privacy-preserving property of FL algorithms makes them especially attractive to the medical field. However, in the case of heterogeneous client data distributions, standard FL methods are unstable and require intensive hyperparameter tuning to achieve optimal performance. Conventional hyperparameter optimization algorithms are impractical in real-world FL applications as they involve numerous training trials, which are often not affordable with limited compute budgets. In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL, in which an online RL agent can dynamically adjust the hyperparameters of each client based on the current training progress. Extensive experiments are conducted to investigate different search strategies and RL agents. The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset as well as two real-world medical image segmentation datasets for COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.
Submitted 31 August, 2022; v1 submitted 11 March, 2022;
originally announced March 2022.
-
Cross-modal Attention for MRI and Ultrasound Volume Registration
Authors:
Xinrui Song,
Hengtao Guo,
Xuanang Xu,
Hanqing Chao,
Sheng Xu,
Baris Turkbey,
Bradford J. Wood,
Ge Wang,
Pingkun Yan
Abstract:
Prostate cancer biopsy benefits from accurate fusion of transrectal ultrasound (TRUS) and magnetic resonance (MR) images. In the past few years, convolutional neural networks (CNNs) have proven powerful in extracting image features crucial for image registration. However, challenging applications and recent advances in computer vision suggest that CNNs are quite limited in their ability to understand spatial correspondence between features, a task in which the self-attention mechanism excels. This paper aims to develop a self-attention mechanism specifically for cross-modal image registration. Our proposed cross-modal attention block effectively maps each of the features in one volume to all features in the corresponding volume. Our experimental results demonstrate that a CNN designed with the cross-modal attention block embedded outperforms an advanced CNN 10 times its size. We also incorporated visualization techniques to improve the interpretability of our network. The source code of our work is available at https://github.com/DIAL-RPI/Attention-Reg.
Submitted 11 July, 2021; v1 submitted 9 July, 2021;
originally announced July 2021.
-
Auto-FedAvg: Learnable Federated Averaging for Multi-Institutional Medical Image Segmentation
Authors:
Yingda Xia,
Dong Yang,
Wenqi Li,
Andriy Myronenko,
Daguang Xu,
Hirofumi Obinata,
Hitoshi Mori,
Peng An,
Stephanie Harmon,
Evrim Turkbey,
Baris Turkbey,
Bradford Wood,
Francesca Patella,
Elvira Stellato,
Gianpaolo Carrafiello,
Anna Ierardi,
Alan Yuille,
Holger Roth
Abstract:
Federated learning (FL) enables collaborative model training while preserving each participant's privacy, which is particularly beneficial to the medical field. FedAvg is a standard algorithm that uses fixed weights, often originating from the dataset sizes at each client, to aggregate the distributed learned models on a server during the FL process. However, non-identical data distribution across clients, known as the non-i.i.d. problem in FL, could make this assumption for setting fixed aggregation weights sub-optimal. In this work, we design a new data-driven approach, namely Auto-FedAvg, where aggregation weights are dynamically adjusted, depending on data distributions across data silos and the current training progress of the models. We disentangle the parameter set into two parts, local model parameters and global aggregation parameters, and update them iteratively with a communication-efficient algorithm. We first show the validity of our approach by outperforming state-of-the-art FL methods for image recognition on a heterogeneous data split of CIFAR-10. Furthermore, we demonstrate our algorithm's effectiveness on two multi-institutional medical image analysis tasks, i.e., COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.
Submitted 20 April, 2021;
originally announced April 2021.
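The Auto-FedAvg abstract above contrasts its learned aggregation weights with the fixed, dataset-size-based weights of standard FedAvg. The sketch below shows only that fixed-weight baseline on toy parameter vectors; it does not implement the learned weighting proposed in the paper. The function name `fedavg`, the flat-list parameter format, and the client sizes are hypothetical.

```python
def fedavg(client_params, client_sizes):
    """Average client parameter vectors, weighting each client by its dataset size."""
    if len(client_params) != len(client_sizes) or not client_params:
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total dataset size must be positive")
    # Fixed aggregation weights proportional to dataset size.
    coeffs = [n / total for n in client_sizes]
    n_params = len(client_params[0])
    return [
        sum(c * params[i] for c, params in zip(coeffs, client_params))
        for i in range(n_params)
    ]


# Three hypothetical clients, each holding a two-parameter model.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [10, 30, 60]
global_model = fedavg(clients, sizes)
print(global_model)
```

With heterogeneous (non-i.i.d.) client data, the largest client dominates this average regardless of how useful its update is, which is the limitation the abstract cites as motivation for adjusting the aggregation weights during training.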
-
Information Bottleneck Attribution for Visual Explanations of Diagnosis and Prognosis
Authors:
Ugur Demir,
Ismail Irmakci,
Elif Keles,
Ahmet Topcu,
Ziyue Xu,
Concetto Spampinato,
Sachin Jambawalikar,
Evrim Turkbey,
Baris Turkbey,
Ulas Bagci
Abstract:
Visual explanation methods have an important role in patient prognosis where annotated data is limited or unavailable. There have been several attempts to use gradient-based attribution methods to localize pathology from medical scans without using segmentation labels. This research direction has been impeded by the lack of robustness and reliability. These methods are highly sensitive to the network parameters. In this study, we introduce a robust visual explanation method to address this problem for medical applications. We provide an innovative general-purpose visual explanation algorithm and, as an example application, demonstrate its effectiveness for quantifying lesions in the lungs caused by COVID-19 with high accuracy and robustness without using dense segmentation labels. This approach overcomes the drawbacks of the commonly used Grad-CAM and its extended versions. The premise behind our proposed strategy is that the information flow is minimized while ensuring the classifier prediction stays similar. Our findings indicate that the bottleneck condition provides a more stable severity estimation than similar attribution methods.
Submitted 22 June, 2021; v1 submitted 6 April, 2021;
originally announced April 2021.
-
Federated Semi-Supervised Learning for COVID Region Segmentation in Chest CT using Multi-National Data from China, Italy, Japan
Authors:
Dong Yang,
Ziyue Xu,
Wenqi Li,
Andriy Myronenko,
Holger R. Roth,
Stephanie Harmon,
Sheng Xu,
Baris Turkbey,
Evrim Turkbey,
Xiaosong Wang,
Wentao Zhu,
Gianpaolo Carrafiello,
Francesca Patella,
Maurizio Cariati,
Hirofumi Obinata,
Hitoshi Mori,
Kaku Tamura,
Peng An,
Bradford J. Wood,
Daguang Xu
Abstract:
The recent outbreak of COVID-19 has led to urgent needs for reliable diagnosis and management of SARS-CoV-2 infection. As a complementary tool, chest CT has been shown to be able to reveal visual patterns characteristic of COVID-19, which has definite value at several stages during the disease course. To facilitate CT analysis, recent efforts have focused on computer-aided characterization and diagnosis, which have shown promising results. However, domain shift of data across clinical data centers poses a serious challenge when deploying learning-based models. In this work, we attempt to find a solution for this challenge via federated and semi-supervised learning. A multi-national database consisting of 1704 scans from three countries is adopted to study the performance gap when training a model with one dataset and applying it to another. Expert radiologists manually delineated 945 scans for COVID-19 findings. In handling the variability in both the data and annotations, a novel federated semi-supervised learning technique is proposed to fully utilize all available data (with or without annotations). Federated learning avoids the need for sensitive data-sharing, which makes it favorable for institutions and nations with strict regulatory policy on data privacy. Moreover, semi-supervision potentially reduces the annotation burden under a distributed setting. The proposed framework is shown to be effective compared to fully supervised scenarios with conventional data sharing instead of model weight sharing.
Submitted 23 November, 2020;
originally announced November 2020.
-
Multi-Domain Image Completion for Random Missing Input Data
Authors:
Liyue Shen,
Wentao Zhu,
Xiaosong Wang,
Lei Xing,
John M. Pauly,
Baris Turkbey,
Stephanie Anne Harmon,
Thomas Hogue Sanford,
Sherif Mehralivand,
Peter Choyke,
Bradford Wood,
Daguang Xu
Abstract:
Multi-domain data are widely leveraged in vision applications, taking advantage of complementary information from different modalities, e.g., brain tumor segmentation from multi-parametric magnetic resonance imaging (MRI). However, due to possible data corruption and different imaging protocols, the availability of images for each domain could vary amongst multiple data sources in practice, which makes it challenging to build a universal model with a varied set of input data. To tackle this problem, we propose a general approach to complete the random missing domain(s) data in real applications. Specifically, we develop a novel multi-domain image completion method that utilizes a generative adversarial network (GAN) with a representational disentanglement scheme to extract shared skeleton encoding and separate flesh encoding across multiple domains. We further illustrate that the learned representation in multi-domain image completion could be leveraged for high-level tasks, e.g., segmentation, by introducing a unified framework consisting of image completion and segmentation with a shared content encoder. The experiments demonstrate consistent performance improvement on three datasets for brain tumor segmentation, prostate segmentation, and facial expression image completion, respectively.
Submitted 10 July, 2020;
originally announced July 2020.
-
Adipose Tissue Segmentation in Unlabeled Abdomen MRI using Cross Modality Domain Adaptation
Authors:
Samira Masoudi,
Syed M. Anwar,
Stephanie A. Harmon,
Peter L. Choyke,
Baris Turkbey,
Ulas Bagci
Abstract:
Abdominal fat quantification is critical since multiple vital organs are located within this region. Although computed tomography (CT) is a highly sensitive modality to segment body fat, it involves ionizing radiation, which makes magnetic resonance imaging (MRI) a preferable alternative for this purpose. Additionally, the superior soft tissue contrast in MRI could lead to more accurate results. Yet, it is highly labor intensive to segment fat in MRI scans. In this study, we propose an algorithm based on deep learning technique(s) to automatically quantify fat tissue from MR images through a cross modality adaptation. Our method does not require supervised labeling of MR scans; instead, we utilize a cycle generative adversarial network (C-GAN) to construct a pipeline that transforms the existing MR scans into their equivalent synthetic CT (s-CT) images, where fat segmentation is relatively easier due to the descriptive nature of Hounsfield units (HU) in CT images. The fat segmentation results for MRI scans were evaluated by an expert radiologist. Qualitative evaluation of our segmentation results shows average success scores of 3.80/5 and 4.54/5 for visceral and subcutaneous fat segmentation in MR images, respectively.
Submitted 11 May, 2020;
originally announced May 2020.
-
When Unseen Domain Generalization is Unnecessary? Rethinking Data Augmentation
Authors:
Ling Zhang,
Xiaosong Wang,
Dong Yang,
Thomas Sanford,
Stephanie Harmon,
Baris Turkbey,
Holger Roth,
Andriy Myronenko,
Daguang Xu,
Ziyue Xu
Abstract:
Recent advances in deep learning for medical image segmentation demonstrate expert-level accuracy. However, in clinically realistic environments, such methods have marginal performance due to differences in image domains, including different imaging protocols, device vendors and patient populations. Here we consider the problem of domain generalization, when a model is trained once, and its performance generalizes to unseen domains. Intuitively, within a specific medical imaging modality the domain differences are smaller relative to the domain variability of natural images. We rethink data augmentation for medical 3D images and propose a deep stacked transformations (DST) approach for domain generalization. Specifically, a series of n stacked transformations are applied to each image in each mini-batch during network training to account for the contribution of domain-specific shifts in medical images. We comprehensively evaluate our method on three tasks: segmentation of whole prostate from 3D MRI, left atrium from 3D MRI, and left ventricle from 3D ultrasound. We demonstrate that when trained on a small source dataset, (i) on average, DST models on unseen datasets degrade only by 11% (Dice score change), compared to the conventional augmentation (degrading 39%) and CycleGAN-based domain adaptation method (degrading 25%); (ii) when evaluated on the same domain, DST is also better, albeit only marginally; and (iii) when trained on large-sized data, DST on unseen domains reaches performance of state-of-the-art fully supervised models. These findings establish a strong benchmark for the study of domain generalization in medical imaging, and can be generalized to the design of robust deep segmentation models for clinical deployment.
Submitted 12 June, 2019; v1 submitted 7 June, 2019;
originally announced June 2019.
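The stacking step described in the abstract above (a series of n transformations applied to each image in each mini-batch) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the three transforms, their parameter ranges, the toy 2D images, and every function name here are assumptions chosen for brevity; the abstract does not list the actual transformations.

```python
import random
from typing import Callable, List

Image = List[List[float]]  # toy 2D image; the paper targets 3D volumes
Transform = Callable[[Image, random.Random], Image]


def shift_brightness(img: Image, rng: random.Random) -> Image:
    """Add one random offset to every pixel (hypothetical range)."""
    offset = rng.uniform(-0.1, 0.1)
    return [[p + offset for p in row] for row in img]


def scale_contrast(img: Image, rng: random.Random) -> Image:
    """Scale pixels about the image mean by a random factor (hypothetical range)."""
    factor = rng.uniform(0.8, 1.2)
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [[mean + factor * (p - mean) for p in row] for row in img]


def flip_horizontal(img: Image, rng: random.Random) -> Image:
    """Mirror each row with probability 0.5."""
    return [row[::-1] for row in img] if rng.random() < 0.5 else img


def apply_stacked(img: Image, transforms: List[Transform], rng: random.Random) -> Image:
    """Apply every transform in order, feeding each output into the next."""
    for transform in transforms:
        img = transform(img, rng)
    return img


def augment_batch(batch: List[Image], transforms: List[Transform], seed: int) -> List[Image]:
    """Apply the same stack, with fresh random parameters, to each image in a mini-batch."""
    rng = random.Random(seed)
    return [apply_stacked(img, transforms, rng) for img in batch]


# Hypothetical stack of n = 3 transformations and a toy mini-batch of two images.
STACK = [shift_brightness, scale_contrast, flip_horizontal]
batch = [[[0.0, 0.5], [1.0, 0.25]], [[0.2, 0.4], [0.6, 0.8]]]
augmented = augment_batch(batch, STACK, seed=0)
```

Passing an explicit seed keeps the sketch reproducible; in real training one would normally draw new parameters for every mini-batch.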
-
A Collaborative Computer Aided Diagnosis (C-CAD) System with Eye-Tracking, Sparse Attentional Model, and Deep Learning
Authors:
Naji Khosravan,
Haydar Celik,
Baris Turkbey,
Elizabeth Jones,
Bradford Wood,
Ulas Bagci
Abstract:
There are at least two categories of errors in radiology screening that can lead to suboptimal diagnostic decisions and interventions: (i) human fallibility and (ii) complexity of visual search. Computer aided diagnostic (CAD) tools are developed to help radiologists compensate for some of these errors. However, despite their significant improvements over conventional screening strategies, most CAD systems do not go beyond their use as second opinion tools because they produce a high number of false positives, which human interpreters need to correct. In parallel with efforts in computerized analysis of radiology scans, several researchers have examined behaviors of radiologists while screening medical images to better understand how and why they miss tumors, how they interact with the information in an image, and how they search for unknown pathology in the images. Eye-tracking tools have been instrumental in exploring answers to these fundamental questions. In this paper, we aim to develop a paradigm-shifting CAD system, called collaborative CAD (C-CAD), that unifies both of the above-mentioned research lines: CAD and eye-tracking. We design an eye-tracking interface providing radiologists with a real radiology reading room experience. Then, we propose a novel algorithm that unifies eye-tracking data and a CAD system. Specifically, we present a new graph-based clustering and sparsification algorithm to transform eye-tracking data (gaze) into a signal model to interpret gaze patterns quantitatively and qualitatively. The proposed C-CAD collaborates with radiologists via eye-tracking technology and helps them to improve diagnostic decisions. The C-CAD learns radiologists' search efficiency by processing their gaze patterns. To do this, the C-CAD uses a deep learning algorithm in a newly designed multi-task learning platform to segment and diagnose cancers simultaneously.
Submitted 28 April, 2018; v1 submitted 17 February, 2018;
originally announced February 2018.
-
Deeply-Supervised CNN for Prostate Segmentation
Authors:
Qikui Zhu,
Bo Du,
Baris Turkbey,
Peter L. Choyke,
Pingkun Yan
Abstract:
Prostate segmentation from Magnetic Resonance (MR) images plays an important role in image-guided intervention. However, the lack of a clear boundary, specifically at the apex and base, and the huge variation of shape and texture between the images from different patients make the task very challenging. To overcome these problems, in this paper, we propose a deeply supervised convolutional neural network (CNN) utilizing the convolutional information to accurately segment the prostate from MR images. The proposed model can effectively detect the prostate region with additional deeply supervised layers compared with other approaches. Since some information will be abandoned after convolution, it is necessary to pass the features extracted from early stages to later stages. The experimental results show that significant segmentation accuracy improvement has been achieved by our proposed method compared to other reported approaches.
Submitted 28 March, 2017; v1 submitted 22 March, 2017;
originally announced March 2017.
-
Gaze2Segment: A Pilot Study for Integrating Eye-Tracking Technology into Medical Image Segmentation
Authors:
Naji Khosravan,
Haydar Celik,
Baris Turkbey,
Ruida Cheng,
Evan McCreedy,
Matthew McAuliffe,
Sandra Bednarova,
Elizabeth Jones,
Xinjian Chen,
Peter L. Choyke,
Bradford J. Wood,
Ulas Bagci
Abstract:
This study introduced a novel system, called Gaze2Segment, integrating biological and computer vision techniques to support radiologists' reading experience with an automatic image segmentation task. During diagnostic assessment of lung CT scans, the radiologists' gaze information was used to create a visual attention map. This map was then combined with a computer-derived saliency map, extracted from the gray-scale CT images. The visual attention map was used as an input for roughly indicating the location of an object of interest. With computer-derived saliency information, on the other hand, we aimed at finding foreground and background cues for the object of interest. At the final step, these cues were used to initiate a seed-based delineation process. Segmentation accuracy of the proposed Gaze2Segment was found to be 86% with Dice similarity coefficient and 1.45 mm with Hausdorff distance. To the best of our knowledge, Gaze2Segment is the first true integration of eye-tracking technology into a medical image segmentation task without the need for any further user interaction.
Submitted 10 August, 2016;
originally announced August 2016.
-
Deep convolutional networks for pancreas segmentation in CT imaging
Authors:
Holger R. Roth,
Amal Farag,
Le Lu,
Evrim B. Turkbey,
Ronald M. Summers
Abstract:
Automatic organ segmentation is an important prerequisite for many computer-aided diagnosis systems. The high anatomical variability of organs in the abdomen, such as the pancreas, prevents many segmentation methods from achieving high accuracies when compared to segmentation of other organs such as the liver, heart or kidneys. Recently, the availability of large annotated training sets and the accessibility of affordable parallel computing resources via GPUs have made it feasible for "deep learning" methods such as convolutional networks (ConvNets) to succeed in image classification tasks. These methods have the advantage that the classification features used are trained directly from the imaging data. We present a fully-automated bottom-up method for pancreas segmentation in computed tomography (CT) images of the abdomen. The method is based on hierarchical coarse-to-fine classification of local image regions (superpixels). Superpixels are extracted from the abdominal region using Simple Linear Iterative Clustering (SLIC). An initial probability response map is generated, using patch-level confidences and a two-level cascade of random forest classifiers, from which superpixel regions with probabilities larger than 0.5 are retained. These retained superpixels serve as a highly sensitive initial input of the pancreas and its surroundings to a ConvNet that samples a bounding box around each superpixel at different scales (and random non-rigid deformations at training time) in order to assign a more distinct probability of each superpixel region being pancreas or not. We evaluate our method on CT images of 82 patients (60 for training, 2 for validation, and 20 for testing). Using ConvNets we achieve average Dice scores of 68% ± 10% (range, 43-80%) in testing. This shows promise for accurate pancreas segmentation using a deep learning approach and compares favorably to state-of-the-art methods.
Submitted 15 April, 2015;
originally announced April 2015.
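Several abstracts in this listing report Dice scores. For reference, the Dice similarity coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The snippet below is a minimal sketch of that formula on flattened binary masks; the function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made here, not details taken from any of the listed papers.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled foreground in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy example: 3 predicted voxels, 3 reference voxels, 2 shared.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
score = dice_score(pred, truth)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground voxels.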
-
2D View Aggregation for Lymph Node Detection Using a Shallow Hierarchy of Linear Classifiers
Authors:
Ari Seff,
Le Lu,
Kevin M. Cherry,
Holger Roth,
Jiamin Liu,
Shijun Wang,
Joanne Hoffman,
Evrim B. Turkbey,
Ronald M. Summers
Abstract:
Enlarged lymph nodes (LNs) can provide important information for cancer diagnosis, staging, and measuring treatment reactions, making automated detection a highly sought goal. In this paper, we propose a new algorithm representation of decomposing the LN detection problem into a set of 2D object detection subtasks on sampled CT slices, largely alleviating the curse of dimensionality issue. Our 2D detection can be effectively formulated as linear classification on a single image feature type of Histogram of Oriented Gradients (HOG), covering a moderate field-of-view of 45 by 45 voxels. We exploit both simple pooling and sparse linear fusion schemes to aggregate these 2D detection scores for the final 3D LN detection. In this manner, detection is more tractable and does not need to perform perfectly at instance level (as weak hypotheses) since our aggregation process will robustly harness collective information for LN detection. Two datasets (90 patients with 389 mediastinal LNs and 86 patients with 595 abdominal LNs) are used for validation. Cross-validation demonstrates 78.0% sensitivity at 6 false positives/volume (FP/vol.) (86.1% at 10 FP/vol.) and 73.1% sensitivity at 6 FP/vol. (87.2% at 10 FP/vol.), for the mediastinal and abdominal datasets respectively. Our results compare favorably to previous state-of-the-art methods.
Submitted 14 August, 2014;
originally announced August 2014.