Search | arXiv e-print repository

arXiv:2408.04826 [pdf, other]

Geo-UNet: A Geometrically Constrained Neural Framework for Clinical-Grade Lumen Segmentation in Intravascular Ultrasound

Authors: Yiming Chen, Niharika S. D'Souza, Akshith Mandepally, Patrick Henninger, Satyananda Kashyap, Neerav Karani, Neel Dey, Marcos Zachary, Raed Rizq, Paul Chouinard, Polina Golland, Tanveer F. Syeda-Mahmood

Abstract: Precisely estimating lumen boundaries in intravascular ultrasound (IVUS) is needed for sizing interventional stents to treat deep vein thrombosis (DVT). Unfortunately, current segmentation networks like the UNet lack the precision needed for clinical adoption in IVUS workflows. This arises due to the difficulty of automatically learning accurate lumen contour from limited training data while accou… ▽ More Precisely estimating lumen boundaries in intravascular ultrasound (IVUS) is needed for sizing interventional stents to treat deep vein thrombosis (DVT). Unfortunately, current segmentation networks like the UNet lack the precision needed for clinical adoption in IVUS workflows. This arises due to the difficulty of automatically learning accurate lumen contour from limited training data while accounting for the radial geometry of IVUS imaging. We propose the Geo-UNet framework to address these issues via a design informed by the geometry of the lumen contour segmentation task. We first convert the input data and segmentation targets from Cartesian to polar coordinates. Starting from a convUNet feature extractor, we propose a two-task setup, one for conventional pixel-wise labeling and the other for single boundary lumen-contour localization. We directly combine the two predictions by passing the predicted lumen contour through a new activation (named CDFeLU) to filter out spurious pixel-wise predictions. Our unified loss function carefully balances area-based, distance-based, and contour-based penalties to provide near clinical-grade generalization in unseen patient data. We also introduce a lightweight, inference-time technique to enhance segmentation smoothness. The efficacy of our framework on a venous IVUS dataset is shown against state-of-the-art models. △ Less

Submitted 8 August, 2024; originally announced August 2024.

Comments: Accepted into the 15th workshop on Machine Learning in Medical Imaging at MICCAI 2024. (* indicates equal contribution)

arXiv:2311.13016 [pdf, other]

doi 10.1109/IGARSS52108.2023.10281551

Image-Based Soil Organic Carbon Remote Sensing from Satellite Images with Fourier Neural Operator and Structural Similarity

Authors: Ken C. L. Wong, Levente Klein, Ademir Ferreira da Silva, Hongzhi Wang, Jitendra Singh, Tanveer Syeda-Mahmood

Abstract: Soil organic carbon (SOC) sequestration is the transfer and storage of atmospheric carbon dioxide in soils, which plays an important role in climate change mitigation. SOC concentration can be improved by proper land use, thus it is beneficial if SOC can be estimated at a regional or global scale. As multispectral satellite data can provide SOC-related information such as vegetation and soil prope… ▽ More Soil organic carbon (SOC) sequestration is the transfer and storage of atmospheric carbon dioxide in soils, which plays an important role in climate change mitigation. SOC concentration can be improved by proper land use, thus it is beneficial if SOC can be estimated at a regional or global scale. As multispectral satellite data can provide SOC-related information such as vegetation and soil properties at a global scale, estimation of SOC through satellite data has been explored as an alternative to manual soil sampling. Although existing studies show promising results, they are mainly based on pixel-based approaches with traditional machine learning methods, and convolutional neural networks (CNNs) are uncommon. To study the use of CNNs on SOC remote sensing, here we propose the FNO-DenseNet based on the Fourier neural operator (FNO). By combining the advantages of the FNO and DenseNet, the FNO-DenseNet outperformed the FNO in our experiments with hundreds of times fewer parameters. The FNO-DenseNet also outperformed a pixel-based random forest by 18% in the mean absolute percentage error. △ Less

Submitted 21 November, 2023; originally announced November 2023.

Comments: This paper was accepted by the 2023 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2023)

arXiv:2311.02332 [pdf, other]

Multimodal Machine Learning in Image-Based and Clinical Biomedicine: Survey and Prospects

Authors: Elisa Warner, Joonsang Lee, William Hsu, Tanveer Syeda-Mahmood, Charles Kahn, Olivier Gevaert, Arvind Rao

Abstract: Machine learning (ML) applications in medical artificial intelligence (AI) systems have shifted from traditional and statistical methods to increasing application of deep learning models. This survey navigates the current landscape of multimodal ML, focusing on its profound impact on medical image analysis and clinical decision support systems. Emphasizing challenges and innovations in addressing… ▽ More Machine learning (ML) applications in medical artificial intelligence (AI) systems have shifted from traditional and statistical methods to increasing application of deep learning models. This survey navigates the current landscape of multimodal ML, focusing on its profound impact on medical image analysis and clinical decision support systems. Emphasizing challenges and innovations in addressing multimodal representation, fusion, translation, alignment, and co-learning, the paper explores the transformative potential of multimodal models for clinical predictions. It also highlights the need for principled assessments and practical implementation of such models, bringing attention to the dynamics between decision support systems and healthcare providers and personnel. Despite advancements, challenges such as data biases and the scarcity of "big data" in many biomedical domains persist. We conclude with a discussion on principled innovation and collaborative efforts to further the mission of seamless integration of multimodal ML models into biomedical practice. △ Less

Submitted 19 January, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

arXiv:2310.04466 [pdf, other]

doi 10.1007/978-3-031-43901-8_35

HartleyMHA: Self-Attention in Frequency Domain for Resolution-Robust and Parameter-Efficient 3D Image Segmentation

Authors: Ken C. L. Wong, Hongzhi Wang, Tanveer Syeda-Mahmood

Abstract: With the introduction of Transformers, different attention-based models have been proposed for image segmentation with promising results. Although self-attention allows capturing of long-range dependencies, it suffers from a quadratic complexity in the image size especially in 3D. To avoid the out-of-memory error during training, input size reduction is usually required for 3D segmentation, but th… ▽ More With the introduction of Transformers, different attention-based models have been proposed for image segmentation with promising results. Although self-attention allows capturing of long-range dependencies, it suffers from a quadratic complexity in the image size especially in 3D. To avoid the out-of-memory error during training, input size reduction is usually required for 3D segmentation, but the accuracy can be suboptimal when the trained models are applied on the original image size. To address this limitation, inspired by the Fourier neural operator (FNO), we introduce the HartleyMHA model which is robust to training image resolution with efficient self-attention. FNO is a deep learning framework for learning mappings between functions in partial differential equations, which has the appealing properties of zero-shot super-resolution and global receptive field. We modify the FNO by using the Hartley transform with shared parameters to reduce the model size by orders of magnitude, and this allows us to further apply self-attention in the frequency domain for more expressive high-order feature combination with improved efficiency. When tested on the BraTS'19 dataset, it achieved superior robustness to training image resolution than other tested models with less than 1% of their model parameters. △ Less

Submitted 5 October, 2023; originally announced October 2023.

Comments: This paper was accepted by the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2023). arXiv admin note: text overlap with arXiv:2310.03872

arXiv:2310.03872 [pdf, other]

doi 10.1109/ISBI53787.2023.10230586

FNOSeg3D: Resolution-Robust 3D Image Segmentation with Fourier Neural Operator

Authors: Ken C. L. Wong, Hongzhi Wang, Tanveer Syeda-Mahmood

Abstract: Due to the computational complexity of 3D medical image segmentation, training with downsampled images is a common remedy for out-of-memory errors in deep learning. Nevertheless, as standard spatial convolution is sensitive to variations in image resolution, the accuracy of a convolutional neural network trained with downsampled images can be suboptimal when applied on the original resolution. To… ▽ More Due to the computational complexity of 3D medical image segmentation, training with downsampled images is a common remedy for out-of-memory errors in deep learning. Nevertheless, as standard spatial convolution is sensitive to variations in image resolution, the accuracy of a convolutional neural network trained with downsampled images can be suboptimal when applied on the original resolution. To address this limitation, we introduce FNOSeg3D, a 3D segmentation model robust to training image resolution based on the Fourier neural operator (FNO). The FNO is a deep learning framework for learning mappings between functions in partial differential equations, which has the appealing properties of zero-shot super-resolution and global receptive field. We improve the FNO by reducing its parameter requirement and enhancing its learning capability through residual connections and deep supervision, and these result in our FNOSeg3D model which is parameter efficient and resolution robust. When tested on the BraTS'19 dataset, it achieved superior robustness to training image resolution than other tested models with less than 1% of their model parameters. △ Less

Submitted 5 October, 2023; originally announced October 2023.

Comments: This paper was accepted by the IEEE International Symposium on Biomedical Imaging (ISBI) 2023

arXiv:2307.07093 [pdf, other]

MaxCorrMGNN: A Multi-Graph Neural Network Framework for Generalized Multimodal Fusion of Medical Data for Outcome Prediction

Authors: Niharika S. D'Souza, Hongzhi Wang, Andrea Giovannini, Antonio Foncubierta-Rodriguez, Kristen L. Beck, Orest Boyko, Tanveer Syeda-Mahmood

Abstract: With the emergence of multimodal electronic health records, the evidence for an outcome may be captured across multiple modalities ranging from clinical to imaging and genomic data. Predicting outcomes effectively requires fusion frameworks capable of modeling fine-grained and multi-faceted complex interactions between modality features within and across patients. We develop an innovative fusion a… ▽ More With the emergence of multimodal electronic health records, the evidence for an outcome may be captured across multiple modalities ranging from clinical to imaging and genomic data. Predicting outcomes effectively requires fusion frameworks capable of modeling fine-grained and multi-faceted complex interactions between modality features within and across patients. We develop an innovative fusion approach called MaxCorr MGNN that models non-linear modality correlations within and across patients through Hirschfeld-Gebelein-Renyi maximal correlation (MaxCorr) embeddings, resulting in a multi-layered graph that preserves the identities of the modalities and patients. We then design, for the first time, a generalized multi-layered graph neural network (MGNN) for task-informed reasoning in multi-layered graphs, that learns the parameters defining patient-modality graph connectivity and message passing in an end-to-end fashion. We evaluate our model an outcome prediction task on a Tuberculosis (TB) dataset consistently outperforming several state-of-the-art neural, graph-based and traditional fusion techniques. △ Less

Submitted 13 July, 2023; originally announced July 2023.

Comments: To appear in ML4MHD workshop at ICML 2023

arXiv:2302.13069 [pdf]

Medical visual question answering using joint self-supervised learning

Authors: Yuan Zhou, Jing Mei, Yiqin Yu, Tanveer Syeda-Mahmood

Abstract: Visual Question Answering (VQA) becomes one of the most active research problems in the medical imaging domain. A well-known VQA challenge is the intrinsic diversity between the image and text modalities, and in the medical VQA task, there is another critical problem relying on the limited size of labelled image-question-answer data. In this study we propose an encoder-decoder framework that lever… ▽ More Visual Question Answering (VQA) becomes one of the most active research problems in the medical imaging domain. A well-known VQA challenge is the intrinsic diversity between the image and text modalities, and in the medical VQA task, there is another critical problem relying on the limited size of labelled image-question-answer data. In this study we propose an encoder-decoder framework that leverages the image-text joint representation learned from large-scaled medical image-caption data and adapted to the small-sized medical VQA task. The encoder embeds across the image-text dual modalities with self-attention mechanism and is independently pre-trained on the large-scaled medical image-caption dataset by multiple self-supervised learning tasks. Then the decoder is connected to the top of the encoder and fine-tuned using the small-sized medical VQA dataset. The experiment results present that our proposed method achieves better performance comparing with the baseline and SOTA methods. △ Less

Submitted 25 February, 2023; originally announced February 2023.

arXiv:2301.02916 [pdf, other]

Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology

Authors: Rodrigo Bonazzola, Enzo Ferrante, Nishant Ravikumar, Yan Xia, Bernard Keavney, Sven Plein, Tanveer Syeda-Mahmood, Alejandro F Frangi

Abstract: Recent genome-wide association studies (GWAS) have been successful in identifying associations between genetic variants and simple cardiac parameters derived from cardiac magnetic resonance (CMR) images. However, the emergence of big databases including genetic data linked to CMR, facilitates investigation of more nuanced patterns of shape variability. Here, we propose a new framework for gene dis… ▽ More Recent genome-wide association studies (GWAS) have been successful in identifying associations between genetic variants and simple cardiac parameters derived from cardiac magnetic resonance (CMR) images. However, the emergence of big databases including genetic data linked to CMR, facilitates investigation of more nuanced patterns of shape variability. Here, we propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles (UPE). UPE builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner, using deep learning models trained with different hyperparameters. These phenotypes are then analyzed via (GWAS), retaining only highly confident and stable associations across the ensemble. We apply our approach to the UK Biobank database to extract left-ventricular (LV) geometric features from image-derived three-dimensional meshes. We demonstrate that our approach greatly improves the discoverability of genes influencing LV shape, identifying 11 loci with study-wide significance and 8 with suggestive significance. We argue that our approach would enable more extensive discovery of gene associations with image-derived phenotypes for other organs or image modalities. △ Less

Submitted 7 January, 2023; originally announced January 2023.

Comments: 14 pages of main text, 22 pages of supplemental information

arXiv:2211.11749 [pdf]

Towards Automatic Prediction of Outcome in Treatment of Cerebral Aneurysms

Authors: Ashutosh Jadhav, Satyananda Kashyap, Hakan Bulu, Ronak Dholakia, Amon Y. Liu, Tanveer Syeda-Mahmood, William R. Patterson, Hussain Rangwala, Mehdi Moradi

Abstract: Intrasaccular flow disruptors treat cerebral aneurysms by diverting the blood flow from the aneurysm sac. Residual flow into the sac after the intervention is a failure that could be due to the use of an undersized device, or to vascular anatomy and clinical condition of the patient. We report a machine learning model based on over 100 clinical and imaging features that predict the outcome of wide… ▽ More Intrasaccular flow disruptors treat cerebral aneurysms by diverting the blood flow from the aneurysm sac. Residual flow into the sac after the intervention is a failure that could be due to the use of an undersized device, or to vascular anatomy and clinical condition of the patient. We report a machine learning model based on over 100 clinical and imaging features that predict the outcome of wide-neck bifurcation aneurysm treatment with an intravascular embolization device. We combine clinical features with a diverse set of common and novel imaging measurements within a random forest model. We also develop neural network segmentation algorithms in 2D and 3D to contour the sac in angiographic images and automatically calculate the imaging features. These deliver 90% overlap with manual contouring in 2D and 83% in 3D. Our predictive model classifies complete vs. partial occlusion outcomes with an accuracy of 75.31%, and weighted F1-score of 0.74. △ Less

Submitted 18 November, 2022; originally announced November 2022.

Comments: 10 pages

Report number: https://s4.goeshow.com/amia/annual/2022/schedule_at_a_glance.cfm?session_key=1965BCBD-A832-92DD-9D05-FB2CB132FADB&session_date=

Journal ref: AMAI 2022 Annual Symposium

arXiv:2210.14377 [pdf, other]

doi 10.1007/978-3-031-16449-1_28

Fusing Modalities by Multiplexed Graph Neural Networks for Outcome Prediction in Tuberculosis

Authors: Niharika S. D'Souza, Hongzhi Wang, Andrea Giovannini, Antonio Foncubierta-Rodriguez, Kristen L. Beck, Orest Boyko, Tanveer Syeda-Mahmood

Abstract: In a complex disease such as tuberculosis, the evidence for the disease and its evolution may be present in multiple modalities such as clinical, genomic, or imaging data. Effective patient-tailored outcome prediction and therapeutic guidance will require fusing evidence from these modalities. Such multimodal fusion is difficult since the evidence for the disease may not be uniform across all moda… ▽ More In a complex disease such as tuberculosis, the evidence for the disease and its evolution may be present in multiple modalities such as clinical, genomic, or imaging data. Effective patient-tailored outcome prediction and therapeutic guidance will require fusing evidence from these modalities. Such multimodal fusion is difficult since the evidence for the disease may not be uniform across all modalities, not all modality features may be relevant, or not all modalities may be present for all patients. All these nuances make simple methods of early, late, or intermediate fusion of features inadequate for outcome prediction. In this paper, we present a novel fusion framework using multiplexed graphs and derive a new graph neural network for learning from such graphs. Specifically, the framework allows modalities to be represented through their targeted encodings, and models their relationship explicitly via multiplexed graphs derived from salient features in a combined latent space. We present results that show that our proposed method outperforms state-of-the-art methods of fusing modalities for multi-outcome prediction on a large Tuberculosis (TB) dataset. △ Less

Submitted 25 October, 2022; originally announced October 2022.

Comments: Accepted into MICCAI 2022

arXiv:2204.09676 [pdf, other]

Spatially-Preserving Flattening for Location-Aware Classification of Findings in Chest X-Rays

Authors: Neha Srivathsa, Razi Mahmood, Tanveer Syeda-Mahmood

Abstract: Chest X-rays have become the focus of vigorous deep learning research in recent years due to the availability of large labeled datasets. While classification of anomalous findings is now possible, ensuring that they are correctly localized still remains challenging, as this requires recognition of anomalies within anatomical regions. Existing deep learning networks for fine-grained anomaly classif… ▽ More Chest X-rays have become the focus of vigorous deep learning research in recent years due to the availability of large labeled datasets. While classification of anomalous findings is now possible, ensuring that they are correctly localized still remains challenging, as this requires recognition of anomalies within anatomical regions. Existing deep learning networks for fine-grained anomaly classification learn location-specific findings using architectures where the location and spatial contiguity information is lost during the flattening step before classification. In this paper, we present a new spatially preserving deep learning network that preserves location and shape information through auto-encoding of feature maps during flattening. The feature maps, auto-encoder and classifier are then trained in an end-to-end fashion to enable location aware classification of findings in chest X-rays. Results are shown on a large multi-hospital chest X-ray dataset indicating a significant improvement in the quality of finding classification over state-of-the-art methods. △ Less

Submitted 19 April, 2022; originally announced April 2022.

Comments: 5 pages, Paper presented as a poster at the IEEE International Symposium on Biomedical Imaging, 2022, Paper Number 828

Journal ref: 2022 IEEE International Symposium on Biomedical Imaging (ISBI)

arXiv:2112.05254 [pdf, other]

Addressing Deep Learning Model Uncertainty in Long-Range Climate Forecasting with Late Fusion

Authors: Ken C. L. Wong, Hongzhi Wang, Etienne E. Vos, Bianca Zadrozny, Campbell D. Watson, Tanveer Syeda-Mahmood

Abstract: Global warming leads to the increase in frequency and intensity of climate extremes that cause tremendous loss of lives and property. Accurate long-range climate prediction allows more time for preparation and disaster risk management for such extreme events. Although machine learning approaches have shown promising results in long-range climate forecasting, the associated model uncertainties may… ▽ More Global warming leads to the increase in frequency and intensity of climate extremes that cause tremendous loss of lives and property. Accurate long-range climate prediction allows more time for preparation and disaster risk management for such extreme events. Although machine learning approaches have shown promising results in long-range climate forecasting, the associated model uncertainties may reduce their reliability. To address this issue, we propose a late fusion approach that systematically combines the predictions from multiple models to reduce the expected errors of the fused results. We also propose a network architecture with the novel denormalization layer to gain the benefits of data normalization without actually normalizing the data. The experimental results on long-range 2m temperature forecasting show that the framework outperforms the 30-year climate normals, and the accuracy can be improved by increasing the number of models. △ Less

Submitted 9 December, 2021; originally announced December 2021.

Comments: Accepted by the NeurIPS 2021 Workshop on Tackling Climate Change with Machine Learning

arXiv:2111.13987 [pdf, other]

Multi-modality fusion using canonical correlation analysis methods: Application in breast cancer survival prediction from histology and genomics

Authors: Vaishnavi Subramanian, Tanveer Syeda-Mahmood, Minh N. Do

Abstract: The availability of multi-modality datasets provides a unique opportunity to characterize the same object of interest using multiple viewpoints more comprehensively. In this work, we investigate the use of canonical correlation analysis (CCA) and penalized variants of CCA (pCCA) for the fusion of two modalities. We study a simple graphical model for the generation of two-modality data. We analytic… ▽ More The availability of multi-modality datasets provides a unique opportunity to characterize the same object of interest using multiple viewpoints more comprehensively. In this work, we investigate the use of canonical correlation analysis (CCA) and penalized variants of CCA (pCCA) for the fusion of two modalities. We study a simple graphical model for the generation of two-modality data. We analytically show that, with known model parameters, posterior mean estimators that jointly use both modalities outperform arbitrary linear mixing of single modality posterior estimators in latent variable prediction. Penalized extensions of CCA (pCCA) that incorporate domain knowledge can discover correlations with high-dimensional, low-sample data, whereas traditional CCA is inapplicable. To facilitate the generation of multi-dimensional embeddings with pCCA, we propose two matrix deflation schemes that enforce desirable properties exhibited by CCA. We propose a two-stage prediction pipeline using pCCA embeddings generated with deflation for latent variable prediction by combining all the above. On simulated data, our proposed model drastically reduces the mean-squared error in latent variable prediction. When applied to publicly available histopathology data and RNA-sequencing data from The Cancer Genome Atlas (TCGA) breast cancer patients, our model can outperform principal components analysis (PCA) embeddings of the same dimension in survival prediction. △ Less

Submitted 27 November, 2021; originally announced November 2021.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2103.12245 [pdf, other]

doi 10.1117/12.2582191

Multiview and Multiclass Image Segmentation using Deep Learning in Fetal Echocardiography

Authors: Ken C. L. Wong, Elena S. Sinkovskaya, Alfred Z. Abuhamad, Tanveer Syeda-Mahmood

Abstract: Congenital heart disease (CHD) is the most common congenital abnormality associated with birth defects in the United States. Despite training efforts and substantial advancement in ultrasound technology over the past years, CHD remains an abnormality that is frequently missed during prenatal ultrasonography. Therefore, computer-aided detection of CHD can play a critical role in prenatal care by im… ▽ More Congenital heart disease (CHD) is the most common congenital abnormality associated with birth defects in the United States. Despite training efforts and substantial advancement in ultrasound technology over the past years, CHD remains an abnormality that is frequently missed during prenatal ultrasonography. Therefore, computer-aided detection of CHD can play a critical role in prenatal care by improving screening and diagnosis. Since many CHDs involve structural abnormalities, automatic segmentation of anatomical structures is an important step in the analysis of fetal echocardiograms. While existing methods mainly focus on the four-chamber view with a small number of structures, here we present a more comprehensive deep learning segmentation framework covering 14 anatomical structures in both three-vessel trachea and four-chamber views. Specifically, our framework enhances the V-Net with spatial dropout, group normalization, and deep supervision to train a segmentation model that can be applied on both views regardless of abnormalities. By identifying the pitfall of using the Dice loss when some labels are unavailable in some images, this framework integrates information from multiple views and is robust to missing structures due to anatomical anomalies, achieving an average Dice score of 79%. △ Less

Submitted 22 March, 2021; originally announced March 2021.

Comments: This paper was accepted by SPIE Medical Imaging 2021

arXiv:2103.05432 [pdf, other]

Multimodal fusion using sparse CCA for breast cancer survival prediction

Authors: Vaishnavi Subramanian, Tanveer Syeda-Mahmood, Minh N. Do

Abstract: Effective understanding of a disease such as cancer requires fusing multiple sources of information captured across physical scales by multimodal data. In this work, we propose a novel feature embedding module that derives from canonical correlation analyses to account for intra-modality and inter-modality correlations. Experiments on simulated and real data demonstrate how our proposed module can… ▽ More Effective understanding of a disease such as cancer requires fusing multiple sources of information captured across physical scales by multimodal data. In this work, we propose a novel feature embedding module that derives from canonical correlation analyses to account for intra-modality and inter-modality correlations. Experiments on simulated and real data demonstrate how our proposed module can learn well-correlated multi-dimensional embeddings. These embeddings perform competitively on one-year survival classification of TCGA-BRCA breast cancer patients, yielding average F1 scores up to 58.69% under 5-fold cross-validation. △ Less

Submitted 9 March, 2021; originally announced March 2021.

Comments: Accepted for poster presentation at International Symposium on Biomedical Imaging (ISBI) 2021. 4 pages, 1 figure, 4 tables

arXiv:2011.09517 [pdf]

Extracting and Learning Fine-Grained Labels from Chest Radiographs

Authors: Tanveer Syeda-Mahmood, Ph. D, K. C. L Wong, Ph. D, Joy T. Wu, M. D., M. P. H, Ashutosh Jadhav, Ph. D, Orest Boyko, M. D. Ph. D

Abstract: Chest radiographs are the most common diagnostic exam in emergency rooms and intensive care units today. Recently, a number of researchers have begun working on large chest X-ray datasets to develop deep learning models for recognition of a handful of coarse finding classes such as opacities, masses and nodules. In this paper, we focus on extracting and learning fine-grained labels for chest X-ray… ▽ More Chest radiographs are the most common diagnostic exam in emergency rooms and intensive care units today. Recently, a number of researchers have begun working on large chest X-ray datasets to develop deep learning models for recognition of a handful of coarse finding classes such as opacities, masses and nodules. In this paper, we focus on extracting and learning fine-grained labels for chest X-ray images. Specifically we develop a new method of extracting fine-grained labels from radiology reports by combining vocabulary-driven concept extraction with phrasal grouping in dependency parse trees for association of modifiers with findings. A total of 457 fine-grained labels depicting the largest spectrum of findings to date were selected and sufficiently large datasets acquired to train a new deep learning model designed for fine-grained classification. We show results that indicate a highly accurate label extraction process and a reliable learning of fine-grained labels. The resulting network, to our knowledge, is the first to recognize fine-grained descriptions of findings in images covering over nine modifiers including laterality, location, severity, size and appearance. △ Less

Submitted 18 November, 2020; originally announced November 2020.

Comments: This paper won the Homer R. Warner Award at AMIA 2020 awarded to a paper that best describes approaches to improving computerized information acquisition, knowledge data acquisition and management, and experimental results documenting the value of these approaches. The paper shows a combination of textual and visual processing to automatically recognize complex findings in chest X-rays

arXiv:2009.06082 [pdf]

doi 10.5220/0008984901780186

Receptivity of an AI Cognitive Assistant by the Radiology Community: A Report on Data Collected at RSNA

Authors: Karina Kanjaria, Anup Pillai, Chaitanya Shivade, Marina Bendersky, Ashutosh Jadhav, Vandana Mukherjee, Tanveer Syeda-Mahmood

Abstract: Due to advances in machine learning and artificial intelligence (AI), a new role is emerging for machines as intelligent assistants to radiologists in their clinical workflows. But what systematic clinical thought processes are these machines using? Are they similar enough to those of radiologists to be trusted as assistants? A live demonstration of such a technology was conducted at the 2016 Scie… ▽ More Due to advances in machine learning and artificial intelligence (AI), a new role is emerging for machines as intelligent assistants to radiologists in their clinical workflows. But what systematic clinical thought processes are these machines using? Are they similar enough to those of radiologists to be trusted as assistants? A live demonstration of such a technology was conducted at the 2016 Scientific Assembly and Annual Meeting of the Radiological Society of North America (RSNA). The demonstration was presented in the form of a question-answering system that took a radiology multiple choice question and a medical image as inputs. The AI system then demonstrated a cognitive workflow, involving text analysis, image analysis, and reasoning, to process the question and generate the most probable answer. A post demonstration survey was made available to the participants who experienced the demo and tested the question answering system. Of the reported 54,037 meeting registrants, 2,927 visited the demonstration booth, 1,991 experienced the demo, and 1,025 completed a post-demonstration survey. In this paper, the methodology of the survey is shown and a summary of its results are presented. The results of the survey show a very high level of receptiveness to cognitive computing technology and artificial intelligence among radiologists. △ Less

Submitted 13 September, 2020; originally announced September 2020.

Journal ref: Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: HEALTHINF, ISBN 978-989-758-398-8, pages 178-186. 2020

arXiv:2008.00363 [pdf, other]

doi 10.1109/ISBI45749.2020.9098370

Looking in the Right place for Anomalies: Explainable AI through Automatic Location Learning

Authors: Satyananda Kashyap, Alexandros Karargyris, Joy Wu, Yaniv Gur, Arjun Sharma, Ken C. L. Wong, Mehdi Moradi, Tanveer Syeda-Mahmood

Abstract: Deep learning has now become the de facto approach to the recognition of anomalies in medical imaging. Their 'black box' way of classifying medical images into anomaly labels poses problems for their acceptance, particularly with clinicians. Current explainable AI methods offer justifications through visualizations such as heat maps but cannot guarantee that the network is focusing on the relevant… ▽ More Deep learning has now become the de facto approach to the recognition of anomalies in medical imaging. Their 'black box' way of classifying medical images into anomaly labels poses problems for their acceptance, particularly with clinicians. Current explainable AI methods offer justifications through visualizations such as heat maps but cannot guarantee that the network is focusing on the relevant image region fully containing the anomaly. In this paper, we develop an approach to explainable AI in which the anomaly is assured to be overlapping the expected location when present. This is made possible by automatically extracting location-specific labels from textual reports and learning the association of expected locations to labels using a hybrid combination of Bi-Directional Long Short-Term Memory Recurrent Neural Networks (Bi-LSTM) and DenseNet-121. Use of this expected location to bias the subsequent attention-guided inference network based on ResNet101 results in the isolation of the anomaly at the expected location when present. The method is evaluated on a large chest X-ray dataset. △ Less

Submitted 1 August, 2020; originally announced August 2020.

Comments: 5 pages, Paper presented as a poster at the International Symposium on Biomedical Imaging, 2020, Paper Number 655

Journal ref: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI)

arXiv:2007.13831 [pdf, other]

Chest X-ray Report Generation through Fine-Grained Label Learning

Authors: Tanveer Syeda-Mahmood, Ken C. L. Wong, Yaniv Gur, Joy T. Wu, Ashutosh Jadhav, Satyananda Kashyap, Alexandros Karargyris, Anup Pillai, Arjun Sharma, Ali Bin Syed, Orest Boyko, Mehdi Moradi

Abstract: Obtaining automated preliminary read reports for common exams such as chest X-rays will expedite clinical workflows and improve operational efficiencies in hospitals. However, the quality of reports generated by current automated approaches is not yet clinically acceptable as they cannot ensure the correct detection of a broad spectrum of radiographic findings nor describe them accurately in terms… ▽ More Obtaining automated preliminary read reports for common exams such as chest X-rays will expedite clinical workflows and improve operational efficiencies in hospitals. However, the quality of reports generated by current automated approaches is not yet clinically acceptable as they cannot ensure the correct detection of a broad spectrum of radiographic findings nor describe them accurately in terms of laterality, anatomical location, severity, etc. In this work, we present a domain-aware automatic chest X-ray radiology report generation algorithm that learns fine-grained description of findings from images and uses their pattern of occurrences to retrieve and customize similar reports from a large report database. We also develop an automatic labeling algorithm for assigning such descriptors to images and build a novel deep learning network that recognizes both coarse and fine-grained descriptions of findings. The resulting report generation algorithm significantly outperforms the state of the art using established score metrics. △ Less

Submitted 27 July, 2020; originally announced July 2020.

Comments: 11 pages, 5 figures, to appear in MICCAI 2020 Conference

ACM Class: I.2.1; I.4.9; J.3

arXiv:2004.09673 [pdf, other]

Neural Network Segmentation of Cell Ultrastructure Using Incomplete Annotation

Authors: John Paul Francis, Hongzhi Wang, Kate White, Tanveer Syeda-Mahmood, Raymond Stevens

Abstract: The Pancreatic beta cell is an important target in diabetes research. For scalable modeling of beta cell ultrastructure, we investigate automatic segmentation of whole cell imaging data acquired through soft X-ray tomography. During the course of the study, both complete and partial ultrastructure annotations were produced manually for different subsets of the data. To more effectively use existin… ▽ More The Pancreatic beta cell is an important target in diabetes research. For scalable modeling of beta cell ultrastructure, we investigate automatic segmentation of whole cell imaging data acquired through soft X-ray tomography. During the course of the study, both complete and partial ultrastructure annotations were produced manually for different subsets of the data. To more effectively use existing annotations, we propose a method that enables the application of partially labeled data for full label segmentation. For experimental validation, we apply our method to train a convolutional neural network with a set of 12 fully annotated data and 12 partially annotated data and show promising improvement over standard training that uses fully annotated data alone. △ Less

Submitted 20 April, 2020; originally announced April 2020.

arXiv:2002.01982 [pdf, other]

Multimodal fusion of imaging and genomics for lung cancer recurrence prediction

Authors: Vaishnavi Subramanian, Minh N. Do, Tanveer Syeda-Mahmood

Abstract: Lung cancer has a high rate of recurrence in early-stage patients. Predicting the post-surgical recurrence in lung cancer patients has traditionally been approached using single modality information of genomics or radiology images. We investigate the potential of multimodal fusion for this task. By combining computed tomography (CT) images and genomics, we demonstrate improved prediction of recurr… ▽ More Lung cancer has a high rate of recurrence in early-stage patients. Predicting the post-surgical recurrence in lung cancer patients has traditionally been approached using single modality information of genomics or radiology images. We investigate the potential of multimodal fusion for this task. By combining computed tomography (CT) images and genomics, we demonstrate improved prediction of recurrence using linear Cox proportional hazards models with elastic net regularization. We work on a recent non-small cell lung cancer (NSCLC) radiogenomics dataset of 130 patients and observe an increase in concordance-index values of up to 10%. Employing non-linear methods from the neural network literature, such as multi-layer perceptrons and visual-question answering fusion modules, did not improve performance consistently. This indicates the need for larger multimodal datasets and fusion techniques better adapted to this biological setting. △ Less

Submitted 5 February, 2020; originally announced February 2020.

Comments: Accepted for presentation at International Symposium on Biomedical Imaging (ISBI) 2020 (Iowa City). 5 pages, last page references

arXiv:1907.01656 [pdf, other]

Automated Detection and Type Classification of Central Venous Catheters in Chest X-Rays

Authors: Vaishnavi Subramanian, Hongzhi Wang, Joy T. Wu, Ken C. L. Wong, Arjun Sharma, Tanveer Syeda-Mahmood

Abstract: Central venous catheters (CVCs) are commonly used in critical care settings for monitoring body functions and administering medications. They are often described in radiology reports by referring to their presence, identity and placement. In this paper, we address the problem of automatic detection of their presence and identity through automated segmentation using deep learning networks and class… ▽ More Central venous catheters (CVCs) are commonly used in critical care settings for monitoring body functions and administering medications. They are often described in radiology reports by referring to their presence, identity and placement. In this paper, we address the problem of automatic detection of their presence and identity through automated segmentation using deep learning networks and classification based on their intersection with previously learned shape priors from clinician annotations of CVCs. The results not only outperform existing methods of catheter detection achieving 85.2% accuracy at 91.6% precision, but also enable high precision (95.2%) classification of catheter types on a large dataset of over 10,000 chest X-rays, presenting a robust and practical solution to this problem. △ Less

Submitted 25 July, 2019; v1 submitted 2 July, 2019; originally announced July 2019.

Comments: Accepted to Medical Image Computing and Computer Assisted Intervention (MICCAI) 2019; Data available: ML-CDS Challenge, MICCAI2019 (http://www.mcbr-cds.org/challenge/challenge-description.html)

arXiv:1906.09354 [pdf, other]

Boosting the rule-out accuracy of deep disease detection using class weight modifiers

Authors: Alexandros Karargyris, Ken C. L. Wong, Joy T. Wu, Mehdi Moradi, Tanveer Syeda-Mahmood

Abstract: In many screening applications, the primary goal of a radiologist or assisting artificial intelligence is to rule out certain findings. The classifiers built for such applications are often trained on large datasets that derive labels from clinical notes written for patients. While the quality of the positive findings described in these notes is often reliable, lack of the mention of a finding doe… ▽ More In many screening applications, the primary goal of a radiologist or assisting artificial intelligence is to rule out certain findings. The classifiers built for such applications are often trained on large datasets that derive labels from clinical notes written for patients. While the quality of the positive findings described in these notes is often reliable, lack of the mention of a finding does not always rule out the presence of it. This happens because radiologists comment on the patient in the context of the exam, for example focusing on trauma as opposed to chronic disease at emergency rooms. However, this disease finding ambiguity can affect the performance of algorithms. Hence it is critical to model the ambiguity during training. We propose a scheme to apply reasonable class weight modifiers to our loss function for the no mention cases during training. We experiment with two different deep neural network architectures and show that the proposed method results in a large improvement in the performance of the classifiers, specially on negated findings. The baseline performance of a custom-made dilated block network proposed in this paper shows an improvement in comparison with baseline DenseNet-201, while both architectures benefit from the new proposed loss function weighting scheme. Over 200,000 chest X-ray images and three highly common diseases, along with their negated counterparts, are included in this study. △ Less

Submitted 21 June, 2019; originally announced June 2019.

Comments: This paper was accepted by the IEEE International Symposium on Biomedical Imaging (ISBI) 2019

arXiv:1906.09336 [pdf, other]

Building a Benchmark Dataset and Classifiers for Sentence-Level Findings in AP Chest X-rays

Authors: Tanveer Syeda-Mahmood, Hassan M. Ahmad, Nadeem Ansari, Yaniv Gur, Satyananda Kashyap, Alexandros Karargyris, Mehdi Moradi, Anup Pillai, Karthik Sheshadri, Weiting Wang, Ken C. L. Wong, Joy T. Wu

Abstract: Chest X-rays are the most common diagnostic exams in emergency rooms and hospitals. There has been a surge of work on automatic interpretation of chest X-rays using deep learning approaches after the availability of large open source chest X-ray dataset from NIH. However, the labels are not sufficiently rich and descriptive for training classification tools. Further, it does not adequately address… ▽ More Chest X-rays are the most common diagnostic exams in emergency rooms and hospitals. There has been a surge of work on automatic interpretation of chest X-rays using deep learning approaches after the availability of large open source chest X-ray dataset from NIH. However, the labels are not sufficiently rich and descriptive for training classification tools. Further, it does not adequately address the findings seen in Chest X-rays taken in anterior-posterior (AP) view which also depict the placement of devices such as central vascular lines and tubes. In this paper, we present a new chest X-ray benchmark database of 73 rich sentence-level descriptors of findings seen in AP chest X-rays. We describe our method of obtaining these findings through a semi-automated ground truth generation process from crowdsourcing of clinician annotations. We also present results of building classifiers for these findings that show that such higher granularity labels can also be learned through the framework of deep learning classifiers. △ Less

Submitted 21 June, 2019; originally announced June 2019.

Comments: This paper was accepted by the IEEE International Symposium on Biomedical Imaging (ISBI) 2019

arXiv:1904.01654 [pdf, other]

doi 10.1117/12.2513164

Identifying disease-free chest X-ray images with deep transfer learning

Authors: Ken C. L. Wong, Mehdi Moradi, Joy Wu, Tanveer Syeda-Mahmood

Abstract: Chest X-rays (CXRs) are among the most commonly used medical image modalities. They are mostly used for screening, and an indication of disease typically results in subsequent tests. As this is mostly a screening test used to rule out chest abnormalities, the requesting clinicians are often interested in whether a CXR is normal or not. A machine learning algorithm that can accurately screen out ev… ▽ More Chest X-rays (CXRs) are among the most commonly used medical image modalities. They are mostly used for screening, and an indication of disease typically results in subsequent tests. As this is mostly a screening test used to rule out chest abnormalities, the requesting clinicians are often interested in whether a CXR is normal or not. A machine learning algorithm that can accurately screen out even a small proportion of the "real normal" exams out of all requested CXRs would be highly beneficial in reducing the workload for radiologists. In this work, we report a deep neural network trained for classifying CXRs with the goal of identifying a large number of normal (disease-free) images without risking the discharge of sick patients. We use an ImageNet-pretrained Inception-ResNet-v2 model to provide the image features, which are further used to train a model on CXRs labelled by expert radiologists. The probability threshold for classification is optimized for 100% precision for the normal class, ensuring no sick patients are released. At this threshold we report an average recall of 50%. This means that the proposed solution has the potential to cut in half the number of disease-free CXRs examined by radiologists, without risking the discharge of sick patients. △ Less

Submitted 2 April, 2019; originally announced April 2019.

Comments: SPIE Medical Imaging, 2019 (oral presentation)

arXiv:1903.06542 [pdf]

Age prediction using a large chest X-ray dataset

Authors: Alexandros Karargyris, Satyananda Kashyap, Joy T Wu, Arjun Sharma, Mehdi Moradi, Tanveer Syeda-Mahmood

Abstract: Age prediction based on appearances of different anatomies in medical images has been clinically explored for many decades. In this paper, we used deep learning to predict a persons age on Chest X-Rays. Specifically, we trained a CNN in regression fashion on a large publicly available dataset. Moreover, for interpretability, we explored activation maps to identify which areas of a CXR image are im… ▽ More Age prediction based on appearances of different anatomies in medical images has been clinically explored for many decades. In this paper, we used deep learning to predict a persons age on Chest X-Rays. Specifically, we trained a CNN in regression fashion on a large publicly available dataset. Moreover, for interpretability, we explored activation maps to identify which areas of a CXR image are important for the machine (i.e. CNN) to predict a patients age, offering insight. Overall, amongst correctly predicted CXRs, we see areas near the clavicles, shoulders, spine, and mediastinum being most activated for age prediction, as one would expect biologically. Amongst incorrectly predicted CXRs, we have qualitatively identified disease patterns that could possibly make the anatomies appear older or younger than expected. A further technical and clinical evaluation would improve this work. As CXR is the most commonly requested imaging exam, a potential use case for estimating age may be found in the preventative counseling of patient health status compared to their age-expected average, particularly when there is a large discrepancy between predicted age and the real patient age. △ Less

Submitted 8 March, 2019; originally announced March 2019.

Comments: Presented at SPIE Medical Imaging Conference, San Diego, 2019

arXiv:1812.10818 [pdf, other]

Classification of radiology reports by modality and anatomy: A comparative study

Authors: Marina Bendersky, Joy Wu, Tanveer Syeda-Mahmood

Abstract: Data labeling is currently a time-consuming task that often requires expert knowledge. In research settings, the availability of correctly labeled data is crucial to ensure that model predictions are accurate and useful. We propose relatively simple machine learning-based models that achieve high performance metrics in the binary and multiclass classification of radiology reports. We compare the p… ▽ More Data labeling is currently a time-consuming task that often requires expert knowledge. In research settings, the availability of correctly labeled data is crucial to ensure that model predictions are accurate and useful. We propose relatively simple machine learning-based models that achieve high performance metrics in the binary and multiclass classification of radiology reports. We compare the performance of these algorithms to that of a data-driven approach based on NLP, and find that the logistic regression classifier outperforms all other models, in both the binary and multiclass classification tasks. We then choose the logistic regression binary classifier to predict chest X-ray (CXR)/ non-chest X-ray (non-CXR) labels in reports from different datasets, unseen during any training phase of any of the models. Even in unseen report collections, the binary logistic regression classifier achieves average precision values of above 0.9. Based on the regression coefficient values, we also identify frequent tokens in CXR and non-CXR reports that are features with possibly high predictive power. △ Less

Submitted 27 December, 2018; originally announced December 2018.

Comments: 8 pages, 4 figures, BIBM 2018

arXiv:1809.01610 [pdf, other]

Bimodal network architectures for automatic generation of image annotation from text

Authors: Mehdi Moradi, Ali Madani, Yaniv Gur, Yufan Guo, Tanveer Syeda-Mahmood

Abstract: Medical image analysis practitioners have embraced big data methodologies. This has created a need for large annotated datasets. The source of big data is typically large image collections and clinical reports recorded for these images. In many cases, however, building algorithms aimed at segmentation and detection of disease requires a training dataset with markings of the areas of interest on th… ▽ More Medical image analysis practitioners have embraced big data methodologies. This has created a need for large annotated datasets. The source of big data is typically large image collections and clinical reports recorded for these images. In many cases, however, building algorithms aimed at segmentation and detection of disease requires a training dataset with markings of the areas of interest on the image that match with the described anomalies. This process of annotation is expensive and needs the involvement of clinicians. In this work we propose two separate deep neural network architectures for automatic marking of a region of interest (ROI) on the image best representing a finding location, given a textual report or a set of keywords. One architecture consists of LSTM and CNN components and is trained end to end with images, matching text, and markings of ROIs for those images. The output layer estimates the coordinates of the vertices of a polygonal region. The second architecture uses a network pre-trained on a large dataset of the same image types for learning feature representations of the findings of interest. We show that for a variety of findings from chest X-ray images, both proposed architectures learn to estimate the ROI, as validated by clinical annotations. There is a clear advantage obtained from the architecture with pre-trained imaging network. The centroids of the ROIs marked by this network were on average at a distance equivalent to 5.1% of the image width from the centroids of the ground truth ROIs. △ Less

Submitted 5 September, 2018; originally announced September 2018.

Comments: Accepted to MICCAI 2018, LNCS 11070

Journal ref: Lecture Notes in Computer Science (LNCS 11070), Proceedings of Medical Image Computing & Computer Assisted Intervention (MICCAI 2018)

arXiv:1809.00076 [pdf, other]

doi 10.1007/978-3-030-00931-1_70

3D Segmentation with Exponential Logarithmic Loss for Highly Unbalanced Object Sizes

Authors: Ken C. L. Wong, Mehdi Moradi, Hui Tang, Tanveer Syeda-Mahmood

Abstract: With the introduction of fully convolutional neural networks, deep learning has raised the benchmark for medical image segmentation on both speed and accuracy, and different networks have been proposed for 2D and 3D segmentation with promising results. Nevertheless, most networks only handle relatively small numbers of labels (<10), and there are very limited works on handling highly unbalanced ob… ▽ More With the introduction of fully convolutional neural networks, deep learning has raised the benchmark for medical image segmentation on both speed and accuracy, and different networks have been proposed for 2D and 3D segmentation with promising results. Nevertheless, most networks only handle relatively small numbers of labels (<10), and there are very limited works on handling highly unbalanced object sizes especially in 3D segmentation. In this paper, we propose a network architecture and the corresponding loss function which improve segmentation of very small structures. By combining skip connections and deep supervision with respect to the computational feasibility of 3D segmentation, we propose a fast converging and computationally efficient network architecture for accurate segmentation. Furthermore, inspired by the concept of focal loss, we propose an exponential logarithmic loss which balances the labels not only by their relative sizes but also by their segmentation difficulties. We achieve an average Dice coefficient of 82% on brain segmentation with 20 labels, with the ratio of the smallest to largest object sizes as 0.14%. Less than 100 epochs are required to reach such accuracy, and segmenting a 128x128x128 volume only takes around 0.4 s. △ Less

Submitted 24 September, 2018; v1 submitted 31 August, 2018; originally announced September 2018.

Comments: Accepted by the International Conference on Medical Image Computing and Computer-Assisted Intervention - MICCAI 2018 (oral presentation, acceptance rate 4%)

arXiv:1808.05205 [pdf, other]

doi 10.1016/j.media.2018.07.010

Building medical image classifiers with very limited data using segmentation networks

Authors: Ken C. L. Wong, Tanveer Syeda-Mahmood, Mehdi Moradi

Abstract: Deep learning has shown promising results in medical image analysis, however, the lack of very large annotated datasets confines its full potential. Although transfer learning with ImageNet pre-trained classification models can alleviate the problem, constrained image sizes and model complexities can lead to unnecessary increase in computational cost and decrease in performance. As many common mor… ▽ More Deep learning has shown promising results in medical image analysis, however, the lack of very large annotated datasets confines its full potential. Although transfer learning with ImageNet pre-trained classification models can alleviate the problem, constrained image sizes and model complexities can lead to unnecessary increase in computational cost and decrease in performance. As many common morphological features are usually shared by different classification tasks of an organ, it is greatly beneficial if we can extract such features to improve classification with limited samples. Therefore, inspired by the idea of curriculum learning, we propose a strategy for building medical image classifiers using features from segmentation networks. By using a segmentation network pre-trained on similar data as the classification task, the machine can first learn the simpler shape and structural concepts before tackling the actual classification problem which usually involves more complicated concepts. Using our proposed framework on a 3D three-class brain tumor type classification problem, we achieved 82% accuracy on 191 testing samples with 91 training samples. When applying to a 2D nine-class cardiac semantic level classification problem, we achieved 86% accuracy on 263 testing samples with 108 training samples. Comparisons with ImageNet pre-trained classifiers and classifiers trained from scratch are presented. △ Less

Submitted 15 August, 2018; originally announced August 2018.

Comments: This paper was accepted by Medical Image Analysis

Journal ref: Medical Image Analysis 49 (2018) 105-116

arXiv:1805.02730 [pdf, other]

doi 10.1007/978-3-319-66179-7_54

Building Disease Detection Algorithms with Very Small Numbers of Positive Samples

Authors: Ken C. L. Wong, Alexandros Karargyris, Tanveer Syeda-Mahmood, Mehdi Moradi

Abstract: Although deep learning can provide promising results in medical image analysis, the lack of very large annotated datasets confines its full potential. Furthermore, limited positive samples also create unbalanced datasets which limit the true positive rates of trained models. As unbalanced datasets are mostly unavoidable, it is greatly beneficial if we can extract useful knowledge from negative sam… ▽ More Although deep learning can provide promising results in medical image analysis, the lack of very large annotated datasets confines its full potential. Furthermore, limited positive samples also create unbalanced datasets which limit the true positive rates of trained models. As unbalanced datasets are mostly unavoidable, it is greatly beneficial if we can extract useful knowledge from negative samples to improve classification accuracy on limited positive samples. To this end, we propose a new strategy for building medical image analysis pipelines that target disease detection. We train a discriminative segmentation model only on normal images to provide a source of knowledge to be transferred to a disease detection classifier. We show that using the feature maps of a trained segmentation network, deviations from normal anatomy can be learned by a two-class classification network on an extremely unbalanced training dataset with as little as one positive for 17 negative samples. We demonstrate that even though the segmentation network is only trained on normal cardiac computed tomography images, the resulting feature maps can be used to detect pericardial effusion and cardiac septal defects with two-class convolutional classification networks. △ Less

Submitted 7 May, 2018; originally announced May 2018.

Showing 1–31 of 31 results for author: Syeda-Mahmood, T