Search | arXiv e-print repository

A Diagnostic Model for Acute Lymphoblastic Leukemia Using Metaheuristics and Deep Learning Methods

Authors: Amir Masoud Rahmani, Parisa Khoshvaght, Hamid Alinejad-Rokny, Samira Sadeghi, Parvaneh Asghari, Zohre Arabi, Mehdi Hosseinzadeh

Abstract: Acute lymphoblastic leukemia (ALL) severity is determined by the presence and ratios of blast cells (abnormal white blood cells) in both bone marrow and peripheral blood. Manual diagnosis of this disease is a tedious and time-consuming operation, making it difficult for professionals to accurately examine blast cell characteristics. To address this difficulty, researchers use deep learning and machine learning. In this paper, a ResNet-based feature extractor is utilized to detect ALL, along with a variety of feature selectors and classifiers. To get the best results, a variety of transfer learning models, including the ResNet, VGG, EfficientNet, and DenseNet families, are used as deep feature extractors. Following extraction, different feature selectors are used, including Genetic algorithm, PCA, ANOVA, Random Forest, Univariate, Mutual information, Lasso, XGB, Variance, and Binary ant colony. After feature qualification, a variety of classifiers are used, with MLP outperforming the others. The recommended technique is used to categorize ALL and HEM in the selected dataset, which is C-NMC 2019. This technique achieved 90.71% accuracy and 95.76% sensitivity for the relevant classifications, and its metrics on this dataset outperformed others.

Submitted 12 August, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.00702 [pdf]

Enhanced Heart Sound Classification Using Mel Frequency Cepstral Coefficients and Comparative Analysis of Single vs. Ensemble Classifier Strategies

Authors: Amir Masoud Rahmani, Amir Haider, Mohammad Adeli, Olfa Mzoughi, Entesar Gemeay, Mokhtar Mohammadi, Hamid Alinejad-Rokny, Parisa Khoshvaght, Mehdi Hosseinzadeh

Abstract: This paper explores the efficacy of Mel Frequency Cepstral Coefficients (MFCCs) in detecting abnormal heart sounds using two classification strategies: a single classifier and an ensemble classifier approach. Heart sounds were first pre-processed to remove noise and then segmented into S1, systole, S2, and diastole intervals, with thirteen MFCCs estimated from each segment, yielding 52 MFCCs per beat. Finally, MFCCs were used for heart sound classification. For that purpose, in the single classifier strategy, the MFCCs from nine consecutive beats were averaged to classify heart sounds by a single classifier (either a support vector machine (SVM), the k nearest neighbors (kNN), or a decision tree (DT)). Conversely, the ensemble classifier strategy employed nine classifiers (either nine SVMs, nine kNN classifiers, or nine DTs) to individually assess beats as normal or abnormal, with the overall classification based on the majority vote. Both methods were tested on a publicly available phonocardiogram database. The heart sound classification accuracy was 91.95% for the SVM, 91.9% for the kNN, and 87.33% for the DT in the single classifier strategy. Also, the accuracy was 93.59% for the SVM, 91.84% for the kNN, and 92.22% for the DT in the ensemble classifier strategy. Overall, the results demonstrated that the ensemble classifier strategy improved the accuracies of the DT and the SVM by 4.89 and 1.64 percentage points, respectively, establishing MFCCs as more effective than other features, including time, time-frequency, and statistical features, evaluated in similar studies.
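The two strategies in this abstract differ mainly in where the nine beats are combined: before classification (feature averaging) or after it (majority vote). The sketch below illustrates only that combination step. The function names and the label strings are hypothetical, and the classifiers themselves (SVM, kNN, DT) are not reproduced here.

```python
from collections import Counter

# Feature layout stated in the abstract: 13 MFCCs for each of 4 segments per beat.
SEGMENTS = ("S1", "systole", "S2", "diastole")
MFCCS_PER_SEGMENT = 13
FEATURES_PER_BEAT = len(SEGMENTS) * MFCCS_PER_SEGMENT  # 4 * 13 = 52


def average_beats(beats):
    """Single-classifier strategy: average the per-beat feature vectors
    element-wise, giving one vector to pass to a single classifier."""
    n = len(beats)
    return [sum(column) / n for column in zip(*beats)]


def majority_vote(labels):
    """Ensemble strategy: each beat gets its own label from its own
    classifier; the recording takes the most frequent label.
    With nine binary votes a tie cannot occur."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical per-beat decisions from nine classifiers.
votes = ["abnormal"] * 5 + ["normal"] * 4
decision = majority_vote(votes)
```

Using an odd number of voters, as the abstract does with nine beats, is a common design choice because it rules out ties in a two-class vote.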

Submitted 29 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.05792 [pdf, other]

RoboHop: Segment-based Topological Map Representation for Open-World Visual Navigation

Authors: Sourav Garg, Krishan Rana, Mehdi Hosseinzadeh, Lachlan Mares, Niko Sünderhauf, Feras Dayoub, Ian Reid

Abstract: Mapping is crucial for spatial reasoning, planning and robot navigation. Existing approaches range from metric, which require precise geometry-based optimization, to purely topological, where image-as-node based graphs lack explicit object-level reasoning and interconnectivity. In this paper, we propose a novel topological representation of an environment based on "image segments", which are semantically meaningful and open-vocabulary queryable, conferring several advantages over previous works based on pixel-level features. Unlike 3D scene graphs, we create a purely topological graph with segments as nodes, where edges are formed by a) associating segment-level descriptors between pairs of consecutive images and b) connecting neighboring segments within an image using their pixel centroids. This unveils a "continuous sense of a place", defined by inter-image persistence of segments along with their intra-image neighbours. It further enables us to represent and update segment-level descriptors through neighborhood aggregation using graph convolution layers, which improves robot localization based on segment-level retrieval. Using real-world data, we show how our proposed map representation can be used to i) generate navigation plans in the form of "hops over segments" and ii) search for target objects using natural language queries describing spatial relations of objects. Furthermore, we quantitatively analyze data association at the segment level, which underpins inter-image connectivity during mapping and segment-level localization when revisiting the same place. Finally, we show preliminary trials on segment-level "hopping" based zero-shot real-world navigation. Project page with supplementary details: oravus.github.io/RoboHop/

Submitted 9 May, 2024; originally announced May 2024.

Comments: Published at ICRA 2024; 9 pages, 8 figures

arXiv:2404.07267 [pdf, other]

Closed-Loop Model Identification and MPC-based Navigation of Quadcopters: A Case Study of Parrot Bebop 2

Authors: Mohsen Amiri, Mehdi Hosseinzadeh

Abstract: The growing potential of quadcopters in various domains, such as aerial photography, search and rescue, and infrastructure inspection, underscores the need for real-time control under strict safety and operational constraints. This challenge is compounded by the inherent nonlinear dynamics of quadcopters and the on-board computational limitations they face. This paper aims at addressing these challenges. First, this paper presents a comprehensive procedure for deriving a linear yet efficient model to describe the dynamics of quadrotors, thereby reducing complexity without compromising efficiency. Then, this paper develops a steady-state-aware Model Predictive Control (MPC) to effectively navigate quadcopters, while guaranteeing constraint satisfaction at all times. The main advantage of the steady-state-aware MPC is its low computational complexity, which makes it an appropriate choice for systems with limited computing capacity, like quadcopters. This paper considers Parrot Bebop 2 as the running example, and experimentally validates and evaluates the proposed algorithms.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2305.05984 [pdf, other]

Uncertainty-Aware Semi-Supervised Learning for Prostate MRI Zonal Segmentation

Authors: Matin Hosseinzadeh, Anindo Saha, Joeran Bosma, Henkjan Huisman

Abstract: Quality of deep convolutional neural network predictions strongly depends on the size of the training dataset and the quality of the annotations. Creating annotations, especially for 3D medical image segmentation, is time-consuming and requires expert knowledge. We propose a novel semi-supervised learning (SSL) approach that requires only a relatively small number of annotations while being able to use the remaining unlabeled data to improve model performance. Our method uses a pseudo-labeling technique that employs recent deep learning uncertainty estimation models. By using the estimated uncertainty, we were able to rank pseudo-labels and automatically select the best pseudo-annotations generated by the supervised model. We applied this to prostate zonal segmentation in T2-weighted MRI scans. Our proposed model outperformed the semi-supervised model in experiments with the ProstateX dataset and an external test set: by leveraging only a subset of unlabeled data rather than the full collection of 4953 cases, it demonstrated improved performance. The segmentation Dice similarity coefficient in the transition zone and peripheral zone increased from 0.835 and 0.727 for the fully supervised model to 0.852 and 0.751 for the uncertainty-aware semi-supervised learning (USSL) model, respectively. Our USSL model demonstrates the potential to allow deep learning models to be trained on large datasets without requiring full annotation. Our code is available at https://github.com/DIAGNijmegen/prostateMR-USSL.
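Two ideas in this abstract can be illustrated in a few lines: ranking pseudo-labels by estimated uncertainty and keeping only the most certain ones, and scoring a segmentation with the Dice similarity coefficient. The sketch below is a minimal illustration under assumptions, not the authors' implementation. The function names, the `keep_fraction` parameter, and the scalar per-case uncertainty score are all hypothetical; the abstract does not say how uncertainty is aggregated or how many cases are kept.

```python
def dice(pred, target):
    """Dice similarity coefficient, 2|A∩B| / (|A| + |B|), for two
    equal-length binary masks given as flat sequences of 0/1 values."""
    intersection = sum(p and t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def select_pseudo_labels(cases, keep_fraction):
    """Rank (case_id, uncertainty) pairs by ascending uncertainty and
    keep the most certain fraction as pseudo-annotations."""
    ranked = sorted(cases, key=lambda case: case[1])
    keep = int(len(ranked) * keep_fraction)
    return [case_id for case_id, _ in ranked[:keep]]


# Hypothetical uncertainty scores for four unlabeled cases.
cases = [("a", 0.9), ("b", 0.1), ("c", 0.5), ("d", 0.3)]
selected = select_pseudo_labels(cases, keep_fraction=0.5)
score = dice([1, 1, 0, 0], [1, 0, 0, 0])
```

Selecting a subset rather than all pseudo-labels matches the abstract's remark that the model used only part of the unlabeled collection.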

Submitted 10 May, 2023; originally announced May 2023.

Comments: 9 pages

arXiv:2303.15993 [pdf, other]

SELF-VS: Self-supervised Encoding Learning For Video Summarization

Authors: Hojjat Mokhtarabadi, Kave Bahraman, Mehrdad HosseinZadeh, Mahdi Eftekhari

Abstract: Despite its wide range of applications, video summarization is still held back by the scarcity of extensive datasets, largely due to the labor-intensive and costly nature of frame-level annotations. As a result, existing video summarization methods are prone to overfitting. To mitigate this challenge, we propose a novel self-supervised video representation learning method using knowledge distillation to pre-train a transformer encoder. Our method matches its semantic video representation, which is constructed with respect to frame importance scores, to a representation derived from a CNN trained on video classification. Empirical evaluations on correlation-based metrics, such as Kendall's $τ$ and Spearman's $ρ$, demonstrate the superiority of our approach compared to existing state-of-the-art methods in assigning relative scores to the input frames.

Submitted 28 March, 2023; originally announced March 2023.

Comments: 9 pages, 5 figures

arXiv:2301.05688 [pdf, other]

CANE: A Cascade-Control Approach for Network-Assisted Video QoE Management

Authors: Mehdi Hosseinzadeh, Karthick Shankar, Maria Apostolaki, Jay Ramachandran, Steven Adams, Vyas Sekar, Bruno Sinopoli

Abstract: Prior efforts have shown that network-assisted schemes can improve the Quality-of-Experience (QoE) and QoE fairness when multiple video players compete for bandwidth. However, realizing network-assisted schemes in practice is challenging, as: i) the network has limited visibility into the client players' internal state and actions; ii) players' actions may nullify or negate the network's actions; and iii) the players' objectives might be conflicting. To address these challenges, we formulate network-assisted QoE optimization through a cascade control abstraction. This informs the design of CANE, a practical network-assisted QoE framework. CANE uses machine learning techniques to approximate each player's behavior as a black-box model and model predictive control to achieve a near-optimal solution. We evaluate CANE through realistic simulations and show that CANE improves multiplayer QoE fairness by ~50% compared to pure client-side adaptive bitrate algorithms and by ~20% compared to uniform traffic shaping.

Submitted 13 January, 2023; originally announced January 2023.

arXiv:2212.03176 [pdf, other]

Domain Adaptation and Generalization on Functional Medical Images: A Systematic Survey

Authors: Gita Sarafraz, Armin Behnamnia, Mehran Hosseinzadeh, Ali Balapour, Amin Meghrazi, Hamid R. Rabiee

Abstract: Machine learning algorithms have revolutionized different fields, including natural language processing, computer vision, signal processing, and medical data processing. Despite the excellent capabilities of machine learning algorithms in various tasks and areas, the performance of these models mainly deteriorates when there is a shift in the test and training data distributions. This gap occurs due to the violation of the fundamental assumption that the training and test data are independent and identically distributed (i.i.d.). In real-world scenarios where collecting data from all possible domains for training is costly and even impossible, the i.i.d. assumption can hardly be satisfied. The problem is even more severe in the case of medical images and signals because it requires either expensive equipment or a meticulous experimentation setup to collect data, even for a single domain. Additionally, the decrease in performance may have severe consequences in the analysis of medical records. As a result of such problems, the ability to generalize and adapt under distribution shifts (domain generalization (DG) and domain adaptation (DA)) is essential for the analysis of medical data. This paper provides the first systematic review of DG and DA on functional brain signals to fill the gap of the absence of a comprehensive study in this era. We provide detailed explanations and categorizations of datasets, approaches, and architectures used in DG and DA on functional brain images. We further address the attention-worthy future tracks in this field.

Submitted 4 December, 2022; originally announced December 2022.

Comments: 41 pages, 8 figures

arXiv:2211.08628 [pdf, other]

Longitudinal Analysis of Heart Rate and Physical Activity Collected from Smartwatches

Authors: Fatemeh Karimi, Zohre Amoozgar, Reza Reiazi, Mehdi Hosseinzadeh, Reza Rawassizadeh

Abstract: Smartwatches (SWs) can continuously and autonomously monitor vital signs, including heart rates and physical activities involving wrist movement. The monitoring capability of SWs has several key health benefits arising from their role in preventive and diagnostic medicine. Current research, however, has not explored many of these opportunities, including longitudinal studies. In our work, we gathered longitudinal data points, e.g., heart rate and physical activity, from various brands of SWs worn by 1,014 users. Our analysis shows three common heart rate patterns during sleep but two common patterns during the day. We find that heart rate and physical activities are higher in summer and the first month of the new year compared to other months. Moreover, physical activities are reduced on weekends compared with weekdays. Interestingly, the highest peak of physical activity is during the evening.

Submitted 15 November, 2022; originally announced November 2022.

Comments: 22 pages, 13 figures

arXiv:2210.09922 [pdf, other]

Few-Shot Learning of Compact Models via Task-Specific Meta Distillation

Authors: Yong Wu, Shekhor Chanda, Mehrdad Hosseinzadeh, Zhi Liu, Yang Wang

Abstract: We consider a new problem of few-shot learning of compact models. Meta-learning is a popular approach for few-shot learning. Previous work in meta-learning typically assumes that the model architecture during meta-training is the same as the model architecture used for final deployment. In this paper, we challenge this basic assumption. For final deployment, we often need the model to be small. But small models usually do not have enough capacity to effectively adapt to new tasks. In the meantime, we often have access to a large dataset and extensive computing power during meta-training since meta-training is typically performed on a server. In this paper, we propose task-specific meta distillation that simultaneously learns two models in meta-learning: a large teacher model and a small student model. These two models are jointly learned during meta-training. Given a new task during meta-testing, the teacher model is first adapted to this task, then the adapted teacher model is used to guide the adaptation of the student model. The adapted student model is used for final deployment. We demonstrate the effectiveness of our approach in few-shot image classification using model-agnostic meta-learning (MAML). Our proposed method outperforms other alternatives on several benchmark datasets.

Submitted 18 October, 2022; originally announced October 2022.

Comments: This paper has been accepted by WACV'2023

arXiv:2210.07143 [pdf]

Performance Evaluation of Query Plan Recommendation with Apache Hadoop and Apache Spark

Authors: Elham Azhir, Mehdi Hosseinzadeh, Faheem Khan, Amir Mosavi

Abstract: Access plan recommendation is a query optimization approach that executes new queries using previously created query execution plans (QEPs). In this method, the query optimizer divides the query space into clusters. However, traditional clustering algorithms take a significant amount of execution time for clustering such large datasets. The MapReduce distributed computing model provides efficient solutions for storing and processing vast quantities of data. Apache Spark and Apache Hadoop frameworks are used in the present investigation to cluster different sizes of query datasets in the MapReduce-based access plan recommendation method. The performance evaluation is performed based on execution time. The results of the experiments demonstrated the effectiveness of parallel query clustering in achieving high scalability. Furthermore, Apache Spark achieved better performance than Apache Hadoop, reaching an average speedup of 2x.

Submitted 17 September, 2022; originally announced October 2022.

Comments: 11 pages, 4 figures

MSC Class: 68T05

arXiv:2204.05782 [pdf, other]

Stochastic Multi-armed Bandits with Non-stationary Rewards Generated by a Linear Dynamical System

Authors: Jonathan Gornet, Mehdi Hosseinzadeh, Bruno Sinopoli

Abstract: The stochastic multi-armed bandit has provided a framework for studying decision-making in unknown environments. We propose a variant of the stochastic multi-armed bandit where the rewards are sampled from a stochastic linear dynamical system. The proposed strategy for this stochastic multi-armed bandit variant is to learn a model of the dynamical system while choosing the optimal action based on the learned model. Motivated by mathematical finance areas such as the Intertemporal Capital Asset Pricing Model proposed by Merton and Stochastic Portfolio Theory proposed by Fernholz, both of which model asset returns with stochastic differential equations, this strategy is applied to quantitative finance as a high-frequency trading strategy, where the goal is to maximize returns within a time period.

Submitted 6 April, 2022; originally announced April 2022.

arXiv:2112.05151 [pdf, other]

doi 10.1148/ryai.230031

Annotation-efficient cancer detection with report-guided lesion annotation for deep learning-based prostate cancer detection in bpMRI

Authors: Joeran S. Bosma, Anindo Saha, Matin Hosseinzadeh, Ilse Slootweg, Maarten de Rooij, Henkjan Huisman

Abstract: Deep learning-based diagnostic performance increases with more annotated data, but large-scale manual annotations are expensive and labour-intensive. Experts evaluate diagnostic images during clinical routine, and write their findings in reports. Leveraging unlabelled exams paired with clinical reports could overcome the manual labelling bottleneck. We hypothesise that detection models can be trained semi-supervised with automatic annotations generated using model predictions, guided by sparse information from clinical reports. To demonstrate efficacy, we train clinically significant prostate cancer (csPCa) segmentation models, where automatic annotations are guided by the number of clinically significant findings in the radiology reports. We included 7,756 prostate MRI examinations, of which 3,050 were manually annotated. We evaluated prostate cancer detection performance on 300 exams from an external centre with histopathology-confirmed ground truth. Semi-supervised training improved patient-based diagnostic area under the receiver operating characteristic curve from $87.2 \pm 0.8\%$ to $89.4 \pm 1.0\%$ ($P<10^{-4}$) and improved lesion-based sensitivity at one false positive per case from $76.4 \pm 3.8\%$ to $83.6 \pm 2.3\%$ ($P<10^{-4}$). Semi-supervised training was 14$\times$ more annotation-efficient for case-based performance and 6$\times$ more annotation-efficient for lesion-based performance. This improved performance demonstrates the feasibility of our training procedure. Source code is publicly available at github.com/DIAGNijmegen/Report-Guided-Annotation. Best csPCa detection algorithm is available at grand-challenge.org/algorithms/bpmri-cspca-detection-report-guided-annotations/.

Submitted 19 February, 2022; v1 submitted 9 December, 2021; originally announced December 2021.

Journal ref: Radiology: Artificial Intelligence, 2023:e230031

arXiv:2110.12889 [pdf, other]

Anatomical and Diagnostic Bayesian Segmentation in Prostate MRI $-$Should Different Clinical Objectives Mandate Different Loss Functions?

Authors: Anindo Saha, Joeran Bosma, Jasper Linmans, Matin Hosseinzadeh, Henkjan Huisman

Abstract: We hypothesize that probabilistic voxel-level classification of anatomy and malignancy in prostate MRI, although typically posed as near-identical segmentation tasks via U-Nets, requires different loss functions for optimal performance due to inherent differences in their clinical objectives. We investigate distribution, region and boundary-based loss functions for both tasks across 200 patient exams from the publicly-available ProstateX dataset. For evaluation, we conduct a thorough comparative analysis of model predictions and calibration, measured with respect to multi-class volume segmentation of the prostate anatomy (whole-gland, transitional zone, peripheral zone), as well as patient-level diagnosis and lesion-level detection of clinically significant prostate cancer. Notably, we find that distribution-based loss functions (in particular, focal loss) are well-suited for diagnostic or panoptic segmentation tasks such as lesion detection, primarily due to their implicit property of inducing better calibration. Meanwhile, (with the exception of focal loss) both distribution and region/boundary-based loss functions perform equally well for anatomical or semantic segmentation tasks, such as quantification of organ shape, size and boundaries.

Submitted 25 October, 2021; originally announced October 2021.

Comments: Accepted to Medical Imaging Meets NeurIPS Workshop of the 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

arXiv:2109.01392 [pdf, other]

Finding Colorful Paths in Temporal Graphs

Authors: Riccardo Dondi, Mohammad Mehdi Hosseinzadeh

Abstract: The problem of finding paths in temporal graphs has been recently considered due to its many applications. In this paper we consider a variant of the problem that, given a vertex-colored temporal graph, asks for a path whose vertices have distinct colors and include the maximum number of colors. We study the approximation complexity of the problem and we provide an inapproximability lower bound. Then we present a heuristic for the problem and an experimental evaluation of our heuristic, both on synthetic and real-world graphs.

Submitted 3 September, 2021; originally announced September 2021.

arXiv:2102.05144 [pdf, ps, other]

Toward Safe and Efficient Human-Robot Interaction via Behavior-Driven Danger Signaling

Authors: Mehdi Hosseinzadeh, Bruno Sinopoli, Aaron F. Bobick

Abstract: This paper introduces the notion of danger awareness in the context of Human-Robot Interaction (HRI), which decodes whether a human is aware of the existence of the robot, and illuminates whether the human is willing to engage in enforcing safety. This paper also proposes a method to quantify this notion as a single binary variable, the so-called danger awareness coefficient. By analyzing the effect of this coefficient on the human's actions, an online Bayesian learning method is proposed to update the belief about the value of the coefficient. It is shown that based upon the danger awareness coefficient and the proposed learning method, the robot can build a predictive human model to anticipate the human's future actions. In order to create a communication channel between the human and the robot, to enrich the observations and get informative data about the human, and to improve the efficiency of the robot, the robot is equipped with a danger signaling system. A predictive planning scheme, coupled with the predictive human model, is also proposed to provide an efficient and probabilistically safe plan for the robot. The effectiveness of the proposed scheme is demonstrated through simulation studies on an interaction between a self-driving car and a pedestrian.
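Online Bayesian learning of a binary variable, as mentioned in this abstract, reduces to repeatedly applying Bayes' rule to a single probability. The sketch below shows that generic update only. The function name and the likelihood values are hypothetical, and the paper's actual human action model, which would supply those likelihoods, is not described in the abstract.

```python
def update_belief(prior, lik_aware, lik_unaware):
    """One Bayes-rule step for a binary hypothesis.

    prior       -- current belief P(aware)
    lik_aware   -- P(observed action | aware), assumed given by some model
    lik_unaware -- P(observed action | not aware), likewise assumed
    Returns the posterior P(aware | observed action).
    """
    evidence = prior * lik_aware + (1.0 - prior) * lik_unaware
    if evidence == 0.0:
        # The observation is impossible under both hypotheses; keep the prior.
        return prior
    return prior * lik_aware / evidence


# Start undecided, then observe two actions that are each four times
# more likely under "aware" than under "not aware" (illustrative numbers).
belief = 0.5
history = []
for lik_aware, lik_unaware in [(0.8, 0.2), (0.8, 0.2)]:
    belief = update_belief(belief, lik_aware, lik_unaware)
    history.append(belief)
```

Each observation that is more probable under the "aware" hypothesis pushes the belief toward 1, which is what lets a planner act on an increasingly confident estimate.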

Submitted 10 February, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

arXiv:2101.03244 [pdf, other]

doi 10.1016/j.media.2021.102155

End-to-end Prostate Cancer Detection in bpMRI via 3D CNNs: Effects of Attention Mechanisms, Clinical Priori and Decoupled False Positive Reduction

Authors: Anindo Saha, Matin Hosseinzadeh, Henkjan Huisman

Abstract: We present a multi-stage 3D computer-aided detection and diagnosis (CAD) model for automated localization of clinically significant prostate cancer (csPCa) in bi-parametric MR imaging (bpMRI). Deep attention mechanisms drive its detection network, targeting salient structures and highly discriminative feature dimensions across multiple resolutions. Its goal is to accurately identify csPCa lesions from indolent cancer and the wide range of benign pathology that can afflict the prostate gland. Simultaneously, a decoupled residual classifier is used to achieve consistent false positive reduction, without sacrificing high sensitivity or computational efficiency. In order to guide model generalization with domain-specific clinical knowledge, a probabilistic anatomical prior is used to encode the spatial prevalence and zonal distinction of csPCa. Using a large dataset of 1950 prostate bpMRI paired with radiologically-estimated annotations, we hypothesize that such CNN-based models can be trained to detect biopsy-confirmed malignancies in an independent cohort. For 486 institutional testing scans, the 3D CAD system achieves 83.69$\pm$5.22% and 93.19$\pm$2.96% detection sensitivity at 0.50 and 1.46 false positive(s) per patient, respectively, with 0.882$\pm$0.030 AUROC in patient-based diagnosis $-$ significantly outperforming four state-of-the-art baseline architectures (U-SEResNet, UNet++, nnU-Net, Attention U-Net) from recent literature. For 296 external biopsy-confirmed testing scans, the ensembled CAD system shares moderate agreement with a consensus of expert radiologists (76.69%; $\kappa$ $=$ 0.51$\pm$0.04) and independent pathologists (81.08%; $\kappa$ $=$ 0.56$\pm$0.06), demonstrating strong generalization to histologically-confirmed csPCa diagnosis.

Submitted 30 June, 2021; v1 submitted 8 January, 2021; originally announced January 2021.

Comments: Accepted to MedIA: Medical Image Analysis. This manuscript incorporates and expands upon our 2020 Medical Imaging Meets NeurIPS Workshop paper (arXiv:2011.00263)

arXiv:2011.08381 [pdf, other]

Optimal Accuracy-Time Trade-off for Deep Learning Services in Edge Computing Systems

Authors: Minoo Hosseinzadeh, Andrew Wachal, Hana Khamfroush, Daniel E. Lucani

Abstract: With the increasing demand for computationally intensive services like deep learning tasks, emerging distributed computing platforms such as edge computing (EC) systems are becoming more popular. Edge computing systems have shown promising results in terms of latency reduction compared to the traditional cloud systems. However, their limited processing capacity imposes a trade-off between the potential latency reduction and the achieved accuracy in computationally-intensive services such as deep learning-based services. In this paper, we focus on finding the optimal accuracy-time trade-off for running deep learning services in a three-tier EC platform where several deep learning models with different accuracy levels are available. Specifically, we cast the problem as an Integer Linear Program, where optimal task scheduling decisions are made to maximize overall user satisfaction in terms of accuracy-time trade-off. We prove that our problem is NP-hard and then provide a polynomial constant-time greedy algorithm, called GUS, that is shown to attain near-optimal results. Finally, upon vetting our algorithmic solution through numerical experiments and comparison with a set of heuristics, we deploy it on a test-bed implemented to measure for real-world results. The results of both numerical analysis and real-world implementation show that GUS can outperform the baseline heuristics in terms of the average percentage of satisfied users by a factor of at least 50%.

Submitted 16 November, 2020; originally announced November 2020.

arXiv:2011.00263 [pdf, other]

Encoding Clinical Priori in 3D Convolutional Neural Networks for Prostate Cancer Detection in bpMRI

Authors: Anindo Saha, Matin Hosseinzadeh, Henkjan Huisman

Abstract: We hypothesize that anatomical priors can be viable mediums to infuse domain-specific clinical knowledge into state-of-the-art convolutional neural networks (CNN) based on the U-Net architecture. We introduce a probabilistic population prior which captures the spatial prevalence and zonal distinction of clinically significant prostate cancer (csPCa), in order to improve its computer-aided detection (CAD) in bi-parametric MR imaging (bpMRI). To evaluate performance, we train 3D adaptations of the U-Net, U-SEResNet, UNet++ and Attention U-Net using 800 institutional training-validation scans, paired with radiologically-estimated annotations and our computed prior. For 200 independent testing bpMRI scans with histologically-confirmed delineations of csPCa, our proposed method of encoding clinical priori demonstrates a strong ability to improve patient-based diagnosis (up to 8.70% increase in AUROC) and lesion-level detection (average increase of 1.08 pAUC between 0.1-10 false positives per patient) across all four architectures.

Submitted 21 September, 2021; v1 submitted 31 October, 2020; originally announced November 2020.

Comments: Accepted to Medical Imaging Meets NeurIPS Workshop of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

arXiv:2008.01573 [pdf, other]

Top-k Connected Overlapping Densest Subgraphs in Dual Networks

Authors: Riccardo Dondi, Pietro Hiram Guzzi, Mohammad Mehdi Hosseinzadeh

Abstract: Networks are largely used for modelling and analysing data and relations among them. Recently, it has been shown that the use of a single network may not be the optimal choice, since a single network may miss some aspects. Consequently, it has been proposed to use a pair of networks to better model all the aspects, and the main approach is referred to as dual networks (DNs). DNs are two related graphs (one weighted, the other unweighted) that share the same set of vertices and two different edge sets. In DNs it is often interesting to extract common subgraphs among the two networks that are maximally dense in the conceptual network and connected in the physical one. The simplest instance of this problem is finding a common densest connected subgraph (DCS), while we here focus on the detection of the Top-k Densest Connected subgraphs, i.e. a set of k subgraphs having the largest density in the conceptual network which are also connected in the physical network. We formalise the problem and then we propose a heuristic to find a solution, since the problem is computationally hard. A set of experiments on synthetic and real networks is also presented to support our approach.

Submitted 4 August, 2020; originally announced August 2020.

arXiv:2005.03059 [pdf]

CovidCTNet: An Open-Source Deep Learning Approach to Identify Covid-19 Using CT Image

Authors: Tahereh Javaheri, Morteza Homayounfar, Zohreh Amoozgar, Reza Reiazi, Fatemeh Homayounieh, Engy Abbas, Azadeh Laali, Amir Reza Radmard, Mohammad Hadi Gharib, Seyed Ali Javad Mousavi, Omid Ghaemi, Rosa Babaei, Hadi Karimi Mobin, Mehdi Hosseinzadeh, Rana Jahanban-Esfahlan, Khaled Seidi, Mannudeep K. Kalra, Guanglan Zhang, L. T. Chitkushev, Benjamin Haibe-Kains, Reza Malekzadeh, Reza Rawassizadeh

Abstract: Coronavirus disease 2019 (Covid-19) is highly contagious with limited treatment options. Early and accurate diagnosis of Covid-19 is crucial in reducing the spread of the disease and its accompanied mortality. Currently, detection by reverse transcriptase polymerase chain reaction (RT-PCR) is the gold standard of outpatient and inpatient detection of Covid-19. RT-PCR is a rapid method; however, its accuracy in detection is only ~70-75%. Another approved strategy is computed tomography (CT) imaging. CT imaging has a much higher sensitivity of ~80-98%, but similar accuracy of 70%. To enhance the accuracy of CT imaging detection, we developed an open-source set of algorithms called CovidCTNet that successfully differentiates Covid-19 from community-acquired pneumonia (CAP) and other lung diseases. CovidCTNet increases the accuracy of CT imaging detection to 90% compared to radiologists (70%). The model is designed to work with heterogeneous and small sample sizes independent of the CT imaging hardware. In order to facilitate the detection of Covid-19 globally and assist radiologists and physicians in the screening process, we are releasing all algorithms and parametric details in an open-source format. Open-source sharing of our CovidCTNet enables developers to rapidly improve and optimize services, while preserving user privacy and data ownership.

Submitted 15 May, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

Comments: 5 figures

arXiv:2001.06479 [pdf, other]

Unsupervised Learning of Camera Pose with Compositional Re-estimation

Authors: Seyed Shahabeddin Nabavi, Mehrdad Hosseinzadeh, Ramin Fahimi, Yang Wang

Abstract: We consider the problem of unsupervised camera pose estimation. Given an input video sequence, our goal is to estimate the camera pose (i.e. the camera motion) between consecutive frames. Traditionally, this problem is tackled by placing strict constraints on the transformation vector or by incorporating optical flow through a complex pipeline. We propose an alternative approach that utilizes a compositional re-estimation process for camera pose estimation. Given an input, we first estimate a depth map. Our method then iteratively estimates the camera motion based on the estimated depth map. Our approach significantly improves the predicted camera motion both quantitatively and visually. Furthermore, the re-estimation resolves the problem of out-of-boundaries pixels in a novel and simple way. Another advantage of our approach is that it is adaptable to other camera pose estimation approaches. Experimental analysis on KITTI benchmark dataset demonstrates that our method outperforms existing state-of-the-art approaches in unsupervised camera ego-motion estimation.

Submitted 17 January, 2020; originally announced January 2020.

Comments: Accepted to WACV 2020

arXiv:1909.08245 [pdf, other]

Towards Shape Biased Unsupervised Representation Learning for Domain Generalization

Authors: Nader Asadi, Amir M. Sarfi, Mehrdad Hosseinzadeh, Zahra Karimpour, Mahdi Eftekhari

Abstract: It is known that, without awareness of the process, our brain appears to focus on the general shape of objects rather than superficial statistics of context. On the other hand, learning autonomously allows discovering invariant regularities which help generalization. In this work, we propose a learning framework to improve the shape bias property of self-supervised methods. Our method learns semantic and shape biased representations by integrating domain diversification and jigsaw puzzles. The first module enables the model to create a dynamic environment across arbitrary domains and provides a domain exploration vs. exploitation trade-off, while the second module allows the model to explore this environment autonomously. This universal framework does not require prior knowledge of the domain of interest. Extensive experiments are conducted on several domain generalization datasets, namely, PACS, Office-Home, VLCS, and Digits. We show that our framework outperforms state-of-the-art domain generalization methods by a large margin.

Submitted 29 March, 2020; v1 submitted 18 September, 2019; originally announced September 2019.

Comments: Under review

arXiv:1907.01023 [pdf, other]

Diminishing the Effect of Adversarial Perturbations via Refining Feature Representation

Authors: Nader Asadi, AmirMohammad Sarfi, Mehrdad Hosseinzadeh, Sahba Tahsini, Mahdi Eftekhari

Abstract: Deep neural networks are highly vulnerable to adversarial examples, which imposes severe security issues for these state-of-the-art models. Many defense methods have been proposed to mitigate this problem. However, a lot of them depend on modification or additional training of the target model. In this work, we analytically investigate each layer's representation of non-perturbed and perturbed images and show the effect of perturbations on each of these representations. Accordingly, a method based on whitening coloring transform is proposed in order to diminish the misrepresentation of any desirable layer caused by adversaries. Our method can be applied to any layer of any arbitrary model without the need of any modification or additional training. Due to the fact that the full whitening of the layer's representation is not easily differentiable, our proposed method is superbly robust against white-box attacks. Furthermore, we demonstrate the strength of our method against some state-of-the-art black-box attacks.

Submitted 1 October, 2019; v1 submitted 1 July, 2019; originally announced July 2019.

Comments: Accepted at NeurIPS 2019 workshop on Safety and Robustness in Decision Making

arXiv:1903.02025 [pdf, other]

Crowd Counting Using Scale-Aware Attention Networks

Authors: Mohammad Asiful Hossain, Mehrdad Hosseinzadeh, Omit Chanda, Yang Wang

Abstract: In this paper, we consider the problem of crowd counting in images. Given an image of a crowded scene, our goal is to estimate the density map of this image, where each pixel value in the density map corresponds to the crowd density at the corresponding location in the image. Given the estimated density map, the final crowd count can be obtained by summing over all values in the density map. One challenge of crowd counting is the scale variation in images. In this work, we propose a novel scale-aware attention network to address this challenge. Using the attention mechanism popular in recent deep learning architectures, our model can automatically focus on certain global and local scales appropriate for the image. By combining these global and local scale attention, our model outperforms other state-of-the-art methods for crowd counting on several benchmark datasets.

Submitted 5 March, 2019; originally announced March 2019.
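
Note: the abstract above states that the crowd count is obtained by summing all values of the estimated density map. The following is only a minimal sketch of that final step; the function name `crowd_count` and the toy 3x4 map are hypothetical and are not taken from the paper, whose network for producing the density map is not reproduced here.

```python
def crowd_count(density_map):
    """Return the estimated count: the sum of all pixel values in a 2D density map."""
    return sum(sum(row) for row in density_map)


# Hypothetical density map (illustration only, not data from the paper).
# Each value is the estimated crowd density at that pixel location.
toy_density = [
    [0.0, 0.25, 0.25, 0.0],
    [0.5, 1.0, 0.5, 0.0],
    [0.0, 0.25, 0.25, 0.0],
]

estimated = crowd_count(toy_density)
print(estimated)
```

In practice the map would come from a trained model and would typically be a floating-point array, so the resulting count is generally not an integer.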

arXiv:1809.09149 [pdf, other]

Real-Time Monocular Object-Model Aware Sparse SLAM

Authors: Mehdi Hosseinzadeh, Kejie Li, Yasir Latif, Ian Reid

Abstract: Simultaneous Localization And Mapping (SLAM) is a fundamental problem in mobile robotics. While sparse point-based SLAM methods provide accurate camera localization, the generated maps lack semantic information. On the other hand, state of the art object detection methods provide rich information about entities present in the scene from a single image. This work incorporates a real-time deep-learned object detector to the monocular SLAM framework for representing generic objects as quadrics that permit detections to be seamlessly integrated while allowing the real-time performance. Finer reconstruction of an object, learned by a CNN network, is also incorporated and provides a shape prior for the quadric leading to further refinement. To capture the dominant structure of the scene, additional planar landmarks are detected by a CNN-based plane detector and modeled as independent landmarks in the map. Extensive experiments support our proposed inclusion of semantic objects and planar structures directly in the bundle-adjustment of SLAM - Semantic SLAM - that enriches the reconstructed map semantically, while significantly improving the camera localization. The performance of our SLAM system is demonstrated in https://youtu.be/UMWXd4sHONw and https://youtu.be/QPQqVrvP0dE .

Submitted 6 March, 2019; v1 submitted 24 September, 2018; originally announced September 2018.

Comments: Accepted to ICRA 2019 (for video demo look at https://youtu.be/UMWXd4sHONw and https://youtu.be/QPQqVrvP0dE)

arXiv:1809.02434 [pdf, other]

Top-k Overlapping Densest Subgraphs: Approximation and Complexity

Authors: Riccardo Dondi, Mohammad Mehdi Hosseinzadeh, Giancarlo Mauri, Italo Zoppis

Abstract: A central problem in graph mining is finding dense subgraphs, with several applications in different fields, a notable example being identifying communities. While a lot of effort has been put on the problem of finding a single dense subgraph, only recently the focus has been shifted to the problem of finding a set of densest subgraphs. Some approaches aim at finding disjoint subgraphs, while in many real-world networks communities are often overlapping. An approach introduced to find possible overlapping subgraphs is the Top-k Overlapping Densest Subgraphs problem. For a given integer k >= 1, the goal of this problem is to find a set of k densest subgraphs that may share some vertices. The objective function to be maximized takes into account both the density of the subgraphs and the distance between subgraphs in the solution. The Top-k Overlapping Densest Subgraphs problem has been shown to admit a 1/10-factor approximation algorithm. Furthermore, the computational complexity of the problem has been left open. In this paper, we present contributions concerning the approximability and the computational complexity of the problem. For the approximability, we present approximation algorithms that improve the approximation factor to 1/2, when k is bounded by the vertex set, and to 2/3 when k is a constant. For the computational complexity, we show that the problem is NP-hard even when k = 3.

Submitted 30 January, 2019; v1 submitted 7 September, 2018; originally announced September 2018.

arXiv:1804.09111 [pdf, other]

Structure Aware SLAM using Quadrics and Planes

Authors: Mehdi Hosseinzadeh, Yasir Latif, Trung Pham, Niko Suenderhauf, Ian Reid

Abstract: Simultaneous Localization And Mapping (SLAM) is a fundamental problem in mobile robotics. While point-based SLAM methods provide accurate camera localization, the generated maps lack semantic information. On the other hand, state of the art object detection methods provide rich information about entities present in the scene from a single image. This work marries the two and proposes a method for representing generic objects as quadrics which allows object detections to be seamlessly integrated in a SLAM framework. For scene coverage, additional dominant planar structures are modeled as infinite planes. Experiments show that the proposed points-planes-quadrics representation can easily incorporate Manhattan and object affordance constraints, greatly improving camera localization and leading to semantically meaningful maps. The performance of our SLAM system is demonstrated in https://youtu.be/dR-rB9keF8M .

Submitted 2 November, 2018; v1 submitted 24 April, 2018; originally announced April 2018.

Comments: Accepted to ACCV 2018

Showing 1–28 of 28 results for author: HosseinZadeh, M