-
Generalizing AI-driven Assessment of Immunohistochemistry across Immunostains and Cancer Types: A Universal Immunohistochemistry Analyzer
Authors:
Biagio Brattoli,
Mohammad Mostafavi,
Taebum Lee,
Wonkyung Jung,
Jeongun Ryu,
Seonwook Park,
Jongchan Park,
Sergio Pereira,
Seunghwan Shin,
Sangjoon Choi,
Hyojin Kim,
Donggeun Yoo,
Siraj M. Ali,
Kyunghyun Paeng,
Chan-Young Ock,
Soo Ick Cho,
Seokhwi Kim
Abstract:
Despite advancements in methodologies, immunohistochemistry (IHC) remains the most utilized ancillary test for histopathologic and companion diagnostics in targeted therapies. However, objective IHC assessment poses challenges. Artificial intelligence (AI) has emerged as a potential solution, yet its development requires extensive training for each cancer and IHC type, limiting versatility. We dev…
▽ More
Despite advancements in methodologies, immunohistochemistry (IHC) remains the most utilized ancillary test for histopathologic and companion diagnostics in targeted therapies. However, objective IHC assessment poses challenges. Artificial intelligence (AI) has emerged as a potential solution, yet its development requires extensive training for each cancer and IHC type, limiting versatility. We developed a Universal IHC (UIHC) analyzer, an AI model for interpreting IHC images regardless of tumor or IHC types, using training datasets from various cancers stained for PD-L1 and/or HER2. This multi-cohort trained model outperforms conventional single-cohort models in interpreting unseen IHCs (Kappa score 0.578 vs. up to 0.509) and consistently shows superior performance across different positive staining cutoff values. Qualitative analysis reveals that UIHC effectively clusters patches based on expression levels. The UIHC model also quantitatively assesses c-MET expression with MET mutations, representing a significant advancement in AI application in the era of personalized medicine and accumulating novel biomarkers.
△ Less
Submitted 30 July, 2024;
originally announced July 2024.
-
On the Need for Configurable Travel Recommender Systems: A Systematic Mapping Study
Authors:
Rickson Simioni Pereira,
Claudio Di Sipio,
Martina De Sanctis,
Ludovico Iovino
Abstract:
Travel Recommender Systems TRSs have been proposed to ease the burden of choice in the travel domain by providing valuable suggestions based on user preferences Despite the broad similarities in functionalities and data provided by TRSs these systems are significantly influenced by the diverse and heterogeneous contexts in which they operate This plays a crucial role in determining the accuracy an…
▽ More
Travel Recommender Systems TRSs have been proposed to ease the burden of choice in the travel domain by providing valuable suggestions based on user preferences Despite the broad similarities in functionalities and data provided by TRSs these systems are significantly influenced by the diverse and heterogeneous contexts in which they operate This plays a crucial role in determining the accuracy and appropriateness of the travel recommendations they deliver For instance in contexts like smart cities and natural parks diverse runtime informationsuch as traffic conditions and trail status respectivelyshould be utilized to ensure the delivery of pertinent recommendations aligned with user preferences within the specific context However there is a trend to build TRSs from scratch for different contexts rather than supporting developers with configuration approaches that promote reuse minimize errors and accelerate timetomarket To illustrate this gap in this paper we conduct a systematic mapping study to examine the extent to which existing TRSs are configurable for different contexts The conducted analysis reveals the lack of configuration support assisting TRSs providers in developing TRSs closely tied to their operational context Our findings shed light on uncovered challenges in the domain thus fostering future research focused on providing new methodologies enabling providers to handle TRSs configurations
△ Less
Submitted 16 July, 2024;
originally announced July 2024.
-
Competing for pixels: a self-play algorithm for weakly-supervised segmentation
Authors:
Shaheer U. Saeed,
Shiqi Huang,
João Ramalhinho,
Iani J. M. B. Gayo,
Nina Montaña-Brown,
Ester Bonmati,
Stephen P. Pereira,
Brian Davidson,
Dean C. Barratt,
Matthew J. Clarkson,
Yipeng Hu
Abstract:
Weakly-supervised segmentation (WSS) methods, reliant on image-level labels indicating object presence, lack explicit correspondence between labels and regions of interest (ROIs), posing a significant challenge. Despite this, WSS methods have attracted attention due to their much lower annotation costs compared to fully-supervised segmentation. Leveraging reinforcement learning (RL) self-play, we…
▽ More
Weakly-supervised segmentation (WSS) methods, reliant on image-level labels indicating object presence, lack explicit correspondence between labels and regions of interest (ROIs), posing a significant challenge. Despite this, WSS methods have attracted attention due to their much lower annotation costs compared to fully-supervised segmentation. Leveraging reinforcement learning (RL) self-play, we propose a novel WSS method that gamifies image segmentation of a ROI. We formulate segmentation as a competition between two agents that compete to select ROI-containing patches until exhaustion of all such patches. The score at each time-step, used to compute the reward for agent training, represents likelihood of object presence within the selection, determined by an object presence detector pre-trained using only image-level binary classification labels of object presence. Additionally, we propose a game termination condition that can be called by either side upon exhaustion of all ROI-containing patches, followed by the selection of a final patch from each. Upon termination, the agent is incentivised if ROI-containing patches are exhausted or disincentivised if an ROI-containing patch is found by the competitor. This competitive setup ensures minimisation of over- or under-segmentation, a common problem with WSS methods. Extensive experimentation across four datasets demonstrates significant performance improvements over recent state-of-the-art methods. Code: https://github.com/s-sd/spurl/tree/main/wss
△ Less
Submitted 26 May, 2024;
originally announced May 2024.
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1110 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…
▽ More
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
△ Less
Submitted 8 August, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…
▽ More
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
△ Less
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
A MIL Approach for Anomaly Detection in Surveillance Videos from Multiple Camera Views
Authors:
Silas Santiago Lopes Pereira,
José Everardo Bessa Maia
Abstract:
Occlusion and clutter are two scene states that make it difficult to detect anomalies in surveillance video. Furthermore, anomaly events are rare and, as a consequence, class imbalance and lack of labeled anomaly data are also key features of this task. Therefore, weakly supervised methods are heavily researched for this application. In this paper, we tackle these typical problems of anomaly detec…
▽ More
Occlusion and clutter are two scene states that make it difficult to detect anomalies in surveillance video. Furthermore, anomaly events are rare and, as a consequence, class imbalance and lack of labeled anomaly data are also key features of this task. Therefore, weakly supervised methods are heavily researched for this application. In this paper, we tackle these typical problems of anomaly detection in surveillance video by combining Multiple Instance Learning (MIL) to deal with the lack of labels and Multiple Camera Views (MC) to reduce occlusion and clutter effects. In the resulting MC-MIL algorithm we apply a multiple camera combined loss function to train a regression network with Sultani's MIL ranking function. To evaluate the MC-MIL algorithm first proposed here, the multiple camera PETS-2009 benchmark dataset was re-labeled for the anomaly detection task from multiple camera views. The result shows a significant performance improvement in F1 score compared to the single-camera configuration.
△ Less
Submitted 12 November, 2023; v1 submitted 2 July, 2023;
originally announced July 2023.
-
OCELOT: Overlapped Cell on Tissue Dataset for Histopathology
Authors:
Jeongun Ryu,
Aaron Valero Puche,
JaeWoong Shin,
Seonwook Park,
Biagio Brattoli,
Jinhee Lee,
Wonkyung Jung,
Soo Ick Cho,
Kyunghyun Paeng,
Chan-Young Ock,
Donggeun Yoo,
Sérgio Pereira
Abstract:
Cell detection is a fundamental task in computational pathology that can be used for extracting high-level medical information from whole-slide images. For accurate cell detection, pathologists often zoom out to understand the tissue-level structures and zoom in to classify cells based on their morphology and the surrounding context. However, there is a lack of efforts to reflect such behaviors by…
▽ More
Cell detection is a fundamental task in computational pathology that can be used for extracting high-level medical information from whole-slide images. For accurate cell detection, pathologists often zoom out to understand the tissue-level structures and zoom in to classify cells based on their morphology and the surrounding context. However, there is a lack of efforts to reflect such behaviors by pathologists in the cell detection models, mainly due to the lack of datasets containing both cell and tissue annotations with overlapping regions. To overcome this limitation, we propose and publicly release OCELOT, a dataset purposely dedicated to the study of cell-tissue relationships for cell detection in histopathology. OCELOT provides overlapping cell and tissue annotations on images acquired from multiple organs. Within this setting, we also propose multi-task learning approaches that benefit from learning both cell and tissue tasks simultaneously. When compared against a model trained only for the cell detection task, our proposed approaches improve cell detection performance on 3 datasets: proposed OCELOT, public TIGER, and internal CARP datasets. On the OCELOT test set in particular, we show up to 6.79 improvement in F1-score. We believe the contributions of this paper, including the release of the OCELOT dataset at https://lunit-io.github.io/research/publications/ocelot are a crucial starting point toward the important research direction of incorporating cell-tissue relationships in computation pathology.
△ Less
Submitted 23 March, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
Benchmarking Self-Supervised Learning on Diverse Pathology Datasets
Authors:
Mingu Kang,
Heon Song,
Seonwook Park,
Donggeun Yoo,
Sérgio Pereira
Abstract:
Computational pathology can lead to saving human lives, but models are annotation hungry and pathology images are notoriously expensive to annotate. Self-supervised learning has shown to be an effective method for utilizing unlabeled data, and its application to pathology could greatly benefit its downstream tasks. Yet, there are no principled studies that compare SSL methods and discuss how to ad…
▽ More
Computational pathology can lead to saving human lives, but models are annotation hungry and pathology images are notoriously expensive to annotate. Self-supervised learning has shown to be an effective method for utilizing unlabeled data, and its application to pathology could greatly benefit its downstream tasks. Yet, there are no principled studies that compare SSL methods and discuss how to adapt them for pathology. To address this need, we execute the largest-scale study of SSL pre-training on pathology image data, to date. Our study is conducted using 4 representative SSL methods on diverse downstream tasks. We establish that large-scale domain-aligned pre-training in pathology consistently out-performs ImageNet pre-training in standard SSL settings such as linear and fine-tuning evaluations, as well as in low-label regimes. Moreover, we propose a set of domain-specific techniques that we experimentally show leads to a performance boost. Lastly, for the first time, we apply SSL to the challenging task of nuclei instance segmentation and show large and consistent performance improvements under diverse settings.
△ Less
Submitted 18 April, 2023; v1 submitted 9 December, 2022;
originally announced December 2022.
-
Active learning using adaptable task-based prioritisation
Authors:
Shaheer U. Saeed,
João Ramalhinho,
Mark Pinnock,
Ziyi Shen,
Yunguan Fu,
Nina Montaña-Brown,
Ester Bonmati,
Dean C. Barratt,
Stephen P. Pereira,
Brian Davidson,
Matthew J. Clarkson,
Yipeng Hu
Abstract:
Supervised machine learning-based medical image computing applications necessitate expert label curation, while unlabelled image data might be relatively abundant. Active learning methods aim to prioritise a subset of available image data for expert annotation, for label-efficient model training. We develop a controller neural network that measures priority of images in a sequence of batches, as i…
▽ More
Supervised machine learning-based medical image computing applications necessitate expert label curation, while unlabelled image data might be relatively abundant. Active learning methods aim to prioritise a subset of available image data for expert annotation, for label-efficient model training. We develop a controller neural network that measures priority of images in a sequence of batches, as in batch-mode active learning, for multi-class segmentation tasks. The controller is optimised by rewarding positive task-specific performance gain, within a Markov decision process (MDP) environment that also optimises the task predictor. In this work, the task predictor is a segmentation network. A meta-reinforcement learning algorithm is proposed with multiple MDPs, such that the pre-trained controller can be adapted to a new MDP that contains data from different institutes and/or requires segmentation of different organs or structures within the abdomen. We present experimental results using multiple CT datasets from more than one thousand patients, with segmentation tasks of nine different abdominal organs, to demonstrate the efficacy of the learnt prioritisation controller function and its cross-institute and cross-organ adaptability. We show that the proposed adaptable prioritisation metric yields converging segmentation accuracy for the novel class of kidney, unseen in training, using between approximately 40\% to 60\% of labels otherwise required with other heuristic or random prioritisation metrics. For clinical datasets of limited size, the proposed adaptable prioritisation offers a performance improvement of 22.6\% and 10.2\% in Dice score, for tasks of kidney and liver vessel segmentation, respectively, compared to random prioritisation and alternative active sampling strategies.
△ Less
Submitted 3 December, 2022;
originally announced December 2022.
-
Robótica Móvel e Inteligência Artificial para Investigação, Competição e Automatização de Sistemas Industriais
Authors:
Hiago Jacobs Sodre Pereira,
Pablo Ezequiel Moraes,
André Da Silva Kelbouscas,
Ricardo Grando
Abstract:
The implementation of robots to enhance some processes has become popular in recent years due to the accelerated way of production in some factories. Within this context was where robotics has emerged, firstly with stationary robots and more recently mobile robots, namely aerial and terrestrial robots. They can be used for delimited processes within a function, mainly the stationary robots, but al…
▽ More
The implementation of robots to enhance some processes has become popular in recent years due to the accelerated way of production in some factories. Within this context was where robotics has emerged, firstly with stationary robots and more recently mobile robots, namely aerial and terrestrial robots. They can be used for delimited processes within a function, mainly the stationary robots, but also for research in wider areas and even competition. This work summarizes the construction of a model of terrestrial mobile robot that makes the use of artificial intelligence for the purpose of research and competitions, all of that with the basic sensing that can be used in industry.
△ Less
Submitted 20 October, 2022;
originally announced October 2022.
-
Variability Matters : Evaluating inter-rater variability in histopathology for robust cell detection
Authors:
Cholmin Kang,
Chunggi Lee,
Heon Song,
Minuk Ma,
S ergio Pereira
Abstract:
Large annotated datasets have been a key component in the success of deep learning. However, annotating medical images is challenging as it requires expertise and a large budget. In particular, annotating different types of cells in histopathology suffer from high inter- and intra-rater variability due to the ambiguity of the task. Under this setting, the relation between annotators' variability a…
▽ More
Large annotated datasets have been a key component in the success of deep learning. However, annotating medical images is challenging as it requires expertise and a large budget. In particular, annotating different types of cells in histopathology suffer from high inter- and intra-rater variability due to the ambiguity of the task. Under this setting, the relation between annotators' variability and model performance has received little attention. We present a large-scale study on the variability of cell annotations among 120 board-certified pathologists and how it affects the performance of a deep learning model. We propose a method to measure such variability, and by excluding those annotators with low variability, we verify the trade-off between the amount of data and its quality. We found that naively increasing the data size at the expense of inter-rater variability does not necessarily lead to better-performing models in cell detection. Instead, decreasing the inter-rater variability with the expense of decreasing dataset size increased the model performance. Furthermore, models trained from data annotated with lower inter-labeler variability outperform those from higher inter-labeler variability. These findings suggest that the evaluation of the annotators may help tackle the fundamental budget issues in the histopathology domain
△ Less
Submitted 11 October, 2022;
originally announced October 2022.
-
Bao-Enclave: Virtualization-based Enclaves for Arm
Authors:
Samuel Pereira,
Joao Sousa,
Sandro Pinto,
José Martins,
David Cerdeira
Abstract:
General-purpose operating systems (GPOS), such as Linux, encompass several million lines of code. Statistically, a larger code base inevitably leads to a higher number of potential vulnerabilities and inherently a more vulnerable system. To minimize the impact of vulnerabilities in GPOS, it has become common to implement security-sensitive programs outside the domain of the GPOS, i.e., in a Truste…
▽ More
General-purpose operating systems (GPOS), such as Linux, encompass several million lines of code. Statistically, a larger code base inevitably leads to a higher number of potential vulnerabilities and inherently a more vulnerable system. To minimize the impact of vulnerabilities in GPOS, it has become common to implement security-sensitive programs outside the domain of the GPOS, i.e., in a Trusted Execution Environment (TEE). Arm TrustZone is the de-facto technology for implementing TEEs in Arm devices. However, over the last decade, TEEs have been successfully attacked hundreds of times. Unfortunately, these attacks have been possible due to the presence of several architectural and implementation flaws in TrustZone-based TEEs. In this paper, we propose Bao-Enclave, a virtualization-based solution that enables OEMs to remove security functionality from the TEE and move them into normal world isolated environments, protected from potentially malicious OSes, in the form of lightweight virtual machines (VMs). We evaluate Bao-Enclave on real hardware platforms and find out that Bao-Enclave may improve the performance of security-sensitive workloads by up to 4.8x, while significantly simplifying the TEE software TCB.
△ Less
Submitted 12 September, 2022;
originally announced September 2022.
-
Interactive Multi-Class Tiny-Object Detection
Authors:
Chunggi Lee,
Seonwook Park,
Heon Song,
Jeongun Ryu,
Sanghoon Kim,
Haejoon Kim,
Sérgio Pereira,
Donggeun Yoo
Abstract:
Annotating tens or hundreds of tiny objects in a given image is laborious yet crucial for a multitude of Computer Vision tasks. Such imagery typically contains objects from various categories, yet the multi-class interactive annotation setting for the detection task has thus far been unexplored. To address these needs, we propose a novel interactive annotation method for multiple instances of tiny…
▽ More
Annotating tens or hundreds of tiny objects in a given image is laborious yet crucial for a multitude of Computer Vision tasks. Such imagery typically contains objects from various categories, yet the multi-class interactive annotation setting for the detection task has thus far been unexplored. To address these needs, we propose a novel interactive annotation method for multiple instances of tiny objects from multiple classes, based on a few point-based user inputs. Our approach, C3Det, relates the full image context with annotator inputs in a local and global manner via late-fusion and feature-correlation, respectively. We perform experiments on the Tiny-DOTA and LCell datasets using both two-stage and one-stage object detection architectures to verify the efficacy of our approach. Our approach outperforms existing approaches in interactive annotation, achieving higher mAP with fewer clicks. Furthermore, we validate the annotation efficiency of our approach in a user study where it is shown to be 2.85x faster and yield only 0.36x task load (NASA-TLX, lower is better) compared to manual annotation. The code is available at https://github.com/ChungYi347/Interactive-Multi-Class-Tiny-Object-Detection.
△ Less
Submitted 29 March, 2022;
originally announced March 2022.
-
A Framework for Controlling Multi-Robot Systems Using Bayesian Optimization and Linear Combination of Vectors
Authors:
Stephen Jacobs,
R. Michael Butts,
Yu Gu,
Ali Baheri,
Guilherme A. S. Pereira
Abstract:
We propose a general framework for creating parameterized control schemes for decentralized multi-robot systems. A variety of tasks can be seen in the decentralized multi-robot literature, each with many possible control schemes. For several of them, the agents choose control velocities using algorithms that extract information from the environment and combine that information in meaningful ways.…
▽ More
We propose a general framework for creating parameterized control schemes for decentralized multi-robot systems. A variety of tasks can be seen in the decentralized multi-robot literature, each with many possible control schemes. For several of them, the agents choose control velocities using algorithms that extract information from the environment and combine that information in meaningful ways. From this basic formation, a framework is proposed that classifies each robots' measurement information as sets of relevant scalars and vectors and creates a linear combination of the measured vector sets. Along with an optimizable parameter set, the scalar measurements are used to generate the coefficients for the linear combination. With this framework and Bayesian optimization, we can create effective control systems for several multi-robot tasks, including cohesion and segregation, pattern formation, and searching/foraging.
△ Less
Submitted 23 March, 2022;
originally announced March 2022.
-
CADV: A software visualization approach for code annotations distribution
Authors:
Phyllipe Lima,
Jorge Melegati,
Everaldo Gomes,
Nathalya Stefhany Pereira,
Eduardo Guerra,
Paulo Meirelles
Abstract:
Code annotations is a widely used feature in Java systems to configure custom metadata on programming elements. Their increasing presence creates the need for approaches to assess and comprehend their usage and distribution. In this context, software visualization has been studied and researched to improve program comprehension in different aspects. This study aimed at designing a software visuali…
▽ More
Code annotations is a widely used feature in Java systems to configure custom metadata on programming elements. Their increasing presence creates the need for approaches to assess and comprehend their usage and distribution. In this context, software visualization has been studied and researched to improve program comprehension in different aspects. This study aimed at designing a software visualization approach that graphically displays how code annotations are distributed and organized in a software system and developing a tool, as a reference implementation of the approach, to generate views and interact with users. We conducted an empirical evaluation through questionnaires and interviews to evaluate our visualization approach considering four aspects: effectiveness for program comprehension, perceived usefulness, perceived ease of use, and suitability for the intended audience. The resulting data was used to perform a qualitative and quantitative analysis. The tool identifies package responsibilities providing visual information about their annotations at different levels. Using the developed tool, the participants achieved a high correctness rate in the program comprehension tasks and performed very well in questions about the overview of the system under analysis. Finally, participants perceived that the tool outperforms existing approaches for code inspection when searching for information related to code annotations. The results show that the visualization approach using the developed tool is effective in program comprehension tasks related to code annotations, which can also be used to identify responsibilities in the application packages. Moreover, it was evaluated as suitable for newcomers to overview the usage of annotations in the system and for architects to perform a deep analysis that can potentially detect misplaced annotations and abnormal growths on their usage.
△ Less
Submitted 27 June, 2022; v1 submitted 20 December, 2021;
originally announced December 2021.
-
Faster Content Delivery using RSU Caching and Vehicular Pre-caching in Vehicular Networks
Authors:
R. S. Pereira,
L. Guan,
M. Ye,
Z. Zhang
Abstract:
Most non-safety applications deployed in Vehicular Ad-hoc Network (VANET) use vehicle-to-infrastructure (V2I) and I2V communications to receive various forms of content such as periodic traffic updates, advertisements from adjacent road-side units (RSUs). In case of heavy traffic on highways and urban areas, content delivery time (CDT) can be significantly affected. Increase in CDT can be attribut…
▽ More
Most non-safety applications deployed in Vehicular Ad-hoc Network (VANET) use vehicle-to-infrastructure (V2I) and I2V communications to receive various forms of content such as periodic traffic updates, advertisements from adjacent road-side units (RSUs). In case of heavy traffic on highways and urban areas, content delivery time (CDT) can be significantly affected. Increase in CDT can be attributed to high load on the RSU or high volume of broadcasted content which can flood the network. Therefore, this paper suggests a novel caching strategy to improve CDT in high traffic areas and three major contributions have been made: (1) Design and simulation of a caching strategy to decrease the average content delivery time; (2) Evaluation and comparison of caching performance in both urban scenario and highway scenario; (3) Evaluation and comparison of caching performance in single RSU and multiple RSUs. The simulation results show that caching effectively reduces the CDT by 50% in urban scenario and 60-70% in highway scenario.
△ Less
Submitted 5 December, 2021;
originally announced December 2021.
-
Voice-assisted Image Labelling for Endoscopic Ultrasound Classification using Neural Networks
Authors:
Ester Bonmati,
Yipeng Hu,
Alexander Grimwood,
Gavin J. Johnson,
George Goodchild,
Margaret G. Keane,
Kurinchi Gurusamy,
Brian Davidson,
Matthew J. Clarkson,
Stephen P. Pereira,
Dean C. Barratt
Abstract:
Ultrasound imaging is a commonly used technology for visualising patient anatomy in real-time during diagnostic and therapeutic procedures. High operator dependency and low reproducibility make ultrasound imaging and interpretation challenging with a steep learning curve. Automatic image classification using deep learning has the potential to overcome some of these challenges by supporting ultraso…
▽ More
Ultrasound imaging is a commonly used technology for visualising patient anatomy in real-time during diagnostic and therapeutic procedures. High operator dependency and low reproducibility make ultrasound imaging and interpretation challenging with a steep learning curve. Automatic image classification using deep learning has the potential to overcome some of these challenges by supporting ultrasound training in novices, as well as aiding ultrasound image interpretation in patient with complex pathology for more experienced practitioners. However, the use of deep learning methods requires a large amount of data in order to provide accurate results. Labelling large ultrasound datasets is a challenging task because labels are retrospectively assigned to 2D images without the 3D spatial context available in vivo or that would be inferred while visually tracking structures between frames during the procedure. In this work, we propose a multi-modal convolutional neural network (CNN) architecture that labels endoscopic ultrasound (EUS) images from raw verbal comments provided by a clinician during the procedure. We use a CNN composed of two branches, one for voice data and another for image data, which are joined to predict image labels from the spoken names of anatomical landmarks. The network was trained using recorded verbal comments from expert operators. Our results show a prediction accuracy of 76% at image level on a dataset with 5 different labels. We conclude that the addition of spoken commentaries can increase the performance of ultrasound image classification, and eliminate the burden of manually labelling large EUS datasets necessary for deep learning applications.
△ Less
Submitted 12 October, 2021;
originally announced October 2021.
-
NASA Space Robotics Challenge 2 Qualification Round: An Approach to Autonomous Lunar Rover Operations
Authors:
Cagri Kilic,
Bernardo Martinez R. Jr.,
Christopher A. Tatsch,
Jared Beard,
Jared Strader,
Shounak Das,
Derek Ross,
Yu Gu,
Guilherme A. S. Pereira,
Jason N. Gross
Abstract:
Plans for establishing a long-term human presence on the Moon will require substantial increases in robot autonomy and multi-robot coordination to support establishing a lunar outpost. To achieve these objectives, algorithm design choices for the software developments need to be tested and validated for expected scenarios such as autonomous in-situ resource utilization (ISRU), localization in chal…
▽ More
Plans for establishing a long-term human presence on the Moon will require substantial increases in robot autonomy and multi-robot coordination to support establishing a lunar outpost. To achieve these objectives, algorithm design choices for the software developments need to be tested and validated for expected scenarios such as autonomous in-situ resource utilization (ISRU), localization in challenging environments, and multi-robot coordination. However, real-world experiments are extremely challenging and limited for extraterrestrial environment. Also, realistic simulation demonstrations in these environments are still rare and demanded for initial algorithm testing capabilities. To help some of these needs, the NASA Centennial Challenges program established the Space Robotics Challenge Phase 2 (SRC2) which consist of virtual robotic systems in a realistic lunar simulation environment, where a group of mobile robots were tasked with reporting volatile locations within a global map, excavating and transporting these resources, and detecting and localizing a target of interest. The main goal of this article is to share our team's experiences on the design trade-offs to perform autonomous robotic operations in a virtual lunar environment and to share strategies to complete the mission requirements posed by NASA SRC2 competition during the qualification round. Of the 114 teams that registered for participation in the NASA SRC2, team Mountaineers finished as one of only six teams to receive the top qualification round prize.
△ Less
Submitted 20 September, 2021;
originally announced September 2021.
-
Towards a Trusted Execution Environment via Reconfigurable FPGA
Authors:
Sérgio Pereira,
David Cerdeira,
Cristiano Rodrigues,
Sandro Pinto
Abstract:
Trusted Execution Environments (TEEs) are used to protect sensitive data and run secure execution for security-critical applications, by providing an environment isolated from the rest of the system. However, over the last few years, TEEs have been proven weak, as either TEEs built upon security-oriented hardware extensions (e.g., Arm TrustZone) or resorting to dedicated secure elements were explo…
▽ More
Trusted Execution Environments (TEEs) are used to protect sensitive data and run secure execution for security-critical applications, by providing an environment isolated from the rest of the system. However, over the last few years, TEEs have been proven weak, as either TEEs built upon security-oriented hardware extensions (e.g., Arm TrustZone) or resorting to dedicated secure elements were exploited multiple times. In this project, we introduce Trusted Execution Environments On-Demand (TEEOD), a novel TEE design that leverages the programmable logic (PL) in the heterogeneous system on chips (SoC) as the secure execution environment. Unlike other TEE designs, TEEOD can provide high-bandwidth connections and physical on-chip isolation. We implemented a proof-of-concept (PoC) implementation targeting an Ultra96-V2 platform. The conducted evaluation demonstrated TEEOD can host up to 6 simultaneous enclaves with a resource usage per enclave of 7.0%, 3.8%, and 15.3% of the total LUTs, FFs, and BRAMS, respectively. To demonstrate the practicability of TEEOD in real-world applications, we successfully run a legacy open-source Bitcoin wallet.
△ Less
Submitted 8 July, 2021;
originally announced July 2021.
-
Barriers and Opportunities to Accessible Social Media Content Authoring
Authors:
Letícia Seixas Pereira,
José Coelho,
André Rodrigues,
João Guerreiro,
Tiago Guerreiro,
Carlos Duarte
Abstract:
User-generated content plays a key role in social networking, allowing a more active participation, socialisation, and collaboration among users. In particular, media content has been gaining a lot of ground, allowing users to express themselves through different types of formats such as images, GIFs and videos. The majority of this growing type of online content remains inaccessible to a part of…
▽ More
User-generated content plays a key role in social networking, allowing a more active participation, socialisation, and collaboration among users. In particular, media content has been gaining a lot of ground, allowing users to express themselves through different types of formats such as images, GIFs and videos. The majority of this growing type of online content remains inaccessible to a part of the population, despite available tools to mitigate this source of exclusion. We sought to understand how people are perceiving these online contents in their networks and how to support tools are being used. To do so, we performed an online survey of 258 social network users and a follow-up interview conducted with 20 of them - 7 of them self-reporting blind and 13 sighted users without a disability. Results show how the different approaches being employed by major platforms are still not sufficient to properly address this issue. Our findings reveal that mainstream users are not aware of the possibility and the benefits of adopting accessible practices. From the general perspectives of end-users experiencing accessible practices, concerning barriers encountered, and motivational factors, we also discuss further approaches to create more user engagement and awareness.
△ Less
Submitted 22 April, 2021;
originally announced April 2021.
-
Visual Servoing Approach for Autonomous UAV Landing on a Moving Vehicle
Authors:
Azarakhsh Keipour,
Guilherme A. S. Pereira,
Rogerio Bonatti,
Rohit Garg,
Puru Rastogi,
Geetesh Dubey,
Sebastian Scherer
Abstract:
Many aerial robotic applications require the ability to land on moving platforms, such as delivery trucks and marine research boats. We present a method to autonomously land an Unmanned Aerial Vehicle on a moving vehicle. A visual servoing controller approaches the ground vehicle using velocity commands calculated directly in image space. The control laws generate velocity commands in all three di…
▽ More
Many aerial robotic applications require the ability to land on moving platforms, such as delivery trucks and marine research boats. We present a method to autonomously land an Unmanned Aerial Vehicle on a moving vehicle. A visual servoing controller approaches the ground vehicle using velocity commands calculated directly in image space. The control laws generate velocity commands in all three dimensions, eliminating the need for a separate height controller. The method has shown the ability to approach and land on the moving deck in simulation, indoor and outdoor environments, and compared to the other available methods, it has provided the fastest landing approach. Unlike many existing methods for landing on fast-moving platforms, this method does not rely on additional external setups, such as RTK, motion capture system, ground station, offboard processing, or communication with the vehicle, and it requires only the minimal set of hardware and localization sensors. The videos and source codes are also provided.
△ Less
Submitted 26 December, 2022; v1 submitted 2 April, 2021;
originally announced April 2021.
-
Zero-shot super-resolution with a physically-motivated downsampling kernel for endomicroscopy
Authors:
Agnieszka Barbara Szczotka,
Dzhoshkun Ismail Shakir,
Matthew J. Clarkson,
Stephen P. Pereira,
Tom Vercauteren
Abstract:
Super-resolution (SR) methods have seen significant advances thanks to the development of convolutional neural networks (CNNs). CNNs have been successfully employed to improve the quality of endomicroscopy imaging. Yet, the inherent limitation of research on SR in endomicroscopy remains the lack of ground truth high-resolution (HR) images, commonly used for both supervised training and reference-b…
▽ More
Super-resolution (SR) methods have seen significant advances thanks to the development of convolutional neural networks (CNNs). CNNs have been successfully employed to improve the quality of endomicroscopy imaging. Yet, the inherent limitation of research on SR in endomicroscopy remains the lack of ground truth high-resolution (HR) images, commonly used for both supervised training and reference-based image quality assessment (IQA). Therefore, alternative methods, such as unsupervised SR are being explored. To address the need for non-reference image quality improvement, we designed a novel zero-shot super-resolution (ZSSR) approach that relies only on the endomicroscopy data to be processed in a self-supervised manner without the need for ground-truth HR images. We tailored the proposed pipeline to the idiosyncrasies of endomicroscopy by introducing both: a physically-motivated Voronoi downscaling kernel accounting for the endomicroscope's irregular fibre-based sampling pattern, and realistic noise patterns. We also took advantage of video sequences to exploit a sequence of images for self-supervised zero-shot image quality improvement. We run ablation studies to assess our contribution in regards to the downscaling kernel and noise simulation. We validate our methodology on both synthetic and original data. Synthetic experiments were assessed with reference-based IQA, while our results for original images were evaluated in a user study conducted with both expert and non-expert observers. The results demonstrated superior performance in image quality of ZSSR reconstructions in comparison to the baseline method. The ZSSR is also competitive when compared to supervised single-image SR, especially being the preferred reconstruction technique by experts.
△ Less
Submitted 25 March, 2021;
originally announced March 2021.
-
Real-Time Ellipse Detection for Robotics Applications
Authors:
Azarakhsh Keipour,
Guilherme A. S. Pereira,
Sebastian Scherer
Abstract:
We propose a new algorithm for real-time detection and tracking of elliptic patterns suitable for real-world robotics applications. The method fits ellipses to each contour in the image frame and rejects ellipses that do not yield a good fit. The resulting detection and tracking method is lightweight enough to be used on robots' resource-limited onboard computers, can deal with lighting variations…
▽ More
We propose a new algorithm for real-time detection and tracking of elliptic patterns suitable for real-world robotics applications. The method fits ellipses to each contour in the image frame and rejects ellipses that do not yield a good fit. The resulting detection and tracking method is lightweight enough to be used on robots' resource-limited onboard computers, can deal with lighting variations and detect the pattern even when the view is partial. The method is tested on an example application of an autonomous UAV landing on a fast-moving vehicle to show its performance indoors, outdoors, and in simulation on a real-world robotics task. The comparison with other well-known ellipse detection methods shows that our proposed algorithm outperforms other methods with the F1 score of 0.981 on a dataset with over 1500 frames. The videos of experiments, the source codes, and the collected dataset are provided with the paper at https://theairlab.org/landing-on-vehicle .
△ Less
Submitted 8 July, 2021; v1 submitted 24 February, 2021;
originally announced February 2021.
-
Hyperspherical embedding for novel class classification
Authors:
Rafael S. Pereira,
Alexis Joly,
Patrick Valduriez,
Fabio Porto
Abstract:
Deep learning models have become increasingly useful in many different industries. On the domain of image classification, convolutional neural networks proved the ability to learn robust features for the closed set problem, as shown in many different datasets, such as MNIST FASHIONMNIST, CIFAR10, CIFAR100, and IMAGENET. These approaches use deep neural networks with dense layers with softmax activ…
▽ More
Deep learning models have become increasingly useful in many different industries. On the domain of image classification, convolutional neural networks proved the ability to learn robust features for the closed set problem, as shown in many different datasets, such as MNIST FASHIONMNIST, CIFAR10, CIFAR100, and IMAGENET. These approaches use deep neural networks with dense layers with softmax activation functions in order to learn features that can separate classes in a latent space. However, this traditional approach is not useful for identifying classes unseen on the training set, known as the open set problem. A similar problem occurs in scenarios involving learning on small data. To tackle both problems, few-shot learning has been proposed. In particular, metric learning learns features that obey constraints of a metric distance in the latent space in order to perform classification. However, while this approach proves to be useful for the open set problem, current implementation requires pair-wise training, where both positive and negative examples of similar images are presented during the training phase, which limits the applicability of these approaches in large data or large class scenarios given the combinatorial nature of the possible inputs.In this paper, we present a constraint-based approach applied to the representations in the latent space under the normalized softmax loss, proposed by[18]. We experimentally validate the proposed approach for the classification of unseen classes on different datasets using both metric learning and the normalized softmax loss, on disjoint and joint scenarios. Our results show that not only our proposed strategy can be efficiently trained on larger set of classes, as it does not require pairwise learning, but also present better classification results than the metric learning strategies surpassing its accuracy by a significant margin.
△ Less
Submitted 28 February, 2022; v1 submitted 5 February, 2021;
originally announced February 2021.
-
Multi-stage Deep Layer Aggregation for Brain Tumor Segmentation
Authors:
Carlos A. Silva,
Adriano Pinto,
Sérgio Pereira,
Ana Lopes
Abstract:
Gliomas are among the most aggressive and deadly brain tumors. This paper details the proposed Deep Neural Network architecture for brain tumor segmentation from Magnetic Resonance Images. The architecture consists of a cascade of three Deep Layer Aggregation neural networks, where each stage elaborates the response using the feature maps and the probabilities of the previous stage, and the MRI ch…
▽ More
Gliomas are among the most aggressive and deadly brain tumors. This paper details the proposed Deep Neural Network architecture for brain tumor segmentation from Magnetic Resonance Images. The architecture consists of a cascade of three Deep Layer Aggregation neural networks, where each stage elaborates the response using the feature maps and the probabilities of the previous stage, and the MRI channels as inputs. The neuroimaging data are part of the publicly available Brain Tumor Segmentation (BraTS) 2020 challenge dataset, where we evaluated our proposal in the BraTS 2020 Validation and Test sets. In the Test set, the experimental results achieved a Dice score of 0.8858, 0.8297 and 0.7900, with an Hausdorff Distance of 5.32 mm, 22.32 mm and 20.44 mm for the whole tumor, core tumor and enhanced tumor, respectively.
△ Less
Submitted 2 January, 2021;
originally announced January 2021.
-
Combining unsupervised and supervised learning for predicting the final stroke lesion
Authors:
Adriano Pinto,
Sérgio Pereira,
Raphael Meier,
Roland Wiest,
Victor Alves,
Mauricio Reyes,
Carlos A. Silva
Abstract:
Predicting the final ischaemic stroke lesion provides crucial information regarding the volume of salvageable hypoperfused tissue, which helps physicians in the difficult decision-making process of treatment planning and intervention. Treatment selection is influenced by clinical diagnosis, which requires delineating the stroke lesion, as well as characterising cerebral blood flow dynamics using n…
▽ More
Predicting the final ischaemic stroke lesion provides crucial information regarding the volume of salvageable hypoperfused tissue, which helps physicians in the difficult decision-making process of treatment planning and intervention. Treatment selection is influenced by clinical diagnosis, which requires delineating the stroke lesion, as well as characterising cerebral blood flow dynamics using neuroimaging acquisitions. Nonetheless, predicting the final stroke lesion is an intricate task, due to the variability in lesion size, shape, location and the underlying cerebral haemodynamic processes that occur after the ischaemic stroke takes place. Moreover, since elapsed time between stroke and treatment is related to the loss of brain tissue, assessing and predicting the final stroke lesion needs to be performed in a short period of time, which makes the task even more complex. Therefore, there is a need for automatic methods that predict the final stroke lesion and support physicians in the treatment decision process. We propose a fully automatic deep learning method based on unsupervised and supervised learning to predict the final stroke lesion after 90 days. Our aim is to predict the final stroke lesion location and extent, taking into account the underlying cerebral blood flow dynamics that can influence the prediction. To achieve this, we propose a two-branch Restricted Boltzmann Machine, which provides specialized data-driven features from different sets of standard parametric Magnetic Resonance Imaging maps. These data-driven feature maps are then combined with the parametric Magnetic Resonance Imaging maps, and fed to a Convolutional and Recurrent Neural Network architecture. We evaluated our proposal on the publicly available ISLES 2017 testing dataset, reaching a Dice score of 0.38, Hausdorff Distance of 29.21 mm, and Average Symmetric Surface Distance of 5.52 mm.
△ Less
Submitted 2 January, 2021;
originally announced January 2021.
-
A machine learning approach to galaxy properties: joint redshift-stellar mass probability distributions with Random Forest
Authors:
S. Mucesh,
W. G. Hartley,
A. Palmese,
O. Lahav,
L. Whiteway,
A. F. L. Bluck,
A. Alarcon,
A. Amon,
K. Bechtol,
G. M. Bernstein,
A. Carnero Rosell,
M. Carrasco Kind,
A. Choi,
K. Eckert,
S. Everett,
D. Gruen,
R. A. Gruendl,
I. Harrison,
E. M. Huff,
N. Kuropatkin,
I. Sevilla-Noarbe,
E. Sheldon,
B. Yanny,
M. Aguena,
S. Allam
, et al. (50 additional authors not shown)
Abstract:
We demonstrate that highly accurate joint redshift-stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning (ML) algorithm, even with few photometric bands available. As an example, we use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses. We build two ML models: one containing deep phot…
▽ More
We demonstrate that highly accurate joint redshift-stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning (ML) algorithm, even with few photometric bands available. As an example, we use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses. We build two ML models: one containing deep photometry in the $griz$ bands, and the second reflecting the photometric scatter present in the main DES survey, with carefully constructed representative training data in each case. We validate our joint PDFs for $10,699$ test galaxies by utilizing the copula probability integral transform and the Kendall distribution function, and their univariate counterparts to validate the marginals. Benchmarked against a basic set-up of the template-fitting code BAGPIPES, our ML-based method outperforms template fitting on all of our predefined performance metrics. In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just under $6$ min with consumer computer hardware. Such speed enables PDFs to be derived in real time within analysis codes, solving potential storage issues. As part of this work we have developed GALPRO, a highly intuitive and efficient Python package to rapidly generate multivariate PDFs on-the-fly. GALPRO is documented and available for researchers to use in their cosmology and galaxy evolution studies.
△ Less
Submitted 19 February, 2021; v1 submitted 10 December, 2020;
originally announced December 2020.
-
Adaptive feature recombination and recalibration for semantic segmentation with Fully Convolutional Networks
Authors:
Sergio Pereira,
Adriano Pinto,
Joana Amorim,
Alexandrine Ribeiro,
Victor Alves,
Carlos A. Silva
Abstract:
Fully Convolutional Networks have been achieving remarkable results in image semantic segmentation, while being efficient. Such efficiency results from the capability of segmenting several voxels in a single forward pass. So, there is a direct spatial correspondence between a unit in a feature map and the voxel in the same location. In a convolutional layer, the kernel spans over all channels and…
▽ More
Fully Convolutional Networks have been achieving remarkable results in image semantic segmentation, while being efficient. Such efficiency results from the capability of segmenting several voxels in a single forward pass. So, there is a direct spatial correspondence between a unit in a feature map and the voxel in the same location. In a convolutional layer, the kernel spans over all channels and extracts information from them. We observe that linear recombination of feature maps by increasing the number of channels followed by compression may enhance their discriminative power. Moreover, not all feature maps have the same relevance for the classes being predicted. In order to learn the inter-channel relationships and recalibrate the channels to suppress the less relevant ones, Squeeze and Excitation blocks were proposed in the context of image classification with Convolutional Neural Networks. However, this is not well adapted for segmentation with Fully Convolutional Networks since they segment several objects simultaneously, hence a feature map may contain relevant information only in some locations. In this paper, we propose recombination of features and a spatially adaptive recalibration block that is adapted for semantic segmentation with Fully Convolutional Networks - the SegSE block. Feature maps are recalibrated by considering the cross-channel information together with spatial relevance. Experimental results indicate that Recombination and Recalibration improve the results of a competitive baseline, and generalize across three different problems: brain tumor segmentation, stroke penumbra estimation, and ischemic stroke lesion outcome prediction. The obtained results are competitive or outperform the state of the art in the three applications.
△ Less
Submitted 19 June, 2020;
originally announced June 2020.
-
Learning from Irregularly Sampled Data for Endomicroscopy Super-resolution: A Comparative Study of Sparse and Dense Approaches
Authors:
Agnieszka Barbara Szczotka,
Dzhoshkun Ismail Shakir,
DanieleRavi,
Matthew J. Clarkson,
Stephen P. Pereira,
Tom Vercauteren
Abstract:
Purpose: Probe-based Confocal Laser Endomicroscopy (pCLE) enables performing an optical biopsy, providing real-time microscopic images, via a probe. pCLE probes consist of multiple optical fibres arranged in a bundle, which taken together generate signals in an irregularly sampled pattern. Current pCLE reconstruction is based on interpolating irregular signals onto an over-sampled Cartesian grid,…
▽ More
Purpose: Probe-based Confocal Laser Endomicroscopy (pCLE) enables performing an optical biopsy, providing real-time microscopic images, via a probe. pCLE probes consist of multiple optical fibres arranged in a bundle, which taken together generate signals in an irregularly sampled pattern. Current pCLE reconstruction is based on interpolating irregular signals onto an over-sampled Cartesian grid, using a naive linear interpolation. It was shown that Convolutional Neural Networks (CNNs) could improve pCLE image quality. Although classical CNNs were applied to pCLE, input data were limited to reconstructed images in contrast to irregular data produced by pCLE. Methods: We compare pCLE reconstruction and super-resolution (SR) methods taking irregularly sampled or reconstructed pCLE images as input. We also propose to embed a Nadaraya-Watson (NW) kernel regression into the CNN framework as a novel trainable CNN layer. Using the NW layer and exemplar-based super-resolution, we design an NWNetSR architecture that allows for reconstructing high-quality pCLE images directly from the irregularly sampled input data. We created synthetic sparse pCLE images to evaluate our methodology. Results: The results were validated through an image quality assessment based on a combination of the following metrics: Peak signal-to-noise ratio, the Structural Similarity Index. Conclusion: Both dense and sparse CNNs outperform the reconstruction method currently used in the clinic. The main contributions of our study are a comparison of sparse and dense approach in pCLE image reconstruction, implementing trainable generalised NW kernel regression, and adaptation of synthetic data for training pCLE SR.
△ Less
Submitted 29 November, 2019;
originally announced November 2019.
-
Deep Learning Recommendation Model for Personalization and Recommendation Systems
Authors:
Maxim Naumov,
Dheevatsa Mudigere,
Hao-Jun Michael Shi,
Jianyu Huang,
Narayanan Sundaraman,
Jongsoo Park,
Xiaodong Wang,
Udit Gupta,
Carole-Jean Wu,
Alisson G. Azzolini,
Dmytro Dzhulgakov,
Andrey Mallevich,
Ilia Cherniavskii,
Yinghai Lu,
Raghuraman Krishnamoorthi,
Ansha Yu,
Volodymyr Kondratenko,
Stephanie Pereira,
Xianjie Chen,
Wenlin Chen,
Vijay Rao,
Bill Jia,
Liang Xiong,
Misha Smelyanskiy
Abstract:
With the advent of deep learning, neural network-based recommendation models have emerged as an important tool for tackling personalization and recommendation tasks. These networks differ significantly from other deep learning networks due to their need to handle categorical features and are not well studied or understood. In this paper, we develop a state-of-the-art deep learning recommendation m…
▽ More
With the advent of deep learning, neural network-based recommendation models have emerged as an important tool for tackling personalization and recommendation tasks. These networks differ significantly from other deep learning networks due to their need to handle categorical features and are not well studied or understood. In this paper, we develop a state-of-the-art deep learning recommendation model (DLRM) and provide its implementation in both PyTorch and Caffe2 frameworks. In addition, we design a specialized parallelization scheme utilizing model parallelism on the embedding tables to mitigate memory constraints while exploiting data parallelism to scale-out compute from the fully-connected layers. We compare DLRM against existing recommendation models and characterize its performance on the Big Basin AI platform, demonstrating its usefulness as a benchmark for future algorithmic experimentation and system co-design.
△ Less
Submitted 31 May, 2019;
originally announced June 2019.
-
Adversarial training with cycle consistency for unsupervised super-resolution in endomicroscopy
Authors:
Daniele Ravì,
Agnieszka Barbara Szczotka,
Stephen P Pereira,
Tom Vercauteren
Abstract:
In recent years, endomicroscopy has become increasingly used for diagnostic purposes and interventional guidance. It can provide intraoperative aids for real-time tissue characterization and can help to perform visual investigations aimed for example to discover epithelial cancers. Due to physical constraints on the acquisition process, endomicroscopy images, still today have a low number of infor…
▽ More
In recent years, endomicroscopy has become increasingly used for diagnostic purposes and interventional guidance. It can provide intraoperative aids for real-time tissue characterization and can help to perform visual investigations aimed for example to discover epithelial cancers. Due to physical constraints on the acquisition process, endomicroscopy images, still today have a low number of informative pixels which hampers their quality. Post-processing techniques, such as Super-Resolution (SR), are a potential solution to increase the quality of these images. SR techniques are often supervised, requiring aligned pairs of low-resolution (LR) and high-resolution (HR) images patches to train a model. However, in our domain, the lack of HR images hinders the collection of such pairs and makes supervised training unsuitable. For this reason, we propose an unsupervised SR framework based on an adversarial deep neural network with a physically-inspired cycle consistency, designed to impose some acquisition properties on the super-resolved images. Our framework can exploit HR images, regardless of the domain where they are coming from, to transfer the quality of the HR images to the initial LR images. This property can be particularly useful in all situations where pairs of LR/HR are not available during the training. Our quantitative analysis, validated using a database of 238 endomicroscopy video sequences from 143 patients, shows the ability of the pipeline to produce convincing super-resolved images. A Mean Opinion Score (MOS) study also confirms this quantitative image quality assessment.
△ Less
Submitted 6 February, 2019; v1 submitted 21 January, 2019;
originally announced January 2019.
-
Retinal vessel segmentation based on Fully Convolutional Neural Networks
Authors:
Américo Oliveira,
Sérgio Pereira,
Carlos A. Silva
Abstract:
The retinal vascular condition is a reliable biomarker of several ophthalmologic and cardiovascular diseases, so automatic vessel segmentation may be crucial to diagnose and monitor them. In this paper, we propose a novel method that combines the multiscale analysis provided by the Stationary Wavelet Transform with a multiscale Fully Convolutional Neural Network to cope with the varying width and…
▽ More
The retinal vascular condition is a reliable biomarker of several ophthalmologic and cardiovascular diseases, so automatic vessel segmentation may be crucial to diagnose and monitor them. In this paper, we propose a novel method that combines the multiscale analysis provided by the Stationary Wavelet Transform with a multiscale Fully Convolutional Neural Network to cope with the varying width and direction of the vessel structure in the retina. Our proposal uses rotation operations as the basis of a joint strategy for both data augmentation and prediction, which allows us to explore the information learned during training to refine the segmentation. The method was evaluated on three publicly available databases, achieving an average accuracy of 0.9576, 0.9694, and 0.9653, and average area under the ROC curve of 0.9821, 0.9905, and 0.9855 on the DRIVE, STARE, and CHASE_DB1 databases, respectively. It also appears to be robust to the training set and to the inter-rater variability, which shows its potential for real-world applications.
△ Less
Submitted 19 December, 2018; v1 submitted 17 December, 2018;
originally announced December 2018.
-
Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge
Authors:
Spyridon Bakas,
Mauricio Reyes,
Andras Jakab,
Stefan Bauer,
Markus Rempfler,
Alessandro Crimi,
Russell Takeshi Shinohara,
Christoph Berger,
Sung Min Ha,
Martin Rozycki,
Marcel Prastawa,
Esther Alberts,
Jana Lipkova,
John Freymann,
Justin Kirby,
Michel Bilello,
Hassan Fathallah-Shaykh,
Roland Wiest,
Jan Kirschke,
Benedikt Wiestler,
Rivka Colen,
Aikaterini Kotrotsou,
Pamela Lamontagne,
Daniel Marcus,
Mikhail Milchenko
, et al. (402 additional authors not shown)
Abstract:
Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles dissem…
▽ More
Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.
△ Less
Submitted 23 April, 2019; v1 submitted 5 November, 2018;
originally announced November 2018.
-
Automatic brain tumor grading from MRI data using convolutional neural networks and quality assessment
Authors:
Sergio Pereira,
Raphael Meier,
Victor Alves,
Mauricio Reyes,
Carlos A. Silva
Abstract:
Glioblastoma Multiforme is a high grade, very aggressive, brain tumor, with patients having a poor prognosis. Lower grade gliomas are less aggressive, but they can evolve into higher grade tumors over time. Patient management and treatment can vary considerably with tumor grade, ranging from tumor resection followed by a combined radio- and chemotherapy to a "wait and see" approach. Hence, tumor g…
▽ More
Glioblastoma Multiforme is a high grade, very aggressive, brain tumor, with patients having a poor prognosis. Lower grade gliomas are less aggressive, but they can evolve into higher grade tumors over time. Patient management and treatment can vary considerably with tumor grade, ranging from tumor resection followed by a combined radio- and chemotherapy to a "wait and see" approach. Hence, tumor grading is important for adequate treatment planning and monitoring. The gold standard for tumor grading relies on histopathological diagnosis of biopsy specimens. However, this procedure is invasive, time consuming, and prone to sampling error. Given these disadvantages, automatic tumor grading from widely used MRI protocols would be clinically important, as a way to expedite treatment planning and assessment of tumor evolution. In this paper, we propose to use Convolutional Neural Networks for predicting tumor grade directly from imaging data. In this way, we overcome the need for expert annotations of regions of interest. We evaluate two prediction approaches: from the whole brain, and from an automatically defined tumor region. Finally, we employ interpretability methodologies as a quality assurance stage to check if the method is using image regions indicative of tumor grade for classification.
△ Less
Submitted 25 September, 2018;
originally announced September 2018.
-
Enhancing clinical MRI Perfusion maps with data-driven maps of complementary nature for lesion outcome prediction
Authors:
Adriano Pinto,
Sergio Pereira,
Raphael Meier,
Victor Alves,
Roland Wiest,
Carlos A. Silva,
Mauricio Reyes
Abstract:
Stroke is the second most common cause of death in developed countries, where rapid clinical intervention can have a major impact on a patient's life. To perform the revascularization procedure, the decision making of physicians considers its risks and benefits based on multi-modal MRI and clinical experience. Therefore, automatic prediction of the ischemic stroke lesion outcome has the potential…
▽ More
Stroke is the second most common cause of death in developed countries, where rapid clinical intervention can have a major impact on a patient's life. To perform the revascularization procedure, the decision making of physicians considers its risks and benefits based on multi-modal MRI and clinical experience. Therefore, automatic prediction of the ischemic stroke lesion outcome has the potential to assist the physician towards a better stroke assessment and information about tissue outcome. Typically, automatic methods consider the information of the standard kinetic models of diffusion and perfusion MRI (e.g. Tmax, TTP, MTT, rCBF, rCBV) to perform lesion outcome prediction. In this work, we propose a deep learning method to fuse this information with an automated data selection of the raw 4D PWI image information, followed by a data-driven deep-learning modeling of the underlying blood flow hemodynamics. We demonstrate the ability of the proposed approach to improve prediction of tissue at risk before therapy, as compared to only using the standard clinical perfusion maps, hence suggesting on the potential benefits of the proposed data-driven raw perfusion data modelling approach.
△ Less
Submitted 12 June, 2018;
originally announced June 2018.
-
Adaptive feature recombination and recalibration for semantic segmentation: application to brain tumor segmentation in MRI
Authors:
Sérgio Pereira,
Victor Alves,
Carlos A. Silva
Abstract:
Convolutional neural networks (CNNs) have been successfully used for brain tumor segmentation, specifically, fully convolutional networks (FCNs). FCNs can segment a set of voxels at once, having a direct spatial correspondence between units in feature maps (FMs) at a given location and the corresponding classified voxels. In convolutional layers, FMs are merged to create new FMs, so, channel combi…
▽ More
Convolutional neural networks (CNNs) have been successfully used for brain tumor segmentation, specifically, fully convolutional networks (FCNs). FCNs can segment a set of voxels at once, having a direct spatial correspondence between units in feature maps (FMs) at a given location and the corresponding classified voxels. In convolutional layers, FMs are merged to create new FMs, so, channel combination is crucial. However, not all FMs have the same relevance for a given class. Recently, in classification problems, Squeeze-and-Excitation (SE) blocks have been proposed to re-calibrate FMs as a whole, and suppress the less informative ones. However, this is not optimal in FCN due to the spatial correspondence between units and voxels. In this article, we propose feature recombination through linear expansion and compression to create more complex features for semantic segmentation. Additionally, we propose a segmentation SE (SegSE) block for feature recalibration that collects contextual information, while maintaining the spatial meaning. Finally, we evaluate the proposed methods in brain tumor segmentation, using publicly available data.
△ Less
Submitted 6 June, 2018;
originally announced June 2018.
-
In-Person and Remote Participation Review at IETF: Collaborating Without Borders
Authors:
Lucas Andrade,
Juliao Braga,
Stefany Pereira,
Rafael Roque,
Marcelo Santos
Abstract:
The IETF has been acting as one of the main actors when discussing standardization of protocols and good practices on the Internet. Collaborating with the IETF community can be complex and distant for many researchers and industry members because of the financial aspect to travel to the meeting. However, it notes the collaboration between industry and academia is actively and progressively develop…
▽ More
The IETF has been acting as one of the main actors when discussing standardization of protocols and good practices on the Internet. Collaborating with the IETF community can be complex and distant for many researchers and industry members because of the financial aspect to travel to the meeting. However, it notes the collaboration between industry and academia is actively and progressively developing and refining standards within the IETF. One of the incentives for the increased participation in IETF meetings is because it is being transmitted in real time since 2015, allowing for voice and chat interaction of remote participants. Thus, in this paper, we have as objectives to give a brief vision about how to collaborate with the IETF and to analyze the importance of this new form of participation of the face-to-face meetings that has been growing in recent years.
△ Less
Submitted 15 May, 2018;
originally announced May 2018.
-
Effective deep learning training for single-image super-resolution in endomicroscopy exploiting video-registration-based reconstruction
Authors:
Daniele Ravì,
Agnieszka Barbara Szczotka,
Dzhoshkun Ismail Shakir,
Stephen P Pereira,
Tom Vercauteren
Abstract:
Purpose: Probe-based Confocal Laser Endomicroscopy (pCLE) is a recent imaging modality that allows performing in vivo optical biopsies. The design of pCLE hardware, and its reliance on an optical fibre bundle, fundamentally limits the image quality with a few tens of thousands fibres, each acting as the equivalent of a single-pixel detector, assembled into a single fibre bundle. Video-registration…
▽ More
Purpose: Probe-based Confocal Laser Endomicroscopy (pCLE) is a recent imaging modality that allows performing in vivo optical biopsies. The design of pCLE hardware, and its reliance on an optical fibre bundle, fundamentally limits the image quality with a few tens of thousands fibres, each acting as the equivalent of a single-pixel detector, assembled into a single fibre bundle. Video-registration techniques can be used to estimate high-resolution (HR) images by exploiting the temporal information contained in a sequence of low-resolution (LR) images. However, the alignment of LR frames, required for the fusion, is computationally demanding and prone to artefacts. Methods: In this work, we propose a novel synthetic data generation approach to train exemplar-based Deep Neural Networks (DNNs). HR pCLE images with enhanced quality are recovered by the models trained on pairs of estimated HR images (generated by the video-registration algorithm) and realistic synthetic LR images. Performance of three different state-of-the-art DNNs techniques were analysed on a Smart Atlas database of 8806 images from 238 pCLE video sequences. The results were validated through an extensive Image Quality Assessment (IQA) that takes into account different quality scores, including a Mean Opinion Score (MOS). Results: Results indicate that the proposed solution produces an effective improvement in the quality of the obtained reconstructed image. Conclusion: The proposed training strategy and associated DNNs allows us to perform convincing super-resolution of pCLE images.
△ Less
Submitted 23 March, 2018;
originally announced March 2018.
-
On the Implementation of a Scalable Simulator for Multiscale Hybrid-Mixed Methods
Authors:
Antonio Tadeu A. Gomes,
Weslley S. Pereira,
Frederic Valentin,
Diego Paredes
Abstract:
The family of Multiscale Hybrid-Mixed (MHM) finite element methods has received considerable attention from the mathematics and engineering community in the last few years. The MHM methods allow solving highly heterogeneous problems on coarse meshes while providing solutions with high-order precision. It embeds independent local problems which are responsible for upscaling unresolved scales into t…
▽ More
The family of Multiscale Hybrid-Mixed (MHM) finite element methods has received considerable attention from the mathematics and engineering community in the last few years. The MHM methods allow solving highly heterogeneous problems on coarse meshes while providing solutions with high-order precision. It embeds independent local problems which are responsible for upscaling unresolved scales into the numerical solution. These local contributions are brought together through a global problem defined on the skeleton of the coarse partition. Since the local problems are completely independent, they can be easily computed in parallel. In this paper, we present two simulator prototypes specifically crafted for the MHM methods, which adopt two different implementation strategies: (i) a multi-programming language approach, each language tackling different simulation issues; and (ii) a classical, single-programming language approach. Specifically, we use C++ for numerical computation of the global and local problems in a modular way; for process distribution in the simulator, we adopt the Erlang concurrent language in the first approach, and the MPI standard in the second approach. The aim of exploring these different approaches is twofold: (i) allow for the deployment of the simulator both in high-performance computing (with MPI) and in cloud computing environments (with Erlang); and (ii) pave the way for further exploration of quality attributes related to software productivity and fault-tolerance, which are key to Exascale systems. We present a performance evaluation of the two simulator prototypes taking into account their efficiency.
△ Less
Submitted 30 March, 2017;
originally announced March 2017.
-
ITCM: A Real Time Internet Traffic Classifier Monitor
Authors:
Silas Santiago Lopes Pereira,
José Everardo Bessa Maia,
Jorge Luiz de Castro e Silva
Abstract:
The continual growth of high speed networks is a challenge for real-time network analysis systems. The real time traffic classification is an issue for corporations and ISPs (Internet Service Providers). This work presents the design and implementation of a real time flow-based network traffic classification system. The classifier monitor acts as a pipeline consisting of three modules: packet capt…
▽ More
The continual growth of high speed networks is a challenge for real-time network analysis systems. The real time traffic classification is an issue for corporations and ISPs (Internet Service Providers). This work presents the design and implementation of a real time flow-based network traffic classification system. The classifier monitor acts as a pipeline consisting of three modules: packet capture and pre-processing, flow reassembly, and classification with Machine Learning (ML). The modules are built as concurrent processes with well defined data interfaces between them so that any module can be improved and updated independently. In this pipeline, the flow reassembly function becomes the bottleneck of the performance. In this implementation, was used a efficient method of reassembly which results in a average delivery delay of 0.49 seconds, approximately. For the classification module, the performances of the K-Nearest Neighbor (KNN), C4.5 Decision Tree, Naive Bayes (NB), Flexible Naive Bayes (FNB) and AdaBoost Ensemble Learning Algorithm are compared in order to validate our approach.
△ Less
Submitted 6 January, 2015;
originally announced January 2015.
-
Programing Using High Level Design With Python and FORTRAN: A Study Case in Astrophysics
Authors:
Eduardo dos Santos Pereira,
Oswaldo D. Miranda
Abstract:
In this work, we present a short review about the high level design methodology (HLDM), that is based on the use of very high level (VHL) programing language as main, and the use of the intermediate level (IL) language only for the critical processing time. The languages used are Python (VHL) and FORTRAN (IL). Moreover, this methodology, making use of the oriented object programing (OOP), permits…
▽ More
In this work, we present a short review about the high level design methodology (HLDM), that is based on the use of very high level (VHL) programing language as main, and the use of the intermediate level (IL) language only for the critical processing time. The languages used are Python (VHL) and FORTRAN (IL). Moreover, this methodology, making use of the oriented object programing (OOP), permits to produce a readable, portable and reusable code. Also is presented the concept of computational framework, that naturally appears from the OOP paradigm. As an example, we present the framework called PYGRAWC (Python framework for Gravitational Waves from Cosmological origin). Even more, we show that the use of HLDM with Python and FORTRAN produces a powerful tool for solving astrophysical problems.
△ Less
Submitted 16 July, 2012;
originally announced July 2012.
-
OGCOSMO: An auxiliary tool for the study of the Universe within hierarchical scenario of structure formation
Authors:
Eduardo dos Santos Pereira,
Oswaldo D. Miranda
Abstract:
In this work is presented the software OGCOSMO. This program was written using high level design methodology (HLDM), that is based on the use of very high level (VHL) programing language as main, and the use of the intermediate level (IL) language only for the critical processing time. The languages used are PYTHON (VHL) and FORTRAN (IL). The core of OGCOSMO is a package called OGC{\_}lib. This pa…
▽ More
In this work is presented the software OGCOSMO. This program was written using high level design methodology (HLDM), that is based on the use of very high level (VHL) programing language as main, and the use of the intermediate level (IL) language only for the critical processing time. The languages used are PYTHON (VHL) and FORTRAN (IL). The core of OGCOSMO is a package called OGC{\_}lib. This package contains a group of modules for the study of cosmological and astrophysical processes, such as: comoving distance, relation between redshift and time, cosmic star formation rate, number density of dark matter haloes and mass function of supermassive black holes (SMBHs). The software is under development and some new features will be implemented for the research of stochastic background of gravitational waves (GWs) generated by: stellar collapse to form black holes, binary systems of SMBHs. Even more, we show that the use of HLDM with PYTHON and FORTRAN is a powerful tool for producing astrophysical softwares.
△ Less
Submitted 16 July, 2012;
originally announced July 2012.
-
The use of the GARP genetic algorithm and internet grid computing in the Lifemapper world atlas of species biodiversity
Authors:
David R. B. Stockwell,
James H. Beach,
Aimee Stewart,
Gregory Vorontsov,
David Vieglais,
Ricardo Scachetti Pereira
Abstract:
Lifemapper (http://www.lifemapper.org) is a predictive electronic atlas of the Earth's biological biodiversity. Using a screensaver version of the GARP genetic algorithm for modeling species distributions, Lifemapper harnesses vast computing resources through 'volunteers' PCs similar to SETI@home, to develop models of the distribution of the worlds fauna and flora. The Lifemapper project's prima…
▽ More
Lifemapper (http://www.lifemapper.org) is a predictive electronic atlas of the Earth's biological biodiversity. Using a screensaver version of the GARP genetic algorithm for modeling species distributions, Lifemapper harnesses vast computing resources through 'volunteers' PCs similar to SETI@home, to develop models of the distribution of the worlds fauna and flora. The Lifemapper project's primary goal is to provide an up to date and comprehensive database of species maps and prediction models (i.e. a fauna and flora of the world) using available data on species' locations. The models are developed using specimen data from distributed museum collections and an archive of geospatial environmental correlates. A central server maintains a dynamic archive of species maps and models for research, outreach to the general community, and feedback to museum data providers. This paper is a case study in the role, use and justification of a genetic algorithm in development of large-scale environmental informatics infrastructure.
△ Less
Submitted 28 November, 2005;
originally announced November 2005.