Search | arXiv e-print repository

Residual-based Adaptive Huber Loss (RAHL) -- Design of an improved Huber loss for CQI prediction in 5G networks

Authors: Mina Kaviani, Jurandy Almeida, Fabio L. Verdi

Abstract: The Channel Quality Indicator (CQI) plays a pivotal role in 5G networks, optimizing infrastructure dynamically to ensure high Quality of Service (QoS). Recent research has focused on improving CQI estimation in 5G networks using machine learning. In this field, the selection of the proper loss function is critical for training an accurate model. Two commonly used loss functions are Mean Squared Er… ▽ More The Channel Quality Indicator (CQI) plays a pivotal role in 5G networks, optimizing infrastructure dynamically to ensure high Quality of Service (QoS). Recent research has focused on improving CQI estimation in 5G networks using machine learning. In this field, the selection of the proper loss function is critical for training an accurate model. Two commonly used loss functions are Mean Squared Error (MSE) and Mean Absolute Error (MAE). Roughly speaking, MSE put more weight on outliers, MAE on the majority. Here, we argue that the Huber loss function is more suitable for CQI prediction, since it combines the benefits of both MSE and MAE. To achieve this, the Huber loss transitions smoothly between MSE and MAE, controlled by a user-defined hyperparameter called delta. However, finding the right balance between sensitivity to small errors (MAE) and robustness to outliers (MSE) by manually choosing the optimal delta is challenging. To address this issue, we propose a novel loss function, named Residual-based Adaptive Huber Loss (RAHL). In RAHL, a learnable residual is added to the delta, enabling the model to adapt based on the distribution of errors in the data. Our approach effectively balances model robustness against outliers while preserving inlier data precision. The widely recognized Long Short-Term Memory (LSTM) model is employed in conjunction with RAHL, showcasing significantly improved results compared to the aforementioned loss functions. The obtained results affirm the superiority of RAHL, offering a promising avenue for enhanced CQI prediction in 5G networks. △ Less

Submitted 26 August, 2024; originally announced August 2024.

Comments: https://sol.sbc.org.br/index.php/sbrc/article/view/29822/29625

arXiv:2408.10720 [pdf, other]

Towards Foundation Models for the Industrial Forecasting of Chemical Kinetics

Authors: Imran Nasim, Joaõ Lucas de Sousa Almeida

Abstract: Scientific Machine Learning is transforming traditional engineering industries by enhancing the efficiency of existing technologies and accelerating innovation, particularly in modeling chemical reactions. Despite recent advancements, the issue of solving stiff chemically reacting problems within computational fluid dynamics remains a significant issue. In this study we propose a novel approach ut… ▽ More Scientific Machine Learning is transforming traditional engineering industries by enhancing the efficiency of existing technologies and accelerating innovation, particularly in modeling chemical reactions. Despite recent advancements, the issue of solving stiff chemically reacting problems within computational fluid dynamics remains a significant issue. In this study we propose a novel approach utilizing a multi-layer-perceptron mixer architecture (MLP-Mixer) to model the time-series of stiff chemical kinetics. We evaluate this method using the ROBER system, a benchmark model in chemical kinetics, to compare its performance with traditional numerical techniques. This study provides insight into the industrial utility of the recently developed MLP-Mixer architecture to model chemical kinetics and provides motivation for such neural architecture to be used as a base for time-series foundation models. △ Less

Submitted 20 August, 2024; originally announced August 2024.

Comments: Accepted into the IEEE CAI 2024 Workshop on Scientific Machine Learning and Its Industrial Applications (SMLIA2024)

arXiv:2408.06341 [pdf, other]

Is it a work or leisure travel? Applying text classification to identify work-related travel on social networks

Authors: Lucas Félix, Washington Cunha, Jussara Almeida

Abstract: In today's digital era, the use of Social Networks (SNs) and Location-Based SNs (LBSNs) has become integral for travelers seeking Points of Interest (POI) and sharing travel experiences. This trend is supported by the fact that a significant majority of American travelers utilize SNs during their trips. However, the abundance of information available on these platforms presents a challenge in iden… ▽ More In today's digital era, the use of Social Networks (SNs) and Location-Based SNs (LBSNs) has become integral for travelers seeking Points of Interest (POI) and sharing travel experiences. This trend is supported by the fact that a significant majority of American travelers utilize SNs during their trips. However, the abundance of information available on these platforms presents a challenge in identifying the best options. To address this issue, Recommender Systems (RS) are commonly employed to suggest POIs based on user history, with the integration of contextual information enhancing the quality of recommendations. Notably, incorporating user travel purpose, which is often overlooked but holds potential in characterizing travelers' behavior, can lead to more tailored recommendations. In this study, we propose a model to predict whether a trip is leisure or work-related, utilizing state-of-the-art Automatic Text Classification (ATC) models such as BERT, RoBERTa, and BART to enhance the understanding of user travel purposes and improve recommendation accuracy in specific travel scenarios. △ Less

Submitted 12 August, 2024; originally announced August 2024.

Comments: 2 pages, 3 tables

Journal ref: NetMob 2023

arXiv:2408.02418 [pdf, other]

Demystifying Spatial Dependence: Interactive Visualizations for Interpreting Local Spatial Autocorrelation

Authors: Lee Mason, Blanaid Hicks, Jonas Almeida

Abstract: The Local Moran's I statistic is a valuable tool for identifying localized patterns of spatial autocorrelation. Understanding these patterns is crucial in spatial analysis, but interpreting the statistic can be difficult. To simplify this process, we introduce three novel visualizations that enhance the interpretation of Local Moran's I results. These visualizations can be interactively linked to… ▽ More The Local Moran's I statistic is a valuable tool for identifying localized patterns of spatial autocorrelation. Understanding these patterns is crucial in spatial analysis, but interpreting the statistic can be difficult. To simplify this process, we introduce three novel visualizations that enhance the interpretation of Local Moran's I results. These visualizations can be interactively linked to one another, and to established visualizations, to offer a more holistic exploration of the results. We provide a JavaScript library with implementations of these new visual elements, along with a web dashboard that demonstrates their integrated use. △ Less

Submitted 5 August, 2024; originally announced August 2024.

arXiv:2407.21233 [pdf]

TMA-Grid: An open-source, zero-footprint web application for FAIR Tissue MicroArray De-arraying

Authors: Aaron Ge, Monjoy Saha, Maire A. Duggan, Petra Lenz, Mustapha Abubakar, Montserrat García-Closas, Jeya Balasubramanian, Jonas S. Almeida, Praphulla MS Bhawsar

Abstract: Background: Tissue Microarrays (TMAs) significantly increase analytical efficiency in histopathology and large-scale epidemiologic studies by allowing multiple tissue cores to be scanned on a single slide. The individual cores can be digitally extracted and then linked to metadata for analysis in a process known as de-arraying. However, TMAs often contain core misalignments and artifacts due to… ▽ More Background: Tissue Microarrays (TMAs) significantly increase analytical efficiency in histopathology and large-scale epidemiologic studies by allowing multiple tissue cores to be scanned on a single slide. The individual cores can be digitally extracted and then linked to metadata for analysis in a process known as de-arraying. However, TMAs often contain core misalignments and artifacts due to assembly errors, which can adversely affect the reliability of the extracted cores during the de-arraying process. Moreover, conventional approaches for TMA de-arraying rely on desktop solutions.Therefore, a robust yet flexible de-arraying method is crucial to account for these inaccuracies and ensure effective downstream analyses. Results: We developed TMA-Grid, an in-browser, zero-footprint, interactive web application for TMA de-arraying. This web application integrates a convolutional neural network for precise tissue segmentation and a grid estimation algorithm to match each identified core to its expected location. The application emphasizes interactivity, allowing users to easily adjust segmentation and gridding results. Operating entirely in the web-browser, TMA-Grid eliminates the need for downloads or installations and ensures data privacy. Adhering to FAIR principles (Findable, Accessible, Interoperable, and Reusable), the application and its components are designed for seamless integration into TMA research workflows. Conclusions: TMA-Grid provides a robust, user-friendly solution for TMA dearraying on the web. As an open, freely accessible platform, it lays the foundation for collaborative analyses of TMAs and similar histopathology imaging data. Availability: Web application: https://episphere.github.io/tma-grid Code: https://github.com/episphere/tma-grid Tutorial: https://youtu.be/miajqyw4BVk △ Less

Submitted 30 July, 2024; originally announced July 2024.

Comments: NA

arXiv:2407.15321 [pdf, other]

doi 10.1080/01431161.2024.2384098

Hierarchical Homogeneity-Based Superpixel Segmentation: Application to Hyperspectral Image Analysis

Authors: Luciano Carvalho Ayres, Sérgio José Melo de Almeida, José Carlos Moreira Bermudez, Ricardo Augusto Borsoi

Abstract: Hyperspectral image (HI) analysis approaches have recently become increasingly complex and sophisticated. Recently, the combination of spectral-spatial information and superpixel techniques have addressed some hyperspectral data issues, such as the higher spatial variability of spectral signatures and dimensionality of the data. However, most existing superpixel approaches do not account for speci… ▽ More Hyperspectral image (HI) analysis approaches have recently become increasingly complex and sophisticated. Recently, the combination of spectral-spatial information and superpixel techniques have addressed some hyperspectral data issues, such as the higher spatial variability of spectral signatures and dimensionality of the data. However, most existing superpixel approaches do not account for specific HI characteristics resulting from its high spectral dimension. In this work, we propose a multiscale superpixel method that is computationally efficient for processing hyperspectral data. The Simple Linear Iterative Clustering (SLIC) oversegmentation algorithm, on which the technique is based, has been extended hierarchically. Using a novel robust homogeneity testing, the proposed hierarchical approach leads to superpixels of variable sizes but with higher spectral homogeneity when compared to the classical SLIC segmentation. For validation, the proposed homogeneity-based hierarchical method was applied as a preprocessing step in the spectral unmixing and classification tasks carried out using, respectively, the Multiscale sparse Unmixing Algorithm (MUA) and the CNN-Enhanced Graph Convolutional Network (CEGCN) methods. Simulation results with both synthetic and real data show that the technique is competitive with state-of-the-art solutions. △ Less

Submitted 21 July, 2024; originally announced July 2024.

arXiv:2407.09355 [pdf]

FastImpute: A Baseline for Open-source, Reference-Free Genotype Imputation Methods -- A Case Study in PRS313

Authors: Aaron Ge, Jeya Balasubramanian, Xueyao Wu, Peter Kraft, Jonas S. Almeida

Abstract: Genotype imputation enhances genetic data by predicting missing SNPs using reference haplotype information. Traditional methods leverage linkage disequilibrium (LD) to infer untyped SNP genotypes, relying on the similarity of LD structures between genotyped target sets and fully sequenced reference panels. Recently, reference-free deep learning-based methods have emerged, offering a promising alte… ▽ More Genotype imputation enhances genetic data by predicting missing SNPs using reference haplotype information. Traditional methods leverage linkage disequilibrium (LD) to infer untyped SNP genotypes, relying on the similarity of LD structures between genotyped target sets and fully sequenced reference panels. Recently, reference-free deep learning-based methods have emerged, offering a promising alternative by predicting missing genotypes without external databases, thereby enhancing privacy and accessibility. However, these methods often produce models with tens of millions of parameters, leading to challenges such as the need for substantial computational resources to train and inefficiency for client-sided deployment. Our study addresses these limitations by introducing a baseline for a novel genotype imputation pipeline that supports client-sided imputation models generalizable across any genotyping chip and genomic region. This approach enhances patient privacy by performing imputation directly on edge devices. As a case study, we focus on PRS313, a polygenic risk score comprising 313 SNPs used for breast cancer risk prediction. Utilizing consumer genetic panels such as 23andMe, our model democratizes access to personalized genetic insights by allowing 23andMe users to obtain their PRS313 score. We demonstrate that simple linear regression can significantly improve the accuracy of PRS313 scores when calculated using SNPs imputed from consumer gene panels, such as 23andMe. Our linear regression model achieved an R^2 of 0.86, compared to 0.33 without imputation and 0.28 with simple imputation (substituting missing SNPs with the minor allele frequency). These findings suggest that popular SNP analysis libraries could benefit from integrating linear regression models for genotype imputation, providing a viable and light-weight alternative to reference based imputation. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: This paper is 16 pages long and contains 7 figures. For more information and to access related resources: * Web application: https://aaronge-2020.github.io/DeepImpute/ * Code repository: https://github.com/aaronge-2020/DeepImpute

arXiv:2407.07159 [pdf, other]

Finding Fake News Websites in the Wild

Authors: Leandro Araujo, Joao M. M. Couto, Luiz Felipe Nery, Isadora C. Rodrigues, Jussara M. Almeida, Julio C. S. Reis, Fabricio Benevenuto

Abstract: The battle against the spread of misinformation on the Internet is a daunting task faced by modern society. Fake news content is primarily distributed through digital platforms, with websites dedicated to producing and disseminating such content playing a pivotal role in this complex ecosystem. Therefore, these websites are of great interest to misinformation researchers. However, obtaining a comp… ▽ More The battle against the spread of misinformation on the Internet is a daunting task faced by modern society. Fake news content is primarily distributed through digital platforms, with websites dedicated to producing and disseminating such content playing a pivotal role in this complex ecosystem. Therefore, these websites are of great interest to misinformation researchers. However, obtaining a comprehensive list of websites labeled as producers and/or spreaders of misinformation can be challenging, particularly in developing countries. In this study, we propose a novel methodology for identifying websites responsible for creating and disseminating misinformation content, which are closely linked to users who share confirmed instances of fake news on social media. We validate our approach on Twitter by examining various execution modes and contexts. Our findings demonstrate the effectiveness of the proposed methodology in identifying misinformation websites, which can aid in gaining a better understanding of this phenomenon and enabling competent entities to tackle the problem in various areas of society. △ Less

Submitted 15 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

Comments: This is a preprint version of a submitted manuscript on the Brazilian Symposium on Multimedia and the Web (WebMedia)

arXiv:2407.01375 [pdf, other]

TransferAttn: Transferable-guided Attention Is All You Need for Video Domain Adaptation

Authors: André Sacilotti, Samuel Felipe dos Santos, Nicu Sebe, Jurandy Almeida

Abstract: Unsupervised domain adaptation (UDA) in videos is a challenging task that remains not well explored compared to image-based UDA techniques. Although vision transformers (ViT) achieve state-of-the-art performance in many computer vision tasks, their use in video domain adaptation has still been little explored. Our key idea is to use the transformer layers as a feature encoder and incorporate spati… ▽ More Unsupervised domain adaptation (UDA) in videos is a challenging task that remains not well explored compared to image-based UDA techniques. Although vision transformers (ViT) achieve state-of-the-art performance in many computer vision tasks, their use in video domain adaptation has still been little explored. Our key idea is to use the transformer layers as a feature encoder and incorporate spatial and temporal transferability relationships into the attention mechanism. A Transferable-guided Attention (TransferAttn) framework is then developed to exploit the capacity of the transformer to adapt cross-domain knowledge from different backbones. To improve the transferability of ViT, we introduce a novel and effective module named Domain Transferable-guided Attention Block~(DTAB). DTAB compels ViT to focus on the spatio-temporal transferability relationship among video frames by changing the self-attention mechanism to a transferability attention mechanism. Extensive experiments on UCF-HMDB, Kinetics-Gameplay, and Kinetics-NEC Drone datasets with different backbones, like ResNet101, I3D, and STAM, verify the effectiveness of TransferAttn compared with state-of-the-art approaches. Also, we demonstrate that DTAB yields performance gains when applied to other state-of-the-art transformer-based UDA methods from both video and image domains. The code will be made freely available. △ Less

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2404.17535 [pdf, other]

Using Neural Implicit Flow To Represent Latent Dynamics Of Canonical Systems

Authors: Imran Nasim, Joaõ Lucas de Sousa Almeida

Abstract: The recently introduced class of architectures known as Neural Operators has emerged as highly versatile tools applicable to a wide range of tasks in the field of Scientific Machine Learning (SciML), including data representation and forecasting. In this study, we investigate the capabilities of Neural Implicit Flow (NIF), a recently developed mesh-agnostic neural operator, for representing the la… ▽ More The recently introduced class of architectures known as Neural Operators has emerged as highly versatile tools applicable to a wide range of tasks in the field of Scientific Machine Learning (SciML), including data representation and forecasting. In this study, we investigate the capabilities of Neural Implicit Flow (NIF), a recently developed mesh-agnostic neural operator, for representing the latent dynamics of canonical systems such as the Kuramoto-Sivashinsky (KS), forced Korteweg-de Vries (fKdV), and Sine-Gordon (SG) equations, as well as for extracting dynamically relevant information from them. Finally we assess the applicability of NIF as a dimensionality reduction algorithm and conduct a comparative analysis with another widely recognized family of neural operators, known as Deep Operator Networks (DeepONets). △ Less

Submitted 26 April, 2024; originally announced April 2024.

Comments: Accepted into the International conference on Scientific Computation and Machine Learning 2024 (SCML 2024)

arXiv:2404.05897 [pdf, other]

ClusterRadar: an Interactive Web-Tool for the Multi-Method Exploration of Spatial Clusters Over Time

Authors: Lee Mason, Blánaid Hicks, Jonas S. Almeida

Abstract: Spatial cluster analysis, the detection of localized patterns of similarity in geospatial data, has a wide-range of applications for scientific discovery and practical decision making. One way to detect spatial clusters is by using local indicators of spatial association, such as Local Moran's I or Getis-Ord Gi*. However, different indicators tend to produce substantially different results due to… ▽ More Spatial cluster analysis, the detection of localized patterns of similarity in geospatial data, has a wide-range of applications for scientific discovery and practical decision making. One way to detect spatial clusters is by using local indicators of spatial association, such as Local Moran's I or Getis-Ord Gi*. However, different indicators tend to produce substantially different results due to their distinct operational characteristics. Choosing a suitable method or comparing results from multiple methods is a complex task. Furthermore, spatial clusters are dynamic and it is often useful to track their evolution over time, which adds an additional layer of complexity. ClusterRadar is a web-tool designed to address these analytical challenges. The tool allows users to easily perform spatial clustering and analyze the results in an interactive environment, uniquely prioritizing temporal analysis and the comparison of multiple methods. The tool's interactive dashboard presents several visualizations, each offering a distinct perspective of the temporal and methodological aspects of the spatial clustering results. ClusterRadar has several features designed to maximize its utility to a broad user-base, including support for various geospatial formats, and a fully in-browser execution environment to preserve the privacy of sensitive data. Feedback from a varied set of researchers suggests ClusterRadar's potential for enhancing the temporal analysis of spatial clusters. △ Less

Submitted 8 April, 2024; originally announced April 2024.

Comments: Submitted to IEEE Vis 2024

arXiv:2404.01446 [pdf, other]

Finding Regions of Interest in Whole Slide Images Using Multiple Instance Learning

Authors: Martim Afonso, Praphulla M. S. Bhawsar, Monjoy Saha, Jonas S. Almeida, Arlindo L. Oliveira

Abstract: Whole Slide Images (WSI), obtained by high-resolution digital scanning of microscope slides at multiple scales, are the cornerstone of modern Digital Pathology. However, they represent a particular challenge to AI-based/AI-mediated analysis because pathology labeling is typically done at slide-level, instead of tile-level. It is not just that medical diagnostics is recorded at the specimen level,… ▽ More Whole Slide Images (WSI), obtained by high-resolution digital scanning of microscope slides at multiple scales, are the cornerstone of modern Digital Pathology. However, they represent a particular challenge to AI-based/AI-mediated analysis because pathology labeling is typically done at slide-level, instead of tile-level. It is not just that medical diagnostics is recorded at the specimen level, the detection of oncogene mutation is also experimentally obtained, and recorded by initiatives like The Cancer Genome Atlas (TCGA), at the slide level. This configures a dual challenge: a) accurately predicting the overall cancer phenotype and b) finding out what cellular morphologies are associated with it at the tile level. To address these challenges, a weakly supervised Multiple Instance Learning (MIL) approach was explored for two prevalent cancer types, Invasive Breast Carcinoma (TCGA-BRCA) and Lung Squamous Cell Carcinoma (TCGA-LUSC). This approach was explored for tumor detection at low magnification levels and TP53 mutations at various levels. Our results show that a novel additive implementation of MIL matched the performance of reference implementation (AUC 0.96), and was only slightly outperformed by Attention MIL (AUC 0.97). More interestingly from the perspective of the molecular pathologist, these different AI architectures identify distinct sensitivities to morphological features (through the detection of Regions of Interest, RoI) at different amplification levels. Tellingly, TP53 mutation was most sensitive to features at the higher applications where cellular morphology is resolved. △ Less

Submitted 11 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

arXiv:2401.13630 [pdf, other]

Enabling Seamless Data Security, Consensus, and Trading in Vehicular Networks

Authors: Emanuel Vieira, João Almeida, Joaquim Ferreira, Paulo C. Bartolomeu

Abstract: Cooperative driving is an emerging paradigm to enhance the safety and efficiency of autonomous vehicles. To ensure successful cooperation, road users must reach a consensus for making collective decisions, while recording vehicular data to analyze and address failures related to such agreements. This data has the potential to provide valuable insights into various vehicular events, while also pote… ▽ More Cooperative driving is an emerging paradigm to enhance the safety and efficiency of autonomous vehicles. To ensure successful cooperation, road users must reach a consensus for making collective decisions, while recording vehicular data to analyze and address failures related to such agreements. This data has the potential to provide valuable insights into various vehicular events, while also potentially improving accountability measures. Furthermore, vehicles may benefit from the ability to negotiate and trade services among themselves, adding value to the cooperative driving framework. However, the majority of proposed systems aiming to ensure data security, consensus, or service trading, lack efficient and thoroughly validated mechanisms that consider the distinctive characteristics of vehicular networks. These limitations are amplified by a dependency on the centralized support provided by the infrastructure. Furthermore, corresponding mechanisms must diligently address security concerns, especially regarding potential malicious or misbehaving nodes, while also considering inherent constraints of the wireless medium. We introduce the Verifiable Event Extension (VEE), an applicational extension designed for Intelligent Transportation System (ITS) messages. The VEE operates seamlessly with any existing standardized vehicular communications protocol, addressing crucial aspects of data security, consensus, and trading with minimal overhead. To achieve this, we employ blockchain techniques, Byzantine fault tolerance (BFT) consensus protocols, and cryptocurrency-based mechanics. To assess our proposal's feasibility and lightweight nature, we employed a hardware-in-the-loop setup for analysis. Experimental results demonstrate the viability and efficiency of the VEE extension in overcoming the challenges posed by the distributed and opportunistic nature of wireless vehicular communications. △ Less

Submitted 24 January, 2024; originally announced January 2024.

arXiv:2401.13161 [pdf, other]

doi 10.1109/LGRS.2024.3358694

A Generalized Multiscale Bundle-Based Hyperspectral Sparse Unmixing Algorithm

Authors: Luciano Carvalho Ayres, Ricardo Augusto Borsoi, José Carlos Moreira Bermudez, Sérgio José Melo de Almeida

Abstract: In hyperspectral sparse unmixing, a successful approach employs spectral bundles to address the variability of the endmembers in the spatial domain. However, the regularization penalties usually employed aggregate substantial computational complexity, and the solutions are very noise-sensitive. We generalize a multiscale spatial regularization approach to solve the unmixing problem by incorporatin… ▽ More In hyperspectral sparse unmixing, a successful approach employs spectral bundles to address the variability of the endmembers in the spatial domain. However, the regularization penalties usually employed aggregate substantial computational complexity, and the solutions are very noise-sensitive. We generalize a multiscale spatial regularization approach to solve the unmixing problem by incorporating group sparsity-inducing mixed norms. Then, we propose a noise-robust method that can take advantage of the bundle structure to deal with endmember variability while ensuring inter- and intra-class sparsity in abundance estimation with reasonable computational cost. We also present a general heuristic to select the \emph{most representative} abundance estimation over multiple runs of the unmixing process, yielding a solution that is robust and highly reproducible. Experiments illustrate the robustness and consistency of the results when compared to related methods. △ Less

Submitted 23 January, 2024; originally announced January 2024.

arXiv:2312.13784 [pdf, other]

Benchmarking Evolutionary Community Detection Algorithms in Dynamic Networks

Authors: Giordano Paoletti, Luca Gioacchini, Marco Mellia, Luca Vassio, Jussara M. Almeida

Abstract: In dynamic complex networks, entities interact and form network communities that evolve over time. Among the many static Community Detection (CD) solutions, the modularity-based Louvain, or Greedy Modularity Algorithm (GMA), is widely employed in real-world applications due to its intuitiveness and scalability. Nevertheless, addressing CD in dynamic graphs remains an open problem, since the evolut… ▽ More In dynamic complex networks, entities interact and form network communities that evolve over time. Among the many static Community Detection (CD) solutions, the modularity-based Louvain, or Greedy Modularity Algorithm (GMA), is widely employed in real-world applications due to its intuitiveness and scalability. Nevertheless, addressing CD in dynamic graphs remains an open problem, since the evolution of the network connections may poison the identification of communities, which may be evolving at a slower pace. Hence, naively applying GMA to successive network snapshots may lead to temporal inconsistencies in the communities. Two evolutionary adaptations of GMA, sGMA and $α$GMA, have been proposed to tackle this problem. Yet, evaluating the performance of these methods and understanding to which scenarios each one is better suited is challenging because of the lack of a comprehensive set of metrics and a consistent ground truth. To address these challenges, we propose (i) a benchmarking framework for evolutionary CD algorithms in dynamic networks and (ii) a generalised modularity-based approach (NeGMA). Our framework allows us to generate synthetic community-structured graphs and design evolving scenarios with nine basic graph transformations occurring at different rates. We evaluate performance through three metrics we define, i.e. Correctness, Delay, and Stability. Our findings reveal that $α$GMA is well-suited for detecting intermittent transformations, but struggles with abrupt changes; sGMA achieves superior stability, but fails to detect emerging communities; and NeGMA appears a well-balanced solution, excelling in responsiveness and instantaneous transformations detection. △ Less

Submitted 11 January, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

Comments: Accepted at the 4th Workshop on Graphs and more Complex structures for Learning and Reasoning (GCLR) at AAAI 2024

Journal ref: 4th Workshop on Graphs and more Complex structures for Learning and Reasoning (GCLR) at AAAI 2024

arXiv:2312.12880 [pdf, other]

Testing the Segment Anything Model on radiology data

Authors: José Guilherme de Almeida, Nuno M. Rodrigues, Sara Silva, Nickolas Papanikolaou

Abstract: Deep learning models trained with large amounts of data have become a recent and effective approach to predictive problem solving -- these have become known as "foundation models" as they can be used as fundamental tools for other applications. While the paramount examples of image classification (earlier) and large language models (more recently) led the way, the Segment Anything Model (SAM) was… ▽ More Deep learning models trained with large amounts of data have become a recent and effective approach to predictive problem solving -- these have become known as "foundation models" as they can be used as fundamental tools for other applications. While the paramount examples of image classification (earlier) and large language models (more recently) led the way, the Segment Anything Model (SAM) was recently proposed and stands as the first foundation model for image segmentation, trained on over 10 million images and with recourse to over 1 billion masks. However, the question remains -- what are the limits of this foundation? Given that magnetic resonance imaging (MRI) stands as an important method of diagnosis, we sought to understand whether SAM could be used for a few tasks of zero-shot segmentation using MRI data. Particularly, we wanted to know if selecting masks from the pool of SAM predictions could lead to good segmentations. Here, we provide a critical assessment of the performance of SAM on magnetic resonance imaging data. We show that, while acceptable in a very limited set of cases, the overall trend implies that these models are insufficient for MRI segmentation across the whole volume, but can provide good segmentations in a few, specific slices. More importantly, we note that while foundation models trained on natural images are set to become key aspects of predictive modelling, they may prove ineffective when used on other imaging modalities. △ Less

Submitted 16 May, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

arXiv:2309.11464 [pdf, other]

Budget-Aware Pruning: Handling Multiple Domains with Less Parameters

Authors: Samuel Felipe dos Santos, Rodrigo Berriel, Thiago Oliveira-Santos, Nicu Sebe, Jurandy Almeida

Abstract: Deep learning has achieved state-of-the-art performance on several computer vision tasks and domains. Nevertheless, it still has a high computational cost and demands a significant amount of parameters. Such requirements hinder the use in resource-limited environments and demand both software and hardware optimization. Another limitation is that deep models are usually specialized into a single do… ▽ More Deep learning has achieved state-of-the-art performance on several computer vision tasks and domains. Nevertheless, it still has a high computational cost and demands a significant amount of parameters. Such requirements hinder the use in resource-limited environments and demand both software and hardware optimization. Another limitation is that deep models are usually specialized into a single domain or task, requiring them to learn and store new parameters for each new one. Multi-Domain Learning (MDL) attempts to solve this problem by learning a single model capable of performing well in multiple domains. Nevertheless, the models are usually larger than the baseline for a single domain. This work tackles both of these problems: our objective is to prune models capable of handling multiple domains according to a user-defined budget, making them more computationally affordable while keeping a similar classification performance. We achieve this by encouraging all domains to use a similar subset of filters from the baseline model, up to the amount defined by the user's budget. Then, filters that are not used by any domain are pruned from the network. The proposed approach innovates by better adapting to resource-limited devices while being one of the few works that handles multiple domains at test time with fewer parameters and lower computational complexity than the baseline model for a single domain. △ Less

Submitted 3 July, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2210.08101

arXiv:2309.11417

CNNs for JPEGs: A Study in Computational Cost

Authors: Samuel Felipe dos Santos, Nicu Sebe, Jurandy Almeida

Abstract: Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, defining state-of-the-art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from the RGB pixels. However, most image data are usually available in compressed format, from which the JPEG is the most widely used due to transmission and storage purpose… ▽ More Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, defining state-of-the-art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from the RGB pixels. However, most image data are usually available in compressed format, from which the JPEG is the most widely used due to transmission and storage purposes demanding a preliminary decoding process that have a high computational load and memory usage. For this reason, deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years. Those methods usually extract a frequency domain representation of the image, like DCT, by a partial decoding, and then make adaptation to typical CNNs architectures to work with them. One limitation of these current works is that, in order to accommodate the frequency domain data, the modifications made to the original model increase significantly their amount of parameters and computational complexity. On one hand, the methods have faster preprocessing, since the cost of fully decoding the images is avoided, but on the other hand, the cost of passing the images though the model is increased, mitigating the possible upside of accelerating the method. In this paper, we propose a further study of the computational cost of deep models designed for the frequency domain, evaluating the cost of decoding and passing the images through the network. We also propose handcrafted and data-driven techniques for reducing the computational complexity and the number of parameters for these models in order to keep them similar to their RGB baselines, leading to efficient models with a better trade off between computational cost and accuracy. △ Less

Submitted 22 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

Comments: A previous version of this work had already been submitted to ArXiv and is available at arXiv:2012.14426. Instead of maintaining two different submissions, we decided to submit a replacement for the previous submission

arXiv:2309.08964 [pdf, other]

Tightening Classification Boundaries in Open Set Domain Adaptation through Unknown Exploitation

Authors: Lucas Fernando Alvarenga e Silva, Nicu Sebe, Jurandy Almeida

Abstract: Convolutional Neural Networks (CNNs) have brought revolutionary advances to many research areas due to their capacity of learning from raw data. However, when those methods are applied to non-controllable environments, many different factors can degrade the model's expected performance, such as unlabeled datasets with different levels of domain shift and category shift. Particularly, when both iss… ▽ More Convolutional Neural Networks (CNNs) have brought revolutionary advances to many research areas due to their capacity of learning from raw data. However, when those methods are applied to non-controllable environments, many different factors can degrade the model's expected performance, such as unlabeled datasets with different levels of domain shift and category shift. Particularly, when both issues occur at the same time, we tackle this challenging setup as Open Set Domain Adaptation (OSDA) problem. In general, existing OSDA approaches focus their efforts only on aligning known classes or, if they already extract possible negative instances, use them as a new category learned with supervision during the course of training. We propose a novel way to improve OSDA approaches by extracting a high-confidence set of unknown instances and using it as a hard constraint to tighten the classification boundaries of OSDA methods. Especially, we adopt a new loss constraint evaluated in three different means, (1) directly with the pristine negative instances; (2) with randomly transformed negatives using data augmentation techniques; and (3) with synthetically generated negatives containing adversarial features. We assessed all approaches in an extensive set of experiments based on OVANet, where we could observe consistent improvements for two public benchmarks, the Office-31 and Office-Home datasets, yielding absolute gains of up to 1.3% for both Accuracy and H-Score on Office-31 and 5.8% for Accuracy and 4.7% for H-Score on Office-Home. △ Less

Submitted 16 September, 2023; originally announced September 2023.

Journal ref: 36th SIBGRAPI - Conference on Graphics, Patterns, and Images (SIBGRAPI'23), 2023, pp. 1-6

arXiv:2308.14782 [pdf, other]

Helping Fact-Checkers Identify Fake News Stories Shared through Images on WhatsApp

Authors: Julio C. S. Reis, Philipe Melo, Fabiano Belém, Fabricio Murai, Jussara M. Almeida, Fabricio Benevenuto

Abstract: WhatsApp has introduced a novel avenue for smartphone users to engage with and disseminate news stories. The convenience of forming interest-based groups and seamlessly sharing content has rendered WhatsApp susceptible to the exploitation of misinformation campaigns. While the process of fact-checking remains a potent tool in identifying fabricated news, its efficacy falters in the face of the unp… ▽ More WhatsApp has introduced a novel avenue for smartphone users to engage with and disseminate news stories. The convenience of forming interest-based groups and seamlessly sharing content has rendered WhatsApp susceptible to the exploitation of misinformation campaigns. While the process of fact-checking remains a potent tool in identifying fabricated news, its efficacy falters in the face of the unprecedented deluge of information generated on the Internet today. In this work, we explore automatic ranking-based strategies to propose a "fakeness score" model as a means to help fact-checking agencies identify fake news stories shared through images on WhatsApp. Based on the results, we design a tool and integrate it into a real system that has been used extensively for monitoring content during the 2018 Brazilian general election. Our experimental evaluation shows that this tool can reduce by up to 40% the amount of effort required to identify 80% of the fake news in the data when compared to current mechanisms practiced by the fact-checking agencies for the selection of news stories to be checked. △ Less

Submitted 28 August, 2023; originally announced August 2023.

Comments: This is a preprint version of an accepted manuscript on the Brazilian Symposium on Multimedia and the Web (WebMedia). Please, consider to cite it instead of this one

arXiv:2308.05864 [pdf, other]

doi 10.1038/s41592-024-02233-6

The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions

Authors: Jun Ma, Ronald Xie, Shamini Ayyadhury, Cheng Ge, Anubha Gupta, Ritu Gupta, Song Gu, Yao Zhang, Gihun Lee, Joonkee Kim, Wei Lou, Haofeng Li, Eric Upschulte, Timo Dickscheid, José Guilherme de Almeida, Yixin Wang, Lin Han, Xin Yang, Marco Labagnara, Vojislav Gligorovski, Maxime Scheder, Sahand Jamal Rahi, Carly Kempster, Alice Pollitt, Leon Espinosa , et al. (15 additional authors not shown)

Abstract: Cell segmentation is a critical step for quantitative single-cell analysis in microscopy images. Existing cell segmentation methods are often tailored to specific modalities or require manual interventions to specify hyper-parameters in different experimental settings. Here, we present a multi-modality cell segmentation benchmark, comprising over 1500 labeled images derived from more than 50 diver… ▽ More Cell segmentation is a critical step for quantitative single-cell analysis in microscopy images. Existing cell segmentation methods are often tailored to specific modalities or require manual interventions to specify hyper-parameters in different experimental settings. Here, we present a multi-modality cell segmentation benchmark, comprising over 1500 labeled images derived from more than 50 diverse biological experiments. The top participants developed a Transformer-based deep-learning algorithm that not only exceeds existing methods but can also be applied to diverse microscopy images across imaging platforms and tissue types without manual parameter adjustments. This benchmark and the improved algorithm offer promising avenues for more accurate and versatile cell analysis in microscopy imaging. △ Less

Submitted 1 April, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

Comments: NeurIPS22 Cell Segmentation Challenge: https://neurips22-cellseg.grand-challenge.org/ . Nature Methods (2024)

arXiv:2307.05717 [pdf, other]

Towards Mobility Data Science (Vision Paper)

Authors: Mohamed Mokbel, Mahmoud Sakr, Li Xiong, Andreas Züfle, Jussara Almeida, Taylor Anderson, Walid Aref, Gennady Andrienko, Natalia Andrienko, Yang Cao, Sanjay Chawla, Reynold Cheng, Panos Chrysanthis, Xiqi Fei, Gabriel Ghinita, Anita Graser, Dimitrios Gunopulos, Christian Jensen, Joon-Seok Kim, Kyoung-Sook Kim, Peer Kröger, John Krumm, Johannes Lauer, Amr Magdy, Mario Nascimento , et al. (23 additional authors not shown)

Abstract: Mobility data captures the locations of moving objects such as humans, animals, and cars. With the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies, mobility data is collected ubiquitously. In recent years, the use of mobility data has demonstrated significant impact in various domains including traffic management, urban planning, and health sciences… ▽ More Mobility data captures the locations of moving objects such as humans, animals, and cars. With the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies, mobility data is collected ubiquitously. In recent years, the use of mobility data has demonstrated significant impact in various domains including traffic management, urban planning, and health sciences. In this paper, we present the emerging domain of mobility data science. Towards a unified approach to mobility data science, we envision a pipeline having the following components: mobility data collection, cleaning, analysis, management, and privacy. For each of these components, we explain how mobility data science differs from general data science, we survey the current state of the art and describe open challenges for the research community in the coming years. △ Less

Submitted 7 March, 2024; v1 submitted 21 June, 2023; originally announced July 2023.

Comments: Updated to reflect the major revision for ACM Transactions on Spatial Algorithms and Systems (TSAS). This version reflects the final version accepted by ACM TSAS

arXiv:2307.02631 [pdf, other]

doi 10.3389/frai.2024.1343447

An explainable model to support the decision about the therapy protocol for AML

Authors: Jade M. Almeida, Giovanna A. Castro, João A. Machado-Neto, Tiago A. Almeida

Abstract: Acute Myeloid Leukemia (AML) is one of the most aggressive types of hematological neoplasm. To support the specialists' decision about the appropriate therapy, patients with AML receive a prognostic of outcomes according to their cytogenetic and molecular characteristics, often divided into three risk categories: favorable, intermediate, and adverse. However, the current risk classification has kn… ▽ More Acute Myeloid Leukemia (AML) is one of the most aggressive types of hematological neoplasm. To support the specialists' decision about the appropriate therapy, patients with AML receive a prognostic of outcomes according to their cytogenetic and molecular characteristics, often divided into three risk categories: favorable, intermediate, and adverse. However, the current risk classification has known problems, such as the heterogeneity between patients of the same risk group and no clear definition of the intermediate risk category. Moreover, as most patients with AML receive an intermediate-risk classification, specialists often demand other tests and analyses, leading to delayed treatment and worsening of the patient's clinical condition. This paper presents the data analysis and an explainable machine-learning model to support the decision about the most appropriate therapy protocol according to the patient's survival prediction. In addition to the prediction model being explainable, the results obtained are promising and indicate that it is possible to use it to support the specialists' decisions safely. Most importantly, the findings offered in this study have the potential to open new avenues of research toward better treatments and prognostic markers. △ Less

Submitted 15 July, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

Comments: Preprint of the paper accepted to be published in the Proc. of the 12th Brazilian Conference on Intelligent Systems (BRACIS'2023)

arXiv:2306.15740 [pdf, other]

Impact of User Privacy and Mobility on Edge Offloading

Authors: João Paulo Esper, Nadjib Achir, Kleber Vieira Cardoso, Jussara M. Almeida

Abstract: Offloading high-demanding applications to the edge provides better quality of experience (QoE) for users with limited hardware devices. However, to maintain a competitive QoE, infrastructure, and service providers must adapt to users' different mobility patterns, which can be challenging, especially for location-based services (LBS). Another issue that needs to be tackled is the increasing demand… ▽ More Offloading high-demanding applications to the edge provides better quality of experience (QoE) for users with limited hardware devices. However, to maintain a competitive QoE, infrastructure, and service providers must adapt to users' different mobility patterns, which can be challenging, especially for location-based services (LBS). Another issue that needs to be tackled is the increasing demand for user privacy protection. With less (accurate) information regarding user location, preferences, and usage patterns, forecasting the performance of offloading mechanisms becomes even more challenging. This work discusses the impacts of users' privacy and mobility when offloading to the edge. Different privacy and mobility scenarios are simulated and discussed to shed light on the trade-offs (e.g., privacy protection at the cost of increased latency) among privacy protection, mobility, and offloading performance. △ Less

Submitted 27 June, 2023; originally announced June 2023.

Comments: 2023 Annual IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (IEEE PIMRC 2023)

arXiv:2306.13116 [pdf, other]

A Machine Learning Pressure Emulator for Hydrogen Embrittlement

Authors: Minh Triet Chau, João Lucas de Sousa Almeida, Elie Alhajjar, Alberto Costa Nogueira Junior

Abstract: A recent alternative for hydrogen transportation as a mixture with natural gas is blending it into natural gas pipelines. However, hydrogen embrittlement of material is a major concern for scientists and gas installation designers to avoid process failures. In this paper, we propose a physics-informed machine learning model to predict the gas pressure on the pipes' inner wall. Despite its high-fid… ▽ More A recent alternative for hydrogen transportation as a mixture with natural gas is blending it into natural gas pipelines. However, hydrogen embrittlement of material is a major concern for scientists and gas installation designers to avoid process failures. In this paper, we propose a physics-informed machine learning model to predict the gas pressure on the pipes' inner wall. Despite its high-fidelity results, the current PDE-based simulators are time- and computationally-demanding. Using simulation data, we train an ML model to predict the pressure on the pipelines' inner walls, which is a first step for pipeline system surveillance. We found that the physics-based method outperformed the purely data-driven method and satisfy the physical constraints of the gas flow system. △ Less

Submitted 22 June, 2023; originally announced June 2023.

arXiv:2305.11990 [pdf, other]

doi 10.1109/lgrs.2023.3296064

Productive Crop Field Detection: A New Dataset and Deep Learning Benchmark Results

Authors: Eduardo Nascimento, John Just, Jurandy Almeida, Tiago Almeida

Abstract: In precision agriculture, detecting productive crop fields is an essential practice that allows the farmer to evaluate operating performance separately and compare different seed varieties, pesticides, and fertilizers. However, manually identifying productive fields is often a time-consuming and error-prone task. Previous studies explore different methods to detect crop fields using advanced machi… ▽ More In precision agriculture, detecting productive crop fields is an essential practice that allows the farmer to evaluate operating performance separately and compare different seed varieties, pesticides, and fertilizers. However, manually identifying productive fields is often a time-consuming and error-prone task. Previous studies explore different methods to detect crop fields using advanced machine learning algorithms, but they often lack good quality labeled data. In this context, we propose a high-quality dataset generated by machine operation combined with Sentinel-2 images tracked over time. As far as we know, it is the first one to overcome the lack of labeled samples by using this technique. In sequence, we apply a semi-supervised classification of unlabeled data and state-of-the-art supervised and self-supervised deep learning methods to detect productive crop fields automatically. Finally, the results demonstrate high accuracy in Positive Unlabeled learning, which perfectly fits the problem where we have high confidence in the positive samples. Best performances have been found in Triplet Loss Siamese given the existence of an accurate dataset and Contrastive Learning considering situations where we do not have a comprehensive labeled dataset available. △ Less

Submitted 25 July, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

Comments: Preprint of the paper https://doi.org/10.1109/lgrs.2023.3296064 published in IEEE Geoscience and Remote Sensing Letters

arXiv:2304.10612 [pdf]

Halcyon -- A Pathology Imaging and Feature analysis and Management System

Authors: Erich Bremer, Tammy DiPrima, Joseph Balsamo, Jonas Almeida, Rajarsi Gupta, Joel Saltz

Abstract: Halcyon is a new pathology imaging analysis and feature management system based on W3C linked-data open standards and is designed to scale to support the needs for the voluminous production of features from deep-learning feature pipelines. Halcyon can support multiple users with a web-based UX with access to all user data over a standards-based web API allowing for integration with other processes… ▽ More Halcyon is a new pathology imaging analysis and feature management system based on W3C linked-data open standards and is designed to scale to support the needs for the voluminous production of features from deep-learning feature pipelines. Halcyon can support multiple users with a web-based UX with access to all user data over a standards-based web API allowing for integration with other processes and software systems. Identity management and data security is also provided. △ Less

Submitted 7 April, 2023; originally announced April 2023.

Comments: 15 pages, 11 figures. arXiv admin note: text overlap with arXiv:2005.06469

arXiv:2211.17045 [pdf, other]

doi 10.1109/SSCI50451.2021.9660128

From Actions to Events: A Transfer Learning Approach Using Improved Deep Belief Networks

Authors: Mateus Roder, Jurandy Almeida, Gustavo H. de Rosa, Leandro A. Passos, André L. D. Rossi, João P. Papa

Abstract: In the last decade, exponential data growth supplied machine learning-based algorithms' capacity and enabled their usage in daily-life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors produce outstanding results, they also pose drawbacks… ▽ More In the last decade, exponential data growth supplied machine learning-based algorithms' capacity and enabled their usage in daily-life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors produce outstanding results, they also pose drawbacks regarding the learning process as training complex models over large datasets are expensive and time-consuming. Such a problem is even more evident when dealing with video analysis. Some works have considered transfer learning or domain adaptation, i.e., approaches that map the knowledge from one domain to another, to ease the training burden, yet most of them operate over individual or small blocks of frames. This paper proposes a novel approach to map the knowledge from action recognition to event recognition using an energy-based model, denoted as Spectral Deep Belief Network. Such a model can process all frames simultaneously, carrying spatial and temporal information through the learning process. The experimental results conducted over two public video dataset, the HMDB-51 and the UCF-101, depict the effectiveness of the proposed model and its reduced computational burden when compared to traditional energy-based models, such as Restricted Boltzmann Machines and Deep Belief Networks. △ Less

Submitted 30 November, 2022; originally announced November 2022.

arXiv:2211.15330 [pdf, other]

UAS in the Airspace: A Review on Integration, Simulation, Optimization, and Open Challenges

Authors: Euclides Carlos Pinto Neto, Derick Moreira Baum, Jorge Rady de Almeida Jr., Joao Batista Camargo Jr., Paulo Sergio Cugnasca

Abstract: Air transportation is essential for society, and it is increasing gradually due to its importance. To improve the airspace operation, new technologies are under development, such as Unmanned Aircraft Systems (UAS). In fact, in the past few years, there has been a growth in UAS numbers in segregated airspace. However, there is an interest in integrating these aircraft into the National Airspace Sys… ▽ More Air transportation is essential for society, and it is increasing gradually due to its importance. To improve the airspace operation, new technologies are under development, such as Unmanned Aircraft Systems (UAS). In fact, in the past few years, there has been a growth in UAS numbers in segregated airspace. However, there is an interest in integrating these aircraft into the National Airspace System (NAS). The UAS is vital to different industries due to its advantages brought to the airspace (e.g., efficiency). Conversely, the relationship between UAS and Air Traffic Control (ATC) needs to be well-defined due to the impacts on ATC capacity these aircraft may present. Throughout the years, this impact may be lower than it is nowadays because the current lack of familiarity in this relationship contributes to higher workload levels. Thereupon, the primary goal of this research is to present a comprehensive review of the advancements in the integration of UAS in the National Airspace System (NAS) from different perspectives. We consider the challenges regarding simulation, final approach, and optimization of problems related to the interoperability of such systems in the airspace. Finally, we identify several open challenges in the field based on the existing state-of-the-art proposals. △ Less

Submitted 24 November, 2022; originally announced November 2022.

arXiv:2210.08101 [pdf, other]

doi 10.1007/978-3-031-43153-1_40

Budget-Aware Pruning for Multi-Domain Learning

Authors: Samuel Felipe dos Santos, Rodrigo Berriel, Thiago Oliveira-Santos, Nicu Sebe, Jurandy Almeida

Abstract: Deep learning has achieved state-of-the-art performance on several computer vision tasks and domains. Nevertheless, it still has a high computational cost and demands a significant amount of parameters. Such requirements hinder the use in resource-limited environments and demand both software and hardware optimization. Another limitation is that deep models are usually specialized into a single do… ▽ More Deep learning has achieved state-of-the-art performance on several computer vision tasks and domains. Nevertheless, it still has a high computational cost and demands a significant amount of parameters. Such requirements hinder the use in resource-limited environments and demand both software and hardware optimization. Another limitation is that deep models are usually specialized into a single domain or task, requiring them to learn and store new parameters for each new one. Multi-Domain Learning (MDL) attempts to solve this problem by learning a single model that is capable of performing well in multiple domains. Nevertheless, the models are usually larger than the baseline for a single domain. This work tackles both of these problems: our objective is to prune models capable of handling multiple domains according to a user defined budget, making them more computationally affordable while keeping a similar classification performance. We achieve this by encouraging all domains to use a similar subset of filters from the baseline model, up to the amount defined by the user's budget. Then, filters that are not used by any domain are pruned from the network. The proposed approach innovates by better adapting to resource-limited devices while, to our knowledge, being the only work that is capable of handling multiple domains at test time with fewer parameters and lower computational complexity than the baseline model for a single domain. △ Less

Submitted 16 September, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

Journal ref: 22nd International Conference on Image Analysis and Processing (ICIAP'23), 2023, pp. 477-489

arXiv:2207.01734 [pdf]

ImageBox3: No-Server Tile Serving to Traverse Whole Slide Images on the Web

Authors: Praphulla MS Bhawsar, Erich Bremer, Máire A Duggan, Stephen Chanock, Montserrat Garcia-Closas, Joel Saltz, Jonas S Almeida

Abstract: Whole slide imaging (WSI) has become the primary modality for digital pathology data. However, due to the size and high-resolution nature of these images, they are generally only accessed in smaller sections or tiles via specialized platforms, most of which require extensive setup and/or costly infrastructure. These platforms typically also need a copy of the images to be locally available to them… ▽ More Whole slide imaging (WSI) has become the primary modality for digital pathology data. However, due to the size and high-resolution nature of these images, they are generally only accessed in smaller sections or tiles via specialized platforms, most of which require extensive setup and/or costly infrastructure. These platforms typically also need a copy of the images to be locally available to them, potentially causing issues with data governance and provenance. To address these concerns, we developed ImageBox3, an in-browser tiling mechanism to enable zero-footprint traversal of remote WSI data. All computation is performed client-side without compromising user governance, operating public and private images alike as long as the storage service supports HTTP range requests (standard in Cloud storage and most web servers). ImageBox3 thus removes significant hurdles to WSI operation and effective collaboration, allowing for the sort of democratized analytical tools needed to establish participative, FAIR digital pathology data commons. Availability: code - https://github.com/episphere/imagebox3; fig1 (live) - https://episphere.github.io/imagebox3/demo/scriptTag ; fig2 (live) - https://episphere.github.io/imagebox3/demo/serviceWorker ; fig 3 (live) - https://observablehq.com/@prafulb/imagebox3-in-observable . △ Less

Submitted 5 July, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

Comments: 9 pages, 3 figures

arXiv:2206.01604 [pdf, other]

Non-Intrusive Reduced Models based on Operator Inference for Chaotic Systems

Authors: João Lucas de Sousa Almeida, Arthur Cancellieri Pires, Klaus Feine Vaz Cid, Alberto Costa Nogueira Junior

Abstract: This work explores the physics-driven machine learning technique Operator Inference (OpInf) for predicting the state of chaotic dynamical systems. OpInf provides a non-intrusive approach to infer approximations of polynomial operators in reduced space without having access to the full order operators appearing in discretized models. Datasets for the physics systems are generated using conventional… ▽ More This work explores the physics-driven machine learning technique Operator Inference (OpInf) for predicting the state of chaotic dynamical systems. OpInf provides a non-intrusive approach to infer approximations of polynomial operators in reduced space without having access to the full order operators appearing in discretized models. Datasets for the physics systems are generated using conventional numerical solvers and then projected to a low-dimensional space via Principal Component Analysis (PCA). In latent space, a least-squares problem is set to fit a quadratic polynomial operator, which is subsequently employed in a time-integration scheme in order to produce extrapolations in the same space. Once solved, the inverse PCA operation is applied to reconstruct the extrapolations in the original space. The quality of the OpInf predictions is assessed via the Normalized Root Mean Squared Error (NRMSE) metric from which the Valid Prediction Time (VPT) is computed. Numerical experiments considering the chaotic systems Lorenz 96 and the Kuramoto-Sivashinsky equation show promising forecasting capabilities of the OpInf reduced order models with VPT ranges that outperform state-of-the-art machine learning methods such as backpropagation and reservoir computing recurrent neural networks [1], as well as Markov neural operators [2]. △ Less

Submitted 21 September, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

Comments: 16 pages, 37 figures, accepted for publication in the IEEE-TAI-PIML

arXiv:2204.13572 [pdf, other]

Mixup-based Deep Metric Learning Approaches for Incomplete Supervision

Authors: Luiz H. Buris, Daniel C. G. Pedronette, Joao P. Papa, Jurandy Almeida, Gustavo Carneiro, Fabio A. Faria

Abstract: Deep learning architectures have achieved promising results in different areas (e.g., medicine, agriculture, and security). However, using those powerful techniques in many real applications becomes challenging due to the large labeled collections required during training. Several works have pursued solutions to overcome it by proposing strategies that can learn more for less, e.g., weakly and sem… ▽ More Deep learning architectures have achieved promising results in different areas (e.g., medicine, agriculture, and security). However, using those powerful techniques in many real applications becomes challenging due to the large labeled collections required during training. Several works have pursued solutions to overcome it by proposing strategies that can learn more for less, e.g., weakly and semi-supervised learning approaches. As these approaches do not usually address memorization and sensitivity to adversarial examples, this paper presents three deep metric learning approaches combined with Mixup for incomplete-supervision scenarios. We show that some state-of-the-art approaches in metric learning might not work well in such scenarios. Moreover, the proposed approaches outperform most of them in different datasets. △ Less

Submitted 27 August, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

Comments: 5 pages, 1 figure, accepted for presentation at the ICIP2022

arXiv:2111.06161 [pdf, other]

Understanding mobility in networks: A node embedding approach

Authors: Matheus F. C. Barros, Carlos H. G. Ferreira, Bruno Pereira dos Santos, Lourenço A. P. Júnior, Marco Mellia, Jussara M. Almeida

Abstract: Motivated by the growing number of mobile devices capable of connecting and exchanging messages, we propose a methodology aiming to model and analyze node mobility in networks. We note that many existing solutions in the literature rely on topological measurements calculated directly on the graph of node contacts, aiming to capture the notion of the node's importance in terms of connectivity and m… ▽ More Motivated by the growing number of mobile devices capable of connecting and exchanging messages, we propose a methodology aiming to model and analyze node mobility in networks. We note that many existing solutions in the literature rely on topological measurements calculated directly on the graph of node contacts, aiming to capture the notion of the node's importance in terms of connectivity and mobility patterns beneficial for prototyping, design, and deployment of mobile networks. However, each measure has its specificity and fails to generalize the node importance notions that ultimately change over time. Unlike previous approaches, our methodology is based on a node embedding method that models and unveils the nodes' importance in mobility and connectivity patterns while preserving their spatial and temporal characteristics. We focus on a case study based on a trace of group meetings. The results show that our methodology provides a rich representation for extracting different mobility and connectivity patterns, which can be helpful for various applications and services in mobile networks. △ Less

Submitted 11 November, 2021; originally announced November 2021.

arXiv:2109.10462 [pdf, other]

A Hierarchical Network-Oriented Analysis of User Participation in Misinformation Spread on WhatsApp

Authors: Gabriel Peres Nobre, Carlos H. G. Ferreira, Jussara M. Almeida

Abstract: WhatsApp emerged as a major communication platform in many countries in the recent years. Despite offering only one-to-one and small group conversations, WhatsApp has been shown to enable the formation of a rich underlying network, crossing the boundaries of existing groups, and with structural properties that favor information dissemination at large. Indeed, WhatsApp has reportedly been used as a… ▽ More WhatsApp emerged as a major communication platform in many countries in the recent years. Despite offering only one-to-one and small group conversations, WhatsApp has been shown to enable the formation of a rich underlying network, crossing the boundaries of existing groups, and with structural properties that favor information dissemination at large. Indeed, WhatsApp has reportedly been used as a forum of misinformation campaigns with significant social, political and economic consequences in several countries. In this article, we aim at complementing recent studies on misinformation spread on WhatsApp, mostly focused on content properties and propagation dynamics, by looking into the network that connects users sharing the same piece of content. Specifically, we present a hierarchical network-oriented characterization of the users engaged in misinformation spread by focusing on three perspectives: individuals, WhatsApp groups and user communities, i.e., groupings of users who, intentionally or not, share the same content disproportionately often. By analyzing sharing and network topological properties, our study offers valuable insights into how WhatsApp users leverage the underlying network connecting different groups to gain large reach in the spread of misinformation on the platform. △ Less

Submitted 21 September, 2021; originally announced September 2021.

Comments: Paper Accepted in Information Processing & Management, Elsevier

arXiv:2109.09152 [pdf, other]

doi 10.1016/j.osnem.2021.100155.

On the Dynamics of Political Discussions on Instagram: A Network Perspective

Authors: Carlos H. G. Ferreira, Fabricio Murai, Ana P. C. Silva, Jussara M. Almeida, Martino Trevisan, Luca Vassio, Marco Mellia, Idilio Drago

Abstract: Instagram has been increasingly used as a source of information especially among the youth. As a result, political figures now leverage the platform to spread opinions and political agenda. We here analyze online discussions on Instagram, notably in political topics, from a network perspective. Specifically, we investigate the emergence of communities of co-commenters, that is, groups of users who… ▽ More Instagram has been increasingly used as a source of information especially among the youth. As a result, political figures now leverage the platform to spread opinions and political agenda. We here analyze online discussions on Instagram, notably in political topics, from a network perspective. Specifically, we investigate the emergence of communities of co-commenters, that is, groups of users who often interact by commenting on the same posts and may be driving the ongoing online discussions. In particular, we are interested in salient co-interactions, i.e., interactions of co-commenters that occur more often than expected by chance and under independent behavior. Unlike casual and accidental co-interactions which normally happen in large volumes, salient co-interactions are key elements driving the online discussions and, ultimately, the information dissemination. We base our study on the analysis of 10 weeks of data centered around major elections in Brazil and Italy, following both politicians and other celebrities. We extract and characterize the communities of co-commenters in terms of topological structure, properties of the discussions carried out by community members, and how some community properties, notably community membership and topics, evolve over time. We show that communities discussing political topics tend to be more engaged in the debate by writing longer comments, using more emojis, hashtags and negative words than in other subjects. Also, communities built around political discussions tend to be more dynamic, although top commenters remain active and preserve community membership over time. Moreover, we observe a great diversity in discussed topics over time: whereas some topics attract attention only momentarily, others, centered around more fundamental political discussions, remain consistently active over time. △ Less

Submitted 13 September, 2022; v1 submitted 19 September, 2021; originally announced September 2021.

Journal ref: Online Social Networks and Media, Volume 25, 2021, ISSN 2468-6964

arXiv:2109.02693 [pdf, other]

doi 10.1109/SIBGRAPI54419.2021.00031

Improving Transferability of Domain Adaptation Networks Through Domain Alignment Layers

Authors: Lucas Fernando Alvarenga e Silva, Daniel Carlos Guimarães Pedronette, Fábio Augusto Faria, João Paulo Papa, Jurandy Almeida

Abstract: Deep learning (DL) has been the primary approach used in various computer vision tasks due to its relevant results achieved on many tasks. However, on real-world scenarios with partially or no labeled data, DL methods are also prone to the well-known domain shift problem. Multi-source unsupervised domain adaptation (MSDA) aims at learning a predictor for an unlabeled domain by assigning weak knowl… ▽ More Deep learning (DL) has been the primary approach used in various computer vision tasks due to its relevant results achieved on many tasks. However, on real-world scenarios with partially or no labeled data, DL methods are also prone to the well-known domain shift problem. Multi-source unsupervised domain adaptation (MSDA) aims at learning a predictor for an unlabeled domain by assigning weak knowledge from a bag of source models. However, most works conduct domain adaptation leveraging only the extracted features and reducing their domain shift from the perspective of loss function designs. In this paper, we argue that it is not sufficient to handle domain shift only based on domain-level features, but it is also essential to align such information on the feature space. Unlike previous works, we focus on the network design and propose to embed Multi-Source version of DomaIn Alignment Layers (MS-DIAL) at different levels of the predictor. These layers are designed to match the feature distributions between different domains and can be easily applied to various MSDA methods. To show the robustness of our approach, we conducted an extensive experimental evaluation considering two challenging scenarios: digit recognition and object classification. The experimental results indicated that our approach can improve state-of-the-art MSDA methods, yielding relative gains of up to +30.64% on their classification accuracies. △ Less

Submitted 24 August, 2022; v1 submitted 6 September, 2021; originally announced September 2021.

Journal ref: in 2021 34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 2021, pp. 168-175

arXiv:2108.12214 [pdf, other]

Machine Learning for Performance Prediction of Spark Cloud Applications

Authors: Alexandre Maros, Fabricio Murai, Ana Paula Couto da Silva, Jussara M. Almeida, Marco Lattuada, Eugenio Gianniti, Marjan Hosseini, Danilo Ardagna

Abstract: Big data applications and analytics are employed in many sectors for a variety of goals: improving customers satisfaction, predicting market behavior or improving processes in public health. These applications consist of complex software stacks that are often run on cloud systems. Predicting execution times is important for estimating the cost of cloud services and for effectively managing the und… ▽ More Big data applications and analytics are employed in many sectors for a variety of goals: improving customers satisfaction, predicting market behavior or improving processes in public health. These applications consist of complex software stacks that are often run on cloud systems. Predicting execution times is important for estimating the cost of cloud services and for effectively managing the underlying resources at runtime. Machine Learning (ML), providing black box solutions to model the relationship between application performance and system configuration without requiring in-detail knowledge of the system, has become a popular way of predicting the performance of big data applications. We investigate the cost-benefits of using supervised ML models for predicting the performance of applications on Spark, one of today's most widely used frameworks for big data analysis. We compare our approach with \textit{Ernest} (an ML-based technique proposed in the literature by the Spark inventors) on a range of scenarios, application workloads, and cloud system configurations. Our experiments show that Ernest can accurately estimate the performance of very regular applications, but it fails when applications exhibit more irregular patterns and/or when extrapolating on bigger data set sizes. Results show that our models match or exceed Ernest's performance, sometimes enabling us to reduce the prediction error from 126-187% to only 5-19%. △ Less

Submitted 27 August, 2021; originally announced August 2021.

Comments: Published in 2019 IEEE 12th International Conference on Cloud Computing (CLOUD)

ACM Class: B.8.2; I.2

arXiv:2107.08492 [pdf, ps, other]

SENSORIMOTOR GRAPH: Action-Conditioned Graph Neural Network for Learning Robotic Soft Hand Dynamics

Authors: João Damião Almeida, Paul Schydlo, Atabak Dehban, José Santos-Victor

Abstract: Soft robotics is a thriving branch of robotics which takes inspiration from nature and uses affordable flexible materials to design adaptable non-rigid robots. However, their flexible behavior makes these robots hard to model, which is essential for a precise actuation and for optimal control. For system modelling, learning-based approaches have demonstrated good results, yet they fail to consider… ▽ More Soft robotics is a thriving branch of robotics which takes inspiration from nature and uses affordable flexible materials to design adaptable non-rigid robots. However, their flexible behavior makes these robots hard to model, which is essential for a precise actuation and for optimal control. For system modelling, learning-based approaches have demonstrated good results, yet they fail to consider the physical structure underlying the system as an inductive prior. In this work, we take inspiration from sensorimotor learning, and apply a Graph Neural Network to the problem of modelling a non-rigid kinematic chain (i.e. a robotic soft hand) taking advantage of two key properties: 1) the system is compositional, that is, it is composed of simple interacting parts connected by edges, 2) it is order invariant, i.e. only the structure of the system is relevant for predicting future trajectories. We denote our model as the 'Sensorimotor Graph' since it learns the system connectivity from observation and uses it for dynamics prediction. We validate our model in different scenarios and show that it outperforms the non-structured baselines in dynamics prediction while being more robust to configurational variations, tracking errors or node failures. △ Less

Submitted 18 July, 2021; originally announced July 2021.

Comments: Accepted at 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)

arXiv:2106.03626 [pdf, ps, other]

Towards Formal Verification of Password Generation Algorithms used in Password Managers

Authors: Miguel Grilo, João F. Ferreira, José Bacelar Almeida

Abstract: Password managers are important tools that enable us to use stronger passwords, freeing us from the cognitive burden of remembering them. Despite this, there are still many users who do not fully trust password managers. In this paper, we focus on a feature that most password managers offer that might impact the user's trust, which is the process of generating a random password. We survey which al… ▽ More Password managers are important tools that enable us to use stronger passwords, freeing us from the cognitive burden of remembering them. Despite this, there are still many users who do not fully trust password managers. In this paper, we focus on a feature that most password managers offer that might impact the user's trust, which is the process of generating a random password. We survey which algorithms are most commonly used and we propose a solution for a formally verified reference implementation of a password generation algorithm. We use EasyCrypt as our framework to both specify the reference implementation and to prove its functional correctness and security. △ Less

Submitted 21 June, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

Comments: shortpaper

arXiv:2104.08584 [pdf, other]

Deep Learning in Beyond 5G Networks with Image-based Time-Series Representation

Authors: Lucas Fernando Alvarenga e Silva, Bruno Yuji Lino Kimura, Jurandy Almeida

Abstract: Towards the network innovation, the Beyond Five-Generation (B5G) networks envision the use of machine learning (ML) methods to predict the network conditions and performance indicators in order to best make decisions and allocate resources. In this paper, we propose a new ML approach to accomplish predictions in B5G networks. Instead of handling the time-series in the network domain of values, we… ▽ More Towards the network innovation, the Beyond Five-Generation (B5G) networks envision the use of machine learning (ML) methods to predict the network conditions and performance indicators in order to best make decisions and allocate resources. In this paper, we propose a new ML approach to accomplish predictions in B5G networks. Instead of handling the time-series in the network domain of values, we transform them into image thus allowing to apply advanced ML methods of Computer Vision field to reach better predictions in B5G networks. Particularly, we analyze different techniques to transform time-series of network measures into image representation, e.g., Recurrence Plots, Markov Transition Fields, and Gramian Angular Fields. Then, we apply deep neural networks with convolutional layers to predict different 5G radio signal quality indicators. When comparing with other ML-based solutions, experimental results from 5G transmission datasets showed the feasibility and small prediction errors of the proposed approach. △ Less

Submitted 17 April, 2021; originally announced April 2021.

arXiv:2104.05516 [pdf, other]

Machine-checked ZKP for NP-relations: Formally Verified Security Proofs and Implementations of MPC-in-the-Head

Authors: José Carlos Bacelar Almeida, Manuel Barbosa, Karim Eldefrawy, Stéphane Graham-Lengrand, Hugo Pacheco, Vitor Pereira

Abstract: MPC-in-the-Head (MitH) is a general framework that allows constructing efficient Zero Knowledge protocols for general NP-relations from secure multiparty computation (MPC) protocols. In this paper we give the first machine-checked implementation of this transformation. We begin with an EasyCrypt formalization of MitH that preserves the modular structure of MitH and can be instantiated with arbitra… ▽ More MPC-in-the-Head (MitH) is a general framework that allows constructing efficient Zero Knowledge protocols for general NP-relations from secure multiparty computation (MPC) protocols. In this paper we give the first machine-checked implementation of this transformation. We begin with an EasyCrypt formalization of MitH that preserves the modular structure of MitH and can be instantiated with arbitrary MPC protocols that satisfy standard notions of security, which allows us to leverage an existing machine-checked secret-sharing-based MPC protocol development. The resulting concrete ZK protocol is proved secure and correct in EasyCrypt. Using a recently developed code extraction mechanism for EasyCrypt we synthesize a formally verified implementation of the protocol, which we benchmark to get an indication of the overhead associated with our formalization choices and code extraction mechanism. △ Less

Submitted 19 May, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

arXiv:2104.00185 [pdf, ps, other]

doi 10.1007/978-3-030-93420-0_23

Less is More: Accelerating Faster Neural Networks Straight from JPEG

Authors: Samuel Felipe dos Santos, Jurandy Almeida

Abstract: Most image data available are often stored in a compressed format, from which JPEG is the most widespread. To feed this data on a convolutional neural network (CNN), a preliminary decoding process is required to obtain RGB pixels, demanding a high computational load and memory usage. For this reason, the design of CNNs for processing JPEG compressed data has gained attention in recent years. In mo… ▽ More Most image data available are often stored in a compressed format, from which JPEG is the most widespread. To feed this data on a convolutional neural network (CNN), a preliminary decoding process is required to obtain RGB pixels, demanding a high computational load and memory usage. For this reason, the design of CNNs for processing JPEG compressed data has gained attention in recent years. In most existing works, typical CNN architectures are adapted to facilitate the learning with the DCT coefficients rather than RGB pixels. Although they are effective, their architectural changes either raise the computational costs or neglect relevant information from DCT inputs. In this paper, we examine different ways of speeding up CNNs designed for DCT inputs, exploiting learning strategies to reduce the computational complexity by taking full advantage of DCT inputs. Our experiments were conducted on the ImageNet dataset. Results show that learning how to combine all DCT inputs in a data-driven fashion is better than discarding them by hand, and its combination with a reduction of layers has proven to be effective for reducing the computational costs while retaining accuracy. △ Less

Submitted 24 August, 2022; v1 submitted 31 March, 2021; originally announced April 2021.

Comments: arXiv admin note: text overlap with arXiv:2012.14426

Journal ref: in 2021 25th Iberoamerican Congress on Pattern Recognition (CIARP), 2021, pp. 237-247

arXiv:2103.07953 [pdf, other]

doi 10.1109/ACCESS.2021.3137633

A new interpretable unsupervised anomaly detection method based on residual explanation

Authors: David F. N. Oliveira, Lucio F. Vismari, Alexandre M. Nascimento, Jorge R. de Almeida Jr, Paulo S. Cugnasca, Joao B. Camargo Jr, Leandro Almeida, Rafael Gripp, Marcelo Neves

Abstract: Despite the superior performance in modeling complex patterns to address challenging problems, the black-box nature of Deep Learning (DL) methods impose limitations to their application in real-world critical domains. The lack of a smooth manner for enabling human reasoning about the black-box decisions hinder any preventive action to unexpected events, in which may lead to catastrophic consequenc… ▽ More Despite the superior performance in modeling complex patterns to address challenging problems, the black-box nature of Deep Learning (DL) methods impose limitations to their application in real-world critical domains. The lack of a smooth manner for enabling human reasoning about the black-box decisions hinder any preventive action to unexpected events, in which may lead to catastrophic consequences. To tackle the unclearness from black-box models, interpretability became a fundamental requirement in DL-based systems, leveraging trust and knowledge by providing ways to understand the model's behavior. Although a current hot topic, further advances are still needed to overcome the existing limitations of the current interpretability methods in unsupervised DL-based models for Anomaly Detection (AD). Autoencoders (AE) are the core of unsupervised DL-based for AD applications, achieving best-in-class performance. However, due to their hybrid aspect to obtain the results (by requiring additional calculations out of network), only agnostic interpretable methods can be applied to AE-based AD. These agnostic methods are computationally expensive to process a large number of parameters. In this paper we present the RXP (Residual eXPlainer), a new interpretability method to deal with the limitations for AE-based AD in large-scale systems. It stands out for its implementation simplicity, low computational cost and deterministic behavior, in which explanations are obtained through the deviation analysis of reconstructed input features. In an experiment using data from a real heavy-haul railway line, the proposed method achieved superior performance compared to SHAP, demonstrating its potential to support decision making in large scale critical systems. △ Less

Submitted 14 March, 2021; originally announced March 2021.

Comments: 8 pages

ACM Class: I.2.6; I.2.1

arXiv:2012.14426 [pdf, other]

CNNs for JPEGs: A Study in Computational Cost

Authors: Samuel Felipe dos Santos, Nicu Sebe, Jurandy Almeida

Abstract: Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, defining state-of-the-art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from the RGB pixels. However, most image data are usually available in compressed format, from which the JPEG is the most widely used due to transmission and storage purpose… ▽ More Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, defining state-of-the-art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from the RGB pixels. However, most image data are usually available in compressed format, from which the JPEG is the most widely used due to transmission and storage purposes demanding a preliminary decoding process that have a high computational load and memory usage. For this reason, deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years. Those methods usually extract a frequency domain representation of the image, like DCT, by a partial decoding, and then make adaptation to typical CNNs architectures to work with them. One limitation of these current works is that, in order to accommodate the frequency domain data, the modifications made to the original model increase significantly their amount of parameters and computational complexity. On one hand, the methods have faster preprocessing, since the cost of fully decoding the images is avoided, but on the other hand, the cost of passing the images though the model is increased, mitigating the possible upside of accelerating the method. In this paper, we propose a further study of the computational cost of deep models designed for the frequency domain, evaluating the cost of decoding and passing the images through the network. We also propose handcrafted and data-driven techniques for reducing the computational complexity and the number of parameters for these models in order to keep them similar to their RGB baselines, leading to efficient models with a better trade off between computational cost and accuracy. △ Less

Submitted 22 September, 2023; v1 submitted 26 December, 2020; originally announced December 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:2012.13726

arXiv:2012.13726 [pdf, ps, other]

doi 10.1109/SIBGRAPI51738.2020.00017

Faster and Accurate Compressed Video Action Recognition Straight from the Frequency Domain

Authors: Samuel Felipe dos Santos, Jurandy Almeida

Abstract: Human action recognition has become one of the most active field of research in computer vision due to its wide range of applications, like surveillance, medical, industrial environments, smart homes, among others. Recently, deep learning has been successfully used to learn powerful and interpretable features for recognizing human actions in videos. Most of the existing deep learning approaches ha… ▽ More Human action recognition has become one of the most active field of research in computer vision due to its wide range of applications, like surveillance, medical, industrial environments, smart homes, among others. Recently, deep learning has been successfully used to learn powerful and interpretable features for recognizing human actions in videos. Most of the existing deep learning approaches have been designed for processing video information as RGB image sequences. For this reason, a preliminary decoding process is required, since video data are often stored in a compressed format. However, a high computational load and memory usage is demanded for decoding a video. To overcome this problem, we propose a deep neural network capable of learning straight from compressed video. Our approach was evaluated on two public benchmarks, the UCF-101 and HMDB-51 datasets, demonstrating comparable recognition performance to the state-of-the-art methods, with the advantage of running up to 2 times faster in terms of inference speed. △ Less

Submitted 26 December, 2020; originally announced December 2020.

Journal ref: in 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 2020, pp. 62-68

arXiv:2005.06469 [pdf]

Representing Whole Slide Cancer Image Features with Hilbert Curves

Authors: Erich Bremer, Jonas Almeida, Joel Saltz

Abstract: Regions of Interest (ROI) contain morphological features in pathology whole slide images (WSI) are delimited with polygons[1]. These polygons are often represented in either a textual notation (with the array of edges) or in a binary mask form. Textual notations have an advantage of human readability and portability, whereas, binary mask representations are more useful as the input and output of f… ▽ More Regions of Interest (ROI) contain morphological features in pathology whole slide images (WSI) are delimited with polygons[1]. These polygons are often represented in either a textual notation (with the array of edges) or in a binary mask form. Textual notations have an advantage of human readability and portability, whereas, binary mask representations are more useful as the input and output of feature-extraction pipelines that employ deep learning methodologies. For any given whole slide image, more than a million cellular features can be segmented generating a corresponding number of polygons. The corpus of these segmentations for all processed whole slide images creates various challenges for filtering specific areas of data for use in interactive real-time and multi-scale displays and analysis. Simple range queries of image locations do not scale and, instead, spatial indexing schemes are required. In this paper we propose using Hilbert Curves simultaneously for spatial indexing and as a polygonal ROI representation. This is achieved by using a series of Hilbert Curves[2] creating an efficient and inherently spatially-indexed machine-usable form. The distinctive property of Hilbert curves that enables both mask and polygon delimitation of ROIs is that the elements of the vector extracted ro describe morphological features maintain their relative positions for different scales of the same image. △ Less

Submitted 13 May, 2020; originally announced May 2020.

Comments: 9 pages, 5 figures

arXiv:2005.02443 [pdf, other]

A Dataset of Fact-Checked Images Shared on WhatsApp During the Brazilian and Indian Elections

Authors: Julio C. S. Reis, Philipe de Freitas Melo, Kiran Garimella, Jussara M. Almeida, Dean Eckles, Fabrício Benevenuto

Abstract: Recently, messaging applications, such as WhatsApp, have been reportedly abused by misinformation campaigns, especially in Brazil and India. A notable form of abuse in WhatsApp relies on several manipulated images and memes containing all kinds of fake stories. In this work, we performed an extensive data collection from a large set of WhatsApp publicly accessible groups and fact-checking agency w… ▽ More Recently, messaging applications, such as WhatsApp, have been reportedly abused by misinformation campaigns, especially in Brazil and India. A notable form of abuse in WhatsApp relies on several manipulated images and memes containing all kinds of fake stories. In this work, we performed an extensive data collection from a large set of WhatsApp publicly accessible groups and fact-checking agency websites. This paper opens a novel dataset to the research community containing fact-checked fake images shared through WhatsApp for two distinct scenarios known for the spread of fake news on the platform: the 2018 Brazilian elections and the 2019 Indian elections. △ Less

Submitted 5 May, 2020; originally announced May 2020.

Comments: 7 pages. This is a preprint version of an accepted paper on ICWSM'20. Please, consider to cite the conference version instead of this one

arXiv:2003.00509 [pdf, ps, other]

Profinite congruences and unary algebras

Authors: J. Almeida, O. Klíma

Abstract: Profinite congruences on profinite algebras determining profinite quotients are difficult to describe. In particular, no constructive description is known of the least profinite congruence containing a given binary relation on the algebra. On the other hand, closed congruences and fully invariant congruences can be described constructively. In a previous paper, we conjectured that fully invariant… ▽ More Profinite congruences on profinite algebras determining profinite quotients are difficult to describe. In particular, no constructive description is known of the least profinite congruence containing a given binary relation on the algebra. On the other hand, closed congruences and fully invariant congruences can be described constructively. In a previous paper, we conjectured that fully invariant closed congruences on a relatively free profinite algebra are always profinite. Here, we show that our conjecture fails for unary algebras and that closed congruences on relatively free profinite semigroups are not necessarily profinite. As part of our study of unary algebras, we establish an adjunction between profinite unary algebras and profinite monoids. We also show that the Polish representation of the free profinite unary algebra is faithful. △ Less

Submitted 1 March, 2020; originally announced March 2020.

MSC Class: 08A60; 08A62; 20M07; 20M05

arXiv:2001.00238 [pdf, other]

Low-Budget Label Query through Domain Alignment Enforcement

Authors: Jurandy Almeida, Cristiano Saltori, Paolo Rota, Nicu Sebe

Abstract: Deep learning revolution happened thanks to the availability of a massive amount of labelled data which have contributed to the development of models with extraordinary inference capabilities. Despite the public availability of a large quantity of datasets, to address specific requirements it is often necessary to generate a new set of labelled data. Quite often, the production of labels is costly… ▽ More Deep learning revolution happened thanks to the availability of a massive amount of labelled data which have contributed to the development of models with extraordinary inference capabilities. Despite the public availability of a large quantity of datasets, to address specific requirements it is often necessary to generate a new set of labelled data. Quite often, the production of labels is costly and sometimes it requires specific know-how to be fulfilled. In this work, we tackle a new problem named low-budget label query that consists in suggesting to the user a small (low budget) set of samples to be labelled, from a completely unlabelled dataset, with the final goal of maximizing the classification accuracy on that dataset. In this work we first improve an Unsupervised Domain Adaptation (UDA) method to better align source and target domains using consistency constraints, reaching the state of the art on a few UDA tasks. Finally, using the previously trained model as reference, we propose a simple yet effective selection method based on uniform sampling of the prediction consistency distribution, which is deterministic and steadily outperforms other baselines as well as competing models on a large variety of publicly available datasets. △ Less

Submitted 29 March, 2020; v1 submitted 1 January, 2020; originally announced January 2020.

Showing 1–50 of 96 results for author: Almeida, J