Search | arXiv e-print repository

Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques

Authors: Davide Clode da Silva, Marina Musse Bernardes, Nathalia Giacomini Ceretta, Gabriel Vaz de Souza, Gabriel Fonseca Silva, Rafael Heitor Bordini, Soraia Raupp Musse

Abstract: Machine learning has significantly advanced healthcare by aiding in disease prevention and treatment identification. However, accessing patient data can be challenging due to privacy concerns and strict regulations. Generating synthetic, realistic data offers a potential solution for overcoming these limitations, and recent studies suggest that fine-tuning foundation models can produce such data effectively. In this study, we explore the potential of foundation models for generating realistic medical images, particularly chest x-rays, and assess how their performance improves with fine-tuning. We propose using a Latent Diffusion Model, starting with a pre-trained foundation model and refining it through various configurations. Additionally, we performed experiments with input from a medical professional to assess the realism of the images produced by each trained model.

Submitted 6 September, 2024; originally announced September 2024.

arXiv:2408.13084 [pdf, other]

Avatar Visual Similarity for Social HCI: Increasing Self-Awareness

Authors: Bernhard Hilpert, Claudio Alves da Silva, Leon Christidis, Chirag Bhuvaneshwara, Patrick Gebhard, Fabrizio Nunnari, Dimitra Tsovaltzi

Abstract: Self-awareness is a critical factor in social human-human interaction and, hence, in social HCI interaction. Increasing self-awareness through mirrors or video recordings is common in face-to-face trainings, since it influences antecedents of self-awareness like explicit identification and implicit affective identification (affinity). However, increasing self-awareness has been scarcely examined in virtual trainings with virtual avatars, which allow for adjusting the similarity, e.g. to avoid negative effects of self-consciousness. Automatic visual similarity in avatars is an open issue related to high costs. It is important to understand which features need to be manipulated and which degree of similarity is necessary for self-awareness to leverage the added value of using avatars for self-awareness. This article examines the relationship between avatar visual similarity and increasing self-awareness in virtual training environments. We define visual similarity based on perceptually important facial features for human-human identification and develop a theory-based methodology to systematically manipulate visual similarity of virtual avatars and support self-awareness. Three personalized versions of virtual avatars with varying degrees of visual similarity to participants were created (weak, medium and strong facial features manipulation). In a within-subject study (N=33), we tested effects of degree of similarity on perceived similarity, explicit identification and implicit affective identification (affinity). Results show significant differences between the weak similarity manipulation, and both the strong manipulation and the random avatar for all three antecedents of self-awareness. An increasing degree of avatar visual similarity influences antecedents of self-awareness in virtual environments.

Submitted 23 August, 2024; originally announced August 2024.

arXiv:2407.19051 [pdf, other]

Towards a Transformer-Based Pre-trained Model for IoT Traffic Classification

Authors: Bruna Bazaluk, Mosab Hamdan, Mustafa Ghaleb, Mohammed S. M. Gismalla, Flavio S. Correa da Silva, Daniel Macêdo Batista

Abstract: The classification of IoT traffic is important to improve the efficiency and security of IoT-based networks. As the state-of-the-art classification methods are based on Deep Learning, most of the current results require a large amount of data to be trained. Thereby, in real-life situations, where there is a scarce amount of IoT traffic data, the models would not perform so well. Consequently, these models underperform outside their initial training conditions and fail to capture the complex characteristics of network traffic, rendering them inefficient and unreliable in real-world applications. In this paper, we propose IoT Traffic Classification Transformer (ITCT), a novel approach that utilizes the state-of-the-art transformer-based model named TabTransformer. ITCT, which is pre-trained on a large labeled MQTT-based IoT traffic dataset and may be fine-tuned with a small set of labeled data, showed promising results in various traffic classification tasks. Our experiments demonstrated that the ITCT model significantly outperforms existing models, achieving an overall accuracy of 82%. To support reproducibility and collaborative development, all associated code has been made publicly available.

Submitted 26 July, 2024; originally announced July 2024.

Comments: Updated version of: B. Bazaluk, M. Hamdan, M. Ghaleb, M. S. M. Gismalla, F. S. Correa da Silva and D. M. Batista, "Towards a Transformer-Based Pre-trained Model for IoT Traffic Classification," NOMS 2024-2024 IEEE Network Operations and Management Symposium, Seoul, Korea, Republic of, 2024, pp. 1-7, doi: 10.1109/NOMS59830.2024.10575448

arXiv:2407.02669 [pdf, other]

Impact of Network Deployment on the Performance of NCR-assisted Networks

Authors: Gabriel C. M. da Silva, Diego A. Sousa, Victor F. Monteiro, Darlan C. Moreira, Tarcisio F. Maciel, Fco. Rafael M. Lima, Behrooz Makki

Abstract: To address the need of coverage enhancement in the fifth generation (5G) of wireless cellular telecommunications, while taking into account possible bottlenecks related to deploying fiber based backhaul (e.g., required cost and time), the 3rd generation partnership project (3GPP) proposed in Release 18 the concept of network-controlled repeaters (NCRs). NCRs enhance previous radio frequency (RF) repeaters by exploring beamforming transmissions controlled by the network through side control information. In this context, this paper introduces the concept of NCR. Furthermore, we present a system level model that allows the performance evaluation of an NCR-assisted network. Finally, we evaluate the network deployment impact on the performance of NCR-assisted networks. As we show, with proper network planning, NCRs can boost the signal to interference-plus-noise ratio (SINR) of the user equipments (UEs) in a poor coverage of a macro base station. Furthermore, cell-edge UEs and uplink (UL) communications are the ones that benefit the most from the presence of NCRs.

Submitted 2 July, 2024; originally announced July 2024.

Comments: Paper accepted for publication in the conference proceedings of "19th International Symposium on Wireless Communication Systems" (ISWCS)

arXiv:2406.16241 [pdf, other]

Position: Benchmarking is Limited in Reinforcement Learning Research

Authors: Scott M. Jordan, Adam White, Bruno Castro da Silva, Martha White, Philip S. Thomas

Abstract: Novel reinforcement learning algorithms, or improvements on existing ones, are commonly justified by evaluating their performance on benchmark environments and are compared to an ever-changing set of standard algorithms. However, despite numerous calls for improvements, experimental practices continue to produce misleading or unsupported claims. One reason for the ongoing substandard practices is that conducting rigorous benchmarking experiments requires substantial computational time. This work investigates the sources of increased computation costs in rigorous experiment designs. We show that conducting rigorous performance benchmarks will likely have computational costs that are often prohibitive. As a result, we argue for using an additional experimentation paradigm to overcome the limitations of benchmarking.

Submitted 23 June, 2024; originally announced June 2024.

Comments: 19 pages, 13 figures, The Forty-first International Conference on Machine Learning (ICML 2024)

arXiv:2406.04377 [pdf, other]

Combining Graph Neural Network and Mamba to Capture Local and Global Tissue Spatial Relationships in Whole Slide Images

Authors: Ruiwen Ding, Kha-Dinh Luong, Erika Rodriguez, Ana Cristina Araujo Lemos da Silva, William Hsu

Abstract: In computational pathology, extracting spatial features from gigapixel whole slide images (WSIs) is a fundamental task, but due to their large size, WSIs are typically segmented into smaller tiles. A critical aspect of this analysis is aggregating information from these tiles to make predictions at the WSI level. We introduce a model that combines a message-passing graph neural network (GNN) with a state space model (Mamba) to capture both local and global spatial relationships among the tiles in WSIs. The model's effectiveness was demonstrated in predicting progression-free survival among patients with early-stage lung adenocarcinomas (LUAD). We compared the model with other state-of-the-art methods for tile-level information aggregation in WSIs, including tile-level information summary statistics-based aggregation, multiple instance learning (MIL)-based aggregation, GNN-based aggregation, and GNN-transformer-based aggregation. Additional experiments showed the impact of different types of node features and different tile sampling strategies on the model performance. This work can be easily extended to any WSI-based analysis. Code: https://github.com/rina-ding/gat-mamba.

Submitted 5 June, 2024; originally announced June 2024.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2405.01925 [pdf, other]

A Modular, Tendon Driven Variable Stiffness Manipulator with Internal Routing for Improved Stability and Increased Payload Capacity

Authors: Kyle L. Walker, Alix J. Partridge, Hsing-Yu Chen, Rahul R. Ramachandran, Adam A. Stokes, Kenjiro Tadakuma, Lucas Cruz da Silva, Francesco Giorgio-Serchi

Abstract: Stability and reliable operation under a spectrum of environmental conditions is still an open challenge for soft and continuum style manipulators. The inability to carry sufficient load and effectively reject external disturbances are two drawbacks which limit the scale of continuum designs, preventing widespread adoption of this technology. To tackle these problems, this work details the design and experimental testing of a modular, tendon driven bead-style continuum manipulator with tunable stiffness. By embedding the ability to independently control the stiffness of distinct sections of the structure, the manipulator can regulate its posture under greater loads of up to 1kg at the end-effector, with reference to the flexible state. Likewise, an internal routing scheme vastly improves the stability of the proximal segment when operating the distal segment, reducing deviations by at least 70.11%. Operation is validated when gravity is both tangential and perpendicular to the manipulator backbone, a feature uncommon in previous designs. The findings presented in this work are key to the development of larger scale continuum designs, demonstrating that flexibility and tip stability under loading can co-exist without compromise.

Submitted 3 May, 2024; originally announced May 2024.

Comments: To be presented at ICRA 2024, Yokohama, Japan. 6 pages

arXiv:2404.08555 [pdf, other]

RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs

Authors: Shreyas Chaudhari, Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, Ameet Deshpande, Bruno Castro da Silva

Abstract: State-of-the-art large language models (LLMs) have become indispensable tools for various tasks. However, training LLMs to serve as effective assistants for humans requires careful consideration. A promising approach is reinforcement learning from human feedback (RLHF), which leverages human feedback to update the model in accordance with human preferences and mitigate issues like toxicity and hallucinations. Yet, an understanding of RLHF for LLMs is largely entangled with initial design choices that popularized the method and current research focuses on augmenting those choices rather than fundamentally improving the framework. In this paper, we analyze RLHF through the lens of reinforcement learning principles to develop an understanding of its fundamentals, dedicating substantial focus to the core component of RLHF -- the reward model. Our study investigates modeling choices, caveats of function approximation, and their implications on RLHF training algorithms, highlighting the underlying assumptions made about the expressivity of reward. Our analysis improves the understanding of the role of reward models and methods for their training, concurrently revealing limitations of the current methodology. We characterize these limitations, including incorrect generalization, model misspecification, and the sparsity of feedback, along with their impact on the performance of a language model. The discussion and analysis are substantiated by a categorical review of current literature, serving as a reference for researchers and practitioners to understand the challenges of RLHF and build upon existing efforts.

Submitted 15 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

arXiv:2403.06164 [pdf, other]

Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation

Authors: Paweł A. Pierzchlewicz, Caio da Silva, R. James Cotton, Fabian H. Sinz

Abstract: Single camera 3D pose estimation is an ill-defined problem due to inherent ambiguities from depth, occlusion or keypoint noise. Multi-hypothesis pose estimation accounts for this uncertainty by providing multiple 3D poses consistent with the 2D measurements. Current research has predominantly concentrated on generating multiple hypotheses for single frame static pose estimation. In this study we focus on the new task of multi-hypothesis motion estimation. Motion estimation is not simply pose estimation applied to multiple frames, which would ignore temporal correlation across frames. Instead, it requires distributions which are capable of generating temporally consistent samples, which is significantly more challenging. To this end, we introduce Platypose, a framework that uses a diffusion model pretrained on 3D human motion sequences for zero-shot 3D pose sequence estimation. Platypose outperforms baseline methods on multiple hypotheses for motion estimation. Additionally, Platypose also achieves state-of-the-art calibration and competitive joint error when tested on static poses from Human3.6M, MPI-INF-3DHP and 3DPW. Finally, because it is zero-shot, our method generalizes flexibly to different settings such as multi-camera inference.

Submitted 10 March, 2024; originally announced March 2024.

arXiv:2402.16968 [pdf, ps, other]

A Survey of Large Language Models in Cybersecurity

Authors: Gabriel de Jesus Coelho da Silva, Carlos Becker Westphall

Abstract: Large Language Models (LLMs) have quickly risen to prominence due to their ability to perform at or close to the state-of-the-art in a variety of fields while handling natural language. An important field of research is the application of such models in the cybersecurity context. This survey aims to identify where in the field of cybersecurity LLMs have already been applied, the ways in which they are being used and their limitations in the field. Finally, suggestions are made on how to address such limitations and what can be expected from these systems once these limitations are overcome.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2401.16182 [pdf, other]

LLaMandement: Large Language Models for Summarization of French Legislative Proposals

Authors: Joseph Gesnouin, Yannis Tannier, Christophe Gomes Da Silva, Hatim Tapory, Camille Brier, Hugo Simon, Raphael Rozenberg, Hermann Woehrel, Mehdi El Yakaabi, Thomas Binder, Guillaume Marie, Emilie Caron, Mathile Nogueira, Thomas Fontas, Laure Puydebois, Marie Theophile, Stephane Morandi, Mael Petit, David Creissac, Pauline Ennouchy, Elise Valetoux, Celine Visade, Severine Balloux, Emmanuel Cortes, Pierre-Etienne Devineau, et al. (3 additional authors not shown)

Abstract: This report introduces LLaMandement, a state-of-the-art Large Language Model, fine-tuned by the French government and designed to enhance the efficiency and efficacy of processing parliamentary sessions (including the production of bench memoranda and documents required for interministerial meetings) by generating neutral summaries of legislative proposals. Addressing the administrative challenges of manually processing a growing volume of legislative amendments, LLaMandement stands as a significant legal technological milestone, providing a solution that exceeds the scalability of traditional human efforts while matching the robustness of a specialized legal drafter. We release all our fine-tuned models and training data to the community.

Submitted 29 January, 2024; originally announced January 2024.

Comments: 21 pages, 9 figures

arXiv:2312.12972 [pdf, other]

From Past to Future: Rethinking Eligibility Traces

Authors: Dhawal Gupta, Scott M. Jordan, Shreyas Chaudhari, Bo Liu, Philip S. Thomas, Bruno Castro da Silva

Abstract: In this paper, we introduce a fresh perspective on the challenges of credit assignment and policy evaluation. First, we delve into the nuances of eligibility traces and explore instances where their updates may result in unexpected credit assignment to preceding states. From this investigation emerges the concept of a novel value function, which we refer to as the \emph{bidirectional value function}. Unlike traditional state value functions, bidirectional value functions account for both future expected returns (rewards anticipated from the current state onward) and past expected returns (cumulative rewards from the episode's start to the present). We derive principled update equations to learn this value function and, through experimentation, demonstrate its efficacy in enhancing the process of policy evaluation. In particular, our results indicate that the proposed learning approach can, in certain challenging contexts, perform policy evaluation more rapidly than TD($λ$) -- a method that learns forward value functions, $v^π$, \emph{directly}. Overall, our findings present a new perspective on eligibility traces and potential advantages associated with the novel value function it inspires, especially for policy evaluation.

Submitted 20 December, 2023; originally announced December 2023.

Comments: Accepted in The 38th Annual AAAI Conference on Artificial Intelligence

arXiv:2311.17068 [pdf, other]

Deep convolutional encoder-decoder hierarchical neural networks for conjugate heat transfer surrogate modeling

Authors: Takiah Ebbs-Picken, David A. Romero, Carlos M. Da Silva, Cristina H. Amon

Abstract: Conjugate heat transfer (CHT) models are vital for the design of many engineering systems. However, high-fidelity CHT models are computationally intensive, which limits their use in applications such as design optimization, where hundreds to thousands of model evaluations are required. In this work, we develop a modular deep convolutional encoder-decoder hierarchical (DeepEDH) neural network, a novel deep-learning-based surrogate modeling methodology for computationally intensive CHT models. Leveraging convective temperature dependencies, we propose a two-stage temperature prediction architecture that couples velocity and temperature models. The proposed DeepEDH methodology is demonstrated by modeling the pressure, velocity, and temperature fields for a liquid-cooled cold-plate-based battery thermal management system with variable channel geometry. A computational model of the cold plate is developed and solved using the finite element method (FEM), generating a dataset of 1,500 simulations. The FEM results are transformed and scaled from unstructured to structured, image-like meshes to create training and test datasets. The DeepEDH methodology's performance is examined in relation to data scaling, training dataset size, and network depth. Our performance analysis covers the impact of the novel architecture, separate field models, output geometry masks, multi-stage temperature models, and optimizations of the hyperparameters and architecture. Furthermore, we quantify the influence of the CHT thermal boundary condition on surrogate model performance, highlighting improved temperature model performance with higher heat fluxes. Compared to other deep learning neural network surrogate models, such as U-Net and DenseED, the proposed DeepEDH methodology for CHT models exhibits up to a 65% enhancement in the coefficient of determination ($R^{2}$).

Submitted 24 November, 2023; originally announced November 2023.

arXiv:2310.19007 [pdf, other]

Behavior Alignment via Reward Function Optimization

Authors: Dhawal Gupta, Yash Chandak, Scott M. Jordan, Philip S. Thomas, Bruno Castro da Silva

Abstract: Designing reward functions for efficiently guiding reinforcement learning (RL) agents toward specific behaviors is a complex task. This is challenging since it requires the identification of reward structures that are not sparse and that avoid inadvertently inducing undesirable behaviors. Naively modifying the reward structure to offer denser and more frequent feedback can lead to unintended outcomes and promote behaviors that are not aligned with the designer's intended goal. Although potential-based reward shaping is often suggested as a remedy, we systematically investigate settings where deploying it often significantly impairs performance. To address these issues, we introduce a new framework that uses a bi-level objective to learn \emph{behavior alignment reward functions}. These functions integrate auxiliary rewards reflecting a designer's heuristics and domain knowledge with the environment's primary rewards. Our approach automatically determines the most effective way to blend these types of feedback, thereby enhancing robustness against heuristic reward misspecification. Remarkably, it can also adapt an agent's policy optimization process to mitigate suboptimalities resulting from limitations and biases inherent in the underlying RL algorithms. We evaluate our method's efficacy on a diverse set of tasks, from small-scale experiments to high-dimensional control challenges. We investigate heuristic auxiliary rewards of varying quality -- some of which are beneficial and others detrimental to the learning process. Our results show that our framework offers a robust and principled way to integrate designer-specified heuristics. It not only addresses key shortcomings of existing approaches but also consistently leads to high-performing solutions, even when given misaligned or poorly-specified auxiliary reward functions.

Submitted 31 October, 2023; v1 submitted 29 October, 2023; originally announced October 2023.

Comments: (Spotlight) Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

arXiv:2309.00176 [pdf, other]

Parallel Distributional Prioritized Deep Reinforcement Learning for Unmanned Aerial Vehicles

Authors: Alisson Henrique Kolling, Victor Augusto Kich, Junior Costa de Jesus, Andressa Cavalcante da Silva, Ricardo Bedin Grando, Paulo Lilles Jorge Drews-Jr, Daniel F. T. Gamarra

Abstract: This work presents a study on parallel and distributional deep reinforcement learning applied to the mapless navigation of UAVs. For this, we developed an approach based on the Soft Actor-Critic method, producing a distributed and distributional variant named PDSAC, and compared it with a second one based on the traditional SAC algorithm. In addition, we also embodied a prioritized memory system into them. The UAV used in the study is based on the Hydrone vehicle, a hybrid quadrotor operating solely in the air. The inputs for the system are 23 range findings from a Lidar sensor and the distance and angles towards a desired goal, while the outputs consist of the linear, angular, and altitude velocities. The methods were trained in environments of varying complexity, from obstacle-free environments to environments with multiple obstacles in three dimensions. The results obtained demonstrate a concise improvement in the navigation capabilities by the proposed approach when compared to the agent based on the SAC for the same amount of training steps. In summary, this work presented a study on deep reinforcement learning applied to mapless navigation of drones in three dimensions, with promising results and potential applications in various contexts related to robotics and autonomous air navigation with distributed and distributional variants.

Submitted 31 August, 2023; originally announced September 2023.

Comments: 7 pages, 6 figures. Approved at LARS 2023

arXiv:2307.10018 [pdf, other]

RobôCIn Small Size League Extended Team Description Paper for RoboCup 2023

Authors: Aline Lima de Oliveira, Cauê Addae da Silva Gomes, Cecília Virginia Santos da Silva, Charles Matheus de Sousa Alves, Danilo Andrade Martins de Souza, Driele Pires Ferreira Araújo Xavier, Edgleyson Pereira da Silva, Felipe Bezerra Martins, Lucas Henrique Cavalcanti Santos, Lucas Dias Maciel, Matheus Paixão Gumercindo dos Santos, Matheus Lafayette Vasconcelos, Matheus Vinícius Teotonio do Nascimento Andrade, João Guilherme Oliveira Carvalho de Melo, João Pedro Souza Pereira de Moura, José Ronald da Silva, José Victor Silva Cruz, Pedro Henrique Santana de Morais, Pedro Paulo Salman de Oliveira, Riei Joaquim Matos Rodrigues, Roberto Costa Fernandes, Ryan Vinicius Santos Morais, Tamara Mayara Ramos Teobaldo, Washington Igor dos Santos Silva, Edna Natividade Silva Barros

Abstract: RobôCIn has participated in RoboCup Small Size League since 2019, won its first world title in 2022 (Division B), and is currently a three-time Latin-American champion. This paper presents our improvements to defend the Small Size League (SSL) Division B title in RoboCup 2023 in Bordeaux, France. This paper aims to share some of the academic research that our team developed over the past year. Our team has successfully published 2 articles related to SSL at two high-impact conferences: the 25th RoboCup International Symposium and the 19th IEEE Latin American Robotics Symposium (LARS 2022). Over the last year, we have been continuously migrating from our past codebase to Unification. We will describe the new architecture implemented and some points of software and AI refactoring. In addition, we discuss the process of integrating machined components into the mechanical system, our development for participating in the vision blackout challenge last year and what we are preparing for this year.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2305.09838 [pdf, other]

Coagent Networks: Generalized and Scaled

Authors: James E. Kostas, Scott M. Jordan, Yash Chandak, Georgios Theocharous, Dhawal Gupta, Martha White, Bruno Castro da Silva, Philip S. Thomas

Abstract: Coagent networks for reinforcement learning (RL) [Thomas and Barto, 2011] provide a powerful and flexible framework for deriving principled learning rules for arbitrary stochastic neural networks. The coagent framework offers an alternative to backpropagation-based deep learning (BDL) that overcomes some of backpropagation's main limitations. For example, coagent networks can compute different parts of the network asynchronously (at different rates or at different times), can incorporate non-differentiable components that cannot be used with backpropagation, and can explore at levels higher than their action spaces (that is, they can be designed as hierarchical networks for exploration and/or temporal abstraction). However, the coagent framework is not just an alternative to BDL; the two approaches can be blended: BDL can be combined with coagent learning rules to create architectures with the advantages of both approaches. This work generalizes the coagent theory and learning rules provided by previous works; this generalization provides more flexibility for network architecture design within the coagent framework. This work also studies one of the chief disadvantages of coagent networks: high variance updates for networks that have many coagents and do not use backpropagation. We show that a coagent algorithm with a policy network that does not use backpropagation can scale to a challenging RL domain with a high-dimensional state and action space (the MuJoCo Ant environment), learning reasonable (although not state-of-the-art) policies. These contributions motivate and provide a more general theoretical foundation for future work that studies coagent networks.

Submitted 16 May, 2023; originally announced May 2023.

arXiv:2301.11173 [pdf, other]

Double Deep Reinforcement Learning Techniques for Low Dimensional Sensing Mapless Navigation of Terrestrial Mobile Robots

Authors: Linda Dotto de Moraes, Victor Augusto Kich, Alisson Henrique Kolling, Jair Augusto Bottega, Raul Steinmetz, Emerson Cassiano da Silva, Ricardo Bedin Grando, Anselmo Rafael Cuckla, Daniel Fernando Tello Gamarra

Abstract: In this work, we present two Deep Reinforcement Learning (Deep-RL) approaches to address the problem of mapless navigation for a terrestrial mobile robot. Our methodology focuses on comparing a Deep-RL technique based on the Deep Q-Network (DQN) algorithm with a second one based on the Double Deep Q-Network (DDQN) algorithm. We use 24 laser measurement samples and the relative position and angle of the agent to the target as information for our agents, which provide the actions as velocities for our robot. By using a low-dimensional sensing structure of learning, we show that it is possible to train an agent to perform navigation-related tasks and obstacle avoidance without using complex sensing information. The proposed methodology was successfully used in three distinct simulated environments. Overall, it was shown that Double Deep structures further improve the navigation of mobile robots when compared to the ones with simple Q structures.

Submitted 26 January, 2023; originally announced January 2023.

Journal ref: International Conference on Intelligent Systems Design and Applications, 2022

arXiv:2301.10330 [pdf, other]

Off-Policy Evaluation for Action-Dependent Non-Stationary Environments

Authors: Yash Chandak, Shiv Shankar, Nathaniel D. Bastian, Bruno Castro da Silva, Emma Brunskill, Philip S. Thomas

Abstract: Methods for sequential decision-making are often built upon a foundational assumption that the underlying decision process is stationary. This limits the application of such methods because real-world problems are often subject to changes due to external factors (passive non-stationarity), changes induced by interactions with the system itself (active non-stationarity), or both (hybrid non-stationarity). In this work, we take the first steps towards the fundamental challenge of on-policy and off-policy evaluation amidst structured changes due to active, passive, or hybrid non-stationarity. Towards this goal, we make a higher-order stationarity assumption such that non-stationarity results in changes over time, but the way changes happen is fixed. We propose OPEN, an algorithm that uses a double application of counterfactual reasoning and a novel importance-weighted instrument-variable regression to obtain both a lower bias and a lower variance estimate of the structure in the changes of a policy's past performances. Finally, we show promising results on how OPEN can be used to predict future performances for several domains inspired by real-world applications that exhibit non-stationarity.

Submitted 24 January, 2023; originally announced January 2023.

Comments: Accepted at Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2301.07784 [pdf, other]

doi 10.5555/3545946.3598872

Sample-Efficient Multi-Objective Learning via Generalized Policy Improvement Prioritization

Authors: Lucas N. Alegre, Ana L. C. Bazzan, Diederik M. Roijers, Ann Nowé, Bruno C. da Silva

Abstract: Multi-objective reinforcement learning (MORL) algorithms tackle sequential decision problems where agents may have different preferences over (possibly conflicting) reward functions. Such algorithms often learn a set of policies (each optimized for a particular agent preference) that can later be used to solve problems with novel preferences. We introduce a novel algorithm that uses Generalized Policy Improvement (GPI) to define principled, formally-derived prioritization schemes that improve sample-efficient learning. They implement active-learning strategies by which the agent can (i) identify the most promising preferences/objectives to train on at each moment, to more rapidly solve a given MORL problem; and (ii) identify which previous experiences are most relevant when learning a policy for a particular agent preference, via a novel Dyna-style MORL method. We prove our algorithm is guaranteed to always converge to an optimal solution in a finite number of steps, or an $ε$-optimal solution (for a bounded $ε$) if the agent is limited and can only identify possibly sub-optimal policies. We also prove that our method monotonically improves the quality of its partial solutions while learning. Finally, we introduce a bound that characterizes the maximum utility loss (with respect to the optimal solution) incurred by the partial solutions computed by our method throughout learning. We empirically show that our method outperforms state-of-the-art MORL algorithms in challenging multi-objective tasks, both with discrete and continuous state and action spaces.

Submitted 23 March, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

Comments: Accepted to AAMAS 2023

arXiv:2212.10707 [pdf, other]

doi 10.5220/0011664100003417

Extractive Text Summarization Using Generalized Additive Models with Interactions for Sentence Selection

Authors: Vinícius Camargo da Silva, João Paulo Papa, Kelton Augusto Pontara da Costa

Abstract: Automatic Text Summarization (ATS) is becoming relevant with the growth of textual data; however, with the popularization of public large-scale datasets, some recent machine learning approaches have focused on dense models and architectures that, despite producing notable results, usually result in models that are difficult to interpret. Given the challenge behind interpretable learning-based text summarization and the importance it may have for evolving the current state of the ATS field, this work studies the application of two modern Generalized Additive Models with interactions, namely Explainable Boosting Machine and GAMI-Net, to the extractive summarization problem based on linguistic features and binary classification.

Submitted 20 December, 2022; originally announced December 2022.

arXiv:2210.06858 [pdf, other]

doi 10.1002/spe.3169

Adopting Microservices and DevOps in the Cyber-Physical Systems Domain: A Rapid Review and Case Study

Authors: Jonas Fritzsch, Justus Bogner, Markus Haug, Ana Cristina Franco da Silva, Carolin Rubner, Matthias Saft, Horst Sauer, Stefan Wagner

Abstract: The domain of cyber-physical systems (CPS) has recently seen strong growth, e.g., due to the rise of the Internet of Things (IoT) in industrial domains, commonly referred to as "Industry 4.0". However, CPS challenges like the strong hardware focus can impact modern software development practices, especially in the context of modernizing legacy systems. While microservices and DevOps have been widely studied for enterprise applications, there is insufficient coverage for the CPS domain. Our goal is therefore to analyze the peculiarities of such systems regarding challenges and practices for using and migrating towards microservices and DevOps. We conducted a rapid review based on 146 scientific papers, and subsequently validated our findings in an interview-based case study with 9 CPS professionals in different business units at Siemens AG. The combined results picture the specifics of microservices and DevOps in the CPS domain. While several differences were revealed that may require adapted methods, many challenges and practices are shared with typical enterprise applications. Our study supports CPS researchers and practitioners with a summary of challenges, practices to address them, and research opportunities.

Submitted 13 October, 2022; originally announced October 2022.

Comments: 10 pages, 8 figures, accepted for publication at "Software: Practice and Experience - Wiley Online Library"

arXiv:2208.14501 [pdf, other]

Model-Based Reinforcement Learning with SINDy

Authors: Rushiv Arora, Bruno Castro da Silva, Eliot Moss

Abstract: We draw on the latest advancements in the physics community to propose a novel method for discovering the governing non-linear dynamics of physical systems in reinforcement learning (RL). We establish that this method is capable of discovering the underlying dynamics using significantly fewer trajectories (as little as one rollout with $\leq 30$ time steps) than state-of-the-art model learning algorithms. Further, the technique learns a model that is accurate enough to induce near-optimal policies given significantly fewer trajectories than those required by model-free algorithms. It brings the benefits of model-based RL without requiring a model to be developed in advance, for systems that have physics-based dynamics. To establish the validity and applicability of this algorithm, we conduct experiments on four classic control tasks. We found that an optimal policy trained on the discovered dynamics of the underlying system can generalize well. Further, the learned policy performs well when deployed on the actual physical system, thus bridging the model to real system gap. We further compare our method to state-of-the-art model-based and model-free approaches, and show that our method requires fewer trajectories sampled on the true physical system compared to other methods. Additionally, we explored approximate dynamics models and found that they also can perform well.

Submitted 30 August, 2022; originally announced August 2022.

Comments: 8 pages, 1 figure, 1 table, 1 algorithm, presented at the Decision Awareness in Reinforcement Learning workshop held at the International Conference on Machine Learning, 22 July 2022, Baltimore MD, USA

arXiv:2208.11744 [pdf, other]

Enforcing Delayed-Impact Fairness Guarantees

Authors: Aline Weber, Blossom Metevier, Yuriy Brun, Philip S. Thomas, Bruno Castro da Silva

Abstract: Recent research has shown that seemingly fair machine learning models, when used to inform decisions that have an impact on people's lives or well-being (e.g., applications involving education, employment, and lending), can inadvertently increase social inequality in the long term. This is because prior fairness-aware algorithms only consider static fairness constraints, such as equal opportunity or demographic parity. However, enforcing constraints of this type may result in models that have negative long-term impact on disadvantaged individuals and communities. We introduce ELF (Enforcing Long-term Fairness), the first classification algorithm that provides high-confidence fairness guarantees in terms of long-term, or delayed, impact. We prove that the probability that ELF returns an unfair solution is less than a user-specified tolerance and that (under mild assumptions), given sufficient training data, ELF is able to find and return a fair solution if one exists. We show experimentally that our algorithm can successfully mitigate long-term unfairness.

Submitted 24 August, 2022; originally announced August 2022.

Comments: 24 pages, 5 figures

arXiv:2207.08007 [pdf, other]

A family of counterexamples for a conjecture of Berge on $α$-diperfect digraphs

Authors: Caroline Aparecida de Paula Silva, Cândida Nunes da Silva, Orlando Lee

Abstract: Let $D$ be a digraph. A stable set $S$ of $D$ and a path partition $\mathcal{P}$ of $D$ are orthogonal if every path $P \in \mathcal{P}$ contains exactly one vertex of $S$. In 1982, Berge defined the class of $α$-diperfect digraphs. A digraph $D$ is $α$-diperfect if for every maximum stable set $S$ of $D$ there is a path partition $\mathcal{P}$ of $D$ orthogonal to $S$ and this property holds for every induced subdigraph of $D$. An anti-directed odd cycle is an orientation of an odd cycle $(x_0,\ldots,x_{2k},x_0)$ with $k\geq2$ in which each vertex $x_0,x_1,x_2,x_3,x_5,x_7\ldots,x_{2k-1}$ is either a source or a sink. Berge conjectured that a digraph $D$ is $α$-diperfect if and only if $D$ does not contain an anti-directed odd cycle as an induced subdigraph. In this paper, we show that this conjecture is false by exhibiting an infinite family of orientations of complements of odd cycles with at least seven vertices that are not $α$-diperfect.
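
The orthogonality condition defined in this abstract can be illustrated with a minimal sketch. The function names and the small example digraph below are hypothetical, chosen only for illustration; they are not taken from the paper. The sketch checks only the definition (a stable set, a partition of the vertices into directed paths, and exactly one stable-set vertex per path), not $α$-diperfection itself.

```python
def is_stable(arcs, vertices):
    """True if no arc of the digraph joins two of the given vertices."""
    vs = set(vertices)
    return not any(u in vs and v in vs for (u, v) in arcs)


def is_path_partition(arcs, all_vertices, paths):
    """True if the paths are directed paths covering every vertex exactly once."""
    arc_set = set(arcs)
    seen = [v for path in paths for v in path]
    if sorted(seen) != sorted(all_vertices):
        return False
    # Consecutive vertices on each path must be joined by an arc in that direction.
    return all((a, b) in arc_set for path in paths for a, b in zip(path, path[1:]))


def is_orthogonal(stable_set, paths):
    """True if every path contains exactly one vertex of the stable set."""
    s = set(stable_set)
    return all(sum(1 for v in path if v in s) == 1 for path in paths)


# Hypothetical example digraph with arcs 0->1, 1->2, 3->2.
arcs = [(0, 1), (1, 2), (3, 2)]
vertices = [0, 1, 2, 3]
stable = [0, 2]                    # 0 and 2 are not adjacent
good = [[0, 1], [3, 2]]            # each path meets the stable set once
bad = [[0, 1, 2], [3]]             # first path meets it twice, second not at all

print(is_stable(arcs, stable), is_path_partition(arcs, vertices, good))
print(is_orthogonal(stable, good), is_orthogonal(stable, bad))
```

Both partitions in the example are valid path partitions of the same digraph, which shows that orthogonality depends on the choice of partition and not only on the stable set.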

Submitted 28 July, 2022; v1 submitted 16 July, 2022; originally announced July 2022.

arXiv:2207.03225 [pdf, other]

Towards Immediate Feedback for Security Relevant Code in Development Environments

Authors: Markus Haug, Ana Cristina Franco da Silva, Stefan Wagner

Abstract: Nowadays, the correct use of cryptography libraries is essential to ensure the necessary information security in different kinds of applications. A common practice in software development is the use of static application security testing (SAST) tools to analyze code regarding security vulnerabilities. Most of these tools are designed to run separately from development environments. Their results are extensive lists of security notifications, which software developers have to inspect manually in a time-consuming follow-up step. To support developers in their tasks of developing secure code, we present an approach for providing them with continuous immediate feedback of SAST tools in integrated development environments (IDEs). Our approach also considers the understandability of security notifications and aims for a user-centered approach that leverages developers' feedback to build an adaptive system tailored to each individual developer.

Submitted 7 July, 2022; originally announced July 2022.

Comments: submitted to the 16th Symposium and Summer School On Service-Oriented Computing 2022

arXiv:2207.00748 [pdf, other]

doi 10.1007/s10032-022-00406-7

Sequence-aware multimodal page classification of Brazilian legal documents

Authors: Pedro H. Luz de Araujo, Ana Paula G. S. de Almeida, Fabricio A. Braz, Nilton C. da Silva, Flavio de Barros Vidal, Teofilo E. de Campos

Abstract: The Brazilian Supreme Court receives tens of thousands of cases each semester. Court employees spend thousands of hours to execute the initial analysis and classification of those cases -- which takes effort away from posterior, more complex stages of the case management workflow. In this paper, we explore multimodal classification of documents from Brazil's Supreme Court. We train and evaluate our methods on a novel multimodal dataset of 6,510 lawsuits (339,478 pages) with manual annotation assigning each page to one of six classes. Each lawsuit is an ordered sequence of pages, which are stored both as an image and as a corresponding text extracted through optical character recognition. We first train two unimodal classifiers: a ResNet pre-trained on ImageNet is fine-tuned on the images, and a convolutional network with filters of multiple kernel sizes is trained from scratch on document texts. We use them as extractors of visual and textual features, which are then combined through our proposed Fusion Module. Our Fusion Module can handle missing textual or visual input by using learned embeddings for missing data. Moreover, we experiment with bi-directional Long Short-Term Memory (biLSTM) networks and linear-chain conditional random fields to model the sequential nature of the pages. The multimodal approaches outperform both textual and visual classifiers, especially when leveraging the sequential nature of the pages.

Submitted 15 July, 2022; v1 submitted 2 July, 2022; originally announced July 2022.

Comments: 11 pages, 6 figures. This preprint, which was originally written on 8 April 2021, has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in the International Journal on Document Analysis and Recognition, and is available online at https://doi.org/10.1007/s10032-022-00406-7 and https://rdcu.be/cRvvV

Journal ref: International Journal on Document Analysis and Recognition, 2022

arXiv:2206.12293 [pdf, other]

Text and author-level political inference using heterogeneous knowledge representations

Authors: Samuel Caetano da Silva, Ivandre Paraboni

Abstract: The inference of politically-charged information from text data is a popular research topic in Natural Language Processing (NLP) at both text- and author-level. In recent years, studies of this kind have been implemented with the aid of representations from transformers such as BERT. Despite considerable success, however, we may ask whether results may be improved even further by combining transformer-based models with additional knowledge representations. To shed light on this issue, the present work describes a series of experiments to compare alternative model configurations for political inference from text in both English and Portuguese languages. Results suggest that certain text representations - in particular, the combined use of BERT pre-trained language models with a syntactic dependency model - may outperform the alternatives across multiple experimental settings, making a potentially strong case for further research in the use of heterogeneous text representations in these and possibly other NLP tasks.

Submitted 29 July, 2022; v1 submitted 24 June, 2022; originally announced June 2022.

arXiv:2206.11326 [pdf, other]

Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer

Authors: Lucas N. Alegre, Ana L. C. Bazzan, Bruno C. da Silva

Abstract: In many real-world applications, reinforcement learning (RL) agents might have to solve multiple tasks, each one typically modeled via a reward function. If reward functions are expressed linearly, and the agent has previously learned a set of policies for different tasks, successor features (SFs) can be exploited to combine such policies and identify reasonable solutions for new problems. However, the identified solutions are not guaranteed to be optimal. We introduce a novel algorithm that addresses this limitation. It allows RL agents to combine existing policies and directly identify optimal policies for arbitrary new problems, without requiring any further interactions with the environment. We first show (under mild assumptions) that the transfer learning problem tackled by SFs is equivalent to the problem of learning to optimize multiple objectives in RL. We then introduce an SF-based extension of the Optimistic Linear Support algorithm to learn a set of policies whose SFs form a convex coverage set. We prove that policies in this set can be combined via generalized policy improvement to construct optimal behaviors for any new linearly-expressible tasks, without requiring any additional training samples. We empirically show that our method outperforms state-of-the-art competing algorithms both in discrete and continuous domains under value function approximation.

Submitted 22 June, 2022; originally announced June 2022.

Comments: Proceedings of the 39th International Conference on Machine Learning (ICML'22)

arXiv:2204.13857 [pdf]

Equine radiograph classification using deep convolutional neural networks

Authors: Raniere Gaia Costa da Silva, Ambika Prasad Mishra, Christopher Riggs, Michael Doube

Abstract: Purpose: To assess the capability of deep convolutional neural networks to classify anatomical location and projection from a series of 48 standard views of racehorse limbs. Materials and Methods: 9504 equine pre-import radiographs were used to train, validate, and test six deep learning architectures available as part of the open source machine learning framework PyTorch. Results: ResNet-34 achieved a top-1 accuracy of 0.8408 and the majority (88%) of misclassification was because of wrong laterality. Class activation maps indicated that joint morphology drove the model decision. Conclusion: Deep convolutional neural networks are capable of classifying equine pre-import radiographs into the 48 standard views including moderate discrimination of laterality independent of side marker presence.

Submitted 28 April, 2022; originally announced April 2022.

arXiv:2204.03706 [pdf, other]

Introducing a Framework and a Decision Protocol to Calibrate Recommender Systems

Authors: Diego Corrêa da Silva, Frederico Araújo Durão

Abstract: Recommender Systems use the user's profile to generate a recommendation list with unknown items to a target user. Although the primary goal of traditional recommendation systems is to deliver the most relevant items, such an effort unintentionally can cause collateral effects including low diversity and unbalanced genres or categories, benefiting particular groups of categories. This paper proposes an approach to create recommendation lists with a calibrated balance of genres, avoiding disproportion between the user's profile interests and the recommendation list. The calibrated recommendations consider concomitantly the relevance and the divergence between the genres distributions extracted from the user's preference and the recommendation list. The main claim is that calibration can contribute positively to generate fairer recommendations. In particular, we propose a new trade-off equation, which considers the users' bias to provide a recommendation list that follows the users' tendencies. Moreover, we propose a conceptual framework and a decision protocol to generate more than one thousand combinations of calibrated systems in order to find the best combination. We compare our approach against state-of-the-art approaches using multiple domain datasets, which are analyzed by rank and calibration metrics. The results indicate that the trade-off, which considers the users' bias, produces positive effects on precision and fairness, thus generating recommendation lists that respect the genre distribution. Through the decision protocol, we also found the best system for each dataset.

Submitted 7 April, 2022; originally announced April 2022.

Comments: 12 Tables and 5 figures. Submitted to a journal

arXiv:2203.12600 [pdf, other]

doi 10.13140/RG.2.2.17415.47520

Standing Forest Coin (SFC)

Authors: Marcelo de A. Borges, Guido L. de S. Filho, Cicero Inacio da Silva, Anderson M. P. Barros, Raul V. B. J. Britto, Nivaldo M. de C. Junior, Daniel F. L. de Souza

Abstract: This article describes a proposal to create a digital currency that allows the decentralized collection of resources directed to initiatives and activities that aim to protect the Brazilian Amazon ecosystem by using blockchain and digital contracts. In addition to the digital currency, the goal is to design a smart contract based on oracles to ensure credibility and security for investors and donors of financial resources invested in projects within the Standing Forest Coin (SFC - standingforest.org).

Submitted 4 March, 2022; originally announced March 2022.

Comments: in Portuguese

MSC Class: 58-04 ACM Class: J.7

arXiv:2112.13819 [pdf, other]

Trajectory Planning for Hybrid Unmanned Aerial Underwater Vehicles with Smooth Media Transition

Authors: Pedro Miranda Pinheiro, Armando Alves Neto, Ricardo Bedin Grando, Cesar Bastos da Silva, Vivian Misaki Aoki, Dayana Cardoso, Alexandre Campos Horn, Paulo Lilles Jorge Drews-Jr

Abstract: In the last decade, a great effort has been employed in the study of Hybrid Unmanned Aerial Underwater Vehicles, robots that can easily fly and dive into the water with different levels of mechanical adaptation. However, most of this literature is concentrated on physical design, practical issues of construction, and, more recently, low-level control strategies. Little has been done in the context of high-level intelligence, such as motion planning and interactions with the real world. Therefore, we propose in this paper a trajectory planning approach that allows collision avoidance against unknown obstacles and smooth transitions between aerial and aquatic media. Our method is based on a variant of the classic Rapidly-exploring Random Tree, whose main advantages are the capability to deal with obstacles, complex nonlinear dynamics, model uncertainties, and external disturbances. The approach uses the dynamic model of the Hydrone, a hybrid vehicle proposed with high underwater performance, but we believe it can be easily generalized to other types of aerial/aquatic platforms. In the experimental section, we present simulated results in environments filled with obstacles, where the robot is commanded to perform different media movements, demonstrating the applicability of our strategy.

Submitted 27 December, 2021; originally announced December 2021.

Comments: Accepted to the Journal of Intelligent & Robotic Systems

arXiv:2112.13721 [pdf, other]

Variational symplectic diagonally implicit Runge-Kutta methods for isospectral systems

Authors: Clauson Carvalho da Silva, Christian Lessig

Abstract: Isospectral flows appear in a variety of applications, e.g. the Toda lattice in solid state physics or in discrete models for two-dimensional hydrodynamics, with the isospectral property often corresponding to mathematically or physically important conservation laws. Their most prominent feature, i.e. the conservation of the eigenvalues of the matrix state variable, should therefore be retained when discretizing these systems. Recently, it was shown how isospectral Runge-Kutta methods can, in the Lie-Poisson case also considered in our work, be obtained through Hamiltonian reduction of symplectic Runge-Kutta methods on the cotangent bundle of a Lie group. We provide the Lagrangian analogue and, in the case of symplectic diagonal implicit Runge-Kutta methods, derive the methods through a discrete Euler-Poincare reduction. Our derivation relies on a formulation of diagonally implicit isospectral Runge-Kutta methods in terms of the Cayley transform, generalizing earlier work that showed this for the implicit midpoint rule. Our work is also a generalization of earlier variational Lie group integrators that, interestingly, appear when these are interpreted as update equations for intermediate time points. From a practical point of view, our results allow for a simple implementation of higher order isospectral methods and we demonstrate this with numerical experiments where both the isospectral property and energy are conserved to high accuracy.

Submitted 27 December, 2021; originally announced December 2021.

MSC Class: 65L06; 65P10

arXiv:2108.12214 [pdf, other]

Machine Learning for Performance Prediction of Spark Cloud Applications

Authors: Alexandre Maros, Fabricio Murai, Ana Paula Couto da Silva, Jussara M. Almeida, Marco Lattuada, Eugenio Gianniti, Marjan Hosseini, Danilo Ardagna

Abstract: Big data applications and analytics are employed in many sectors for a variety of goals: improving customer satisfaction, predicting market behavior or improving processes in public health. These applications consist of complex software stacks that are often run on cloud systems. Predicting execution times is important for estimating the cost of cloud services and for effectively managing the underlying resources at runtime. Machine Learning (ML), providing black box solutions to model the relationship between application performance and system configuration without requiring detailed knowledge of the system, has become a popular way of predicting the performance of big data applications. We investigate the cost-benefits of using supervised ML models for predicting the performance of applications on Spark, one of today's most widely used frameworks for big data analysis. We compare our approach with Ernest (an ML-based technique proposed in the literature by the Spark inventors) on a range of scenarios, application workloads, and cloud system configurations. Our experiments show that Ernest can accurately estimate the performance of very regular applications, but it fails when applications exhibit more irregular patterns and/or when extrapolating on bigger data set sizes. Results show that our models match or exceed Ernest's performance, sometimes enabling us to reduce the prediction error from 126-187% to only 5-19%.

Submitted 27 August, 2021; originally announced August 2021.

Comments: Published in 2019 IEEE 12th International Conference on Cloud Computing (CLOUD)

ACM Class: B.8.2; I.2

arXiv:2105.12092 [pdf, other]

doi 10.1016/j.ins.2024.120128

Trajectory Modeling via Random Utility Inverse Reinforcement Learning

Authors: Anselmo R. Pitombeira-Neto, Helano P. Santos, Ticiana L. Coelho da Silva, José Antonio F. de Macedo

Abstract: We consider the problem of modeling trajectories of drivers in a road network from the perspective of inverse reinforcement learning. Cars are detected by sensors placed on sparsely distributed points on the street network of a city. As rational agents, drivers are trying to maximize some reward function unknown to an external observer. We apply the concept of random utility from econometrics to model the unknown reward function as a function of observed and unobserved features. In contrast to current inverse reinforcement learning approaches, we do not assume that agents act according to a stochastic policy; rather, we assume that agents act according to a deterministic optimal policy and show that randomness in data arises because the exact rewards are not fully observed by an external observer. We introduce the concept of extended state to cope with unobserved features and develop a Markov decision process formulation of drivers' decisions. We present theoretical results which guarantee the existence of solutions and show that maximum entropy inverse reinforcement learning is a particular case of our approach. Finally, we illustrate Bayesian inference on model parameters through a case study with real trajectory data from a large city in Brazil.

Submitted 10 January, 2023; v1 submitted 25 May, 2021; originally announced May 2021.

Comments: 31 pages; expanded version, with the addition of proofs not present in the first version

arXiv:2105.09452 [pdf, other]

Minimum-Delay Adaptation in Non-Stationary Reinforcement Learning via Online High-Confidence Change-Point Detection

Authors: Lucas N. Alegre, Ana L. C. Bazzan, Bruno C. da Silva

Abstract: Non-stationary environments are challenging for reinforcement learning algorithms. If the state transition and/or reward functions change based on latent factors, the agent is effectively tasked with optimizing a behavior that maximizes performance over a possibly infinite random sequence of Markov Decision Processes (MDPs), each of which is drawn from some unknown distribution. We call each such MDP a context. Most related works make strong assumptions such as knowledge about the distribution over contexts, the existence of pre-training phases, or a priori knowledge about the number, sequence, or boundaries between contexts. We introduce an algorithm that efficiently learns policies in non-stationary environments. It analyzes a possibly infinite stream of data and computes, in real-time, high-confidence change-point detection statistics that reflect whether novel, specialized policies need to be created and deployed to tackle novel contexts, or whether previously-optimized ones might be reused. We show that (i) this algorithm minimizes the delay until unforeseen changes to a context are detected, thereby allowing for rapid responses; and (ii) it bounds the rate of false alarm, which is important in order to minimize regret. Our method constructs a mixture model composed of a (possibly infinite) ensemble of probabilistic dynamics predictors that model the different modes of the distribution over underlying latent MDPs. We evaluate our algorithm on high-dimensional continuous reinforcement learning problems and show that it outperforms state-of-the-art (model-free and model-based) RL algorithms, as well as state-of-the-art meta-learning methods specially designed to deal with non-stationarity.

Submitted 19 May, 2021; originally announced May 2021.

Comments: Published at Proc. of the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021)

MSC Class: 68T05

Journal ref: Proceedings of the 20th International Conference on Autonomous Agents and Multiagent Systems. 2021. 97-105

arXiv:2105.04137 [pdf, other]

doi 10.46298/dmtcs.7474

On the inversion number of oriented graphs

Authors: Jørgen Bang-Jensen, Jonas Costa Ferreira da Silva, Frédéric Havet

Abstract: Let $D$ be an oriented graph. The inversion of a set $X$ of vertices in $D$ consists in reversing the direction of all arcs with both ends in $X$. The inversion number of $D$, denoted by ${\rm inv}(D)$, is the minimum number of inversions needed to make $D$ acyclic. Denoting by $τ(D)$, $τ' (D)$, and $ν(D)$ the cycle transversal number, the cycle arc-transversal number and the cycle packing number of $D$ respectively, one shows that ${\rm inv}(D) \leq τ' (D)$, ${\rm inv}(D) \leq 2τ(D)$ and there exists a function $g$ such that ${\rm inv}(D)\leq g(ν(D))$. We conjecture that for any two oriented graphs $L$ and $R$, ${\rm inv}(L\rightarrow R) ={\rm inv}(L) +{\rm inv}(R)$ where $L\rightarrow R$ is the dijoin of $L$ and $R$. This would imply that the first two inequalities are tight. We prove this conjecture when ${\rm inv}(L)\leq 1$ and ${\rm inv}(R)\leq 2$ and when ${\rm inv}(L) ={\rm inv}(R)=2$ and $L$ and $R$ are strongly connected. We also show that the function $g$ of the third inequality satisfies $g(1)\leq 4$. We then consider the complexity of deciding whether ${\rm inv}(D)\leq k$ for a given oriented graph $D$. We show that it is NP-complete for $k=1$, which together with the above conjecture would imply that it is NP-complete for every $k$. This contrasts with a result of Belkhechine et al. which states that deciding whether ${\rm inv}(T)\leq k$ for a given tournament $T$ is polynomial-time solvable.

Submitted 18 December, 2022; v1 submitted 10 May, 2021; originally announced May 2021.

Journal ref: Discrete Mathematics & Theoretical Computer Science, vol. 23 no. 2, special issue in honour of Maurice Pouzet, Special issues (December 21, 2022) dmtcs:7474
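
The definitions in the abstract above (inverting a vertex set $X$, and the inversion number as the fewest inversions that make $D$ acyclic) can be illustrated with a small brute-force sketch. This is not the authors' method: the function names `invert`, `is_acyclic`, and `inversion_number` and the search bound `max_k` are assumptions made for illustration, and the exhaustive search is only practical for tiny graphs.

```python
from itertools import combinations


def invert(arcs, X):
    """Reverse every arc whose two endpoints both lie in X."""
    X = set(X)
    return {(v, u) if (u in X and v in X) else (u, v) for (u, v) in arcs}


def is_acyclic(vertices, arcs):
    """Kahn's algorithm: the digraph is acyclic iff every vertex can be removed."""
    indeg = {v: 0 for v in vertices}
    out = {v: [] for v in vertices}
    for u, v in arcs:
        out[u].append(v)
        indeg[v] += 1
    stack = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for w in out[u]:
            indeg[w] -= 1
            if indeg[w] == 0:
                stack.append(w)
    return removed == len(vertices)


def inversion_number(vertices, arcs, max_k=3):
    """Smallest k <= max_k such that k inversions make the graph acyclic.

    Inversions commute and inverting the same set twice cancels out, so it is
    enough to try families of k distinct vertex subsets of size >= 2.
    Returns None if no such family exists within the (hypothetical) bound.
    """
    vertices = list(vertices)
    subsets = [set(c) for r in range(2, len(vertices) + 1)
               for c in combinations(vertices, r)]
    for k in range(max_k + 1):
        for family in combinations(subsets, k):
            current = set(arcs)
            for X in family:
                current = invert(current, X)
            if is_acyclic(vertices, current):
                return k
    return None


triangle = {("a", "b"), ("b", "c"), ("c", "a")}   # directed 3-cycle
transitive = {("a", "b"), ("b", "c"), ("a", "c")}  # already acyclic
print(inversion_number("abc", triangle), inversion_number("abc", transitive))
```

The directed triangle needs one inversion (for example, inverting `{a, b}`), while the transitive tournament needs none.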

arXiv:2105.04083 [pdf, other]

The Behavior of Internet Traffic for Internet Services during COVID-19 Pandemic Scenario

Authors: Carlos Alexandre Gouvea da Silva, Allan Christian Krainski Ferrari, Cristiano Osinski, Douglas Antonio Firmino Pelacini

Abstract: Since the end of 2019, the SARS-CoV-2 virus, which causes COVID-19, has spread rapidly around the world, forcing many governments to impose restrictions or lockdowns to combat the pandemic. With movement restricted in almost all countries of the world, workers and students needed to carry on their activities at home. As a result, people's behavior, habits, and the way they used the Internet changed significantly. Like office professionals, younger people played an important role in this behavior, especially in the type of resources they used. As a result, the characterization and traffic of communication networks were affected in some way. In this perspective article, we draw on the many available studies of the effect of COVID-19 on networks and investigate the effects on Internet traffic of using services such as video streaming, video conferencing, and gaming during the pandemic months of 2020.

Submitted 9 May, 2021; originally announced May 2021.

Comments: 4 pages, 2 figures, Submitted to XXXIX Simpósio Brasileiro de Telecomunicações e Processamento de Sinais, SBrT 2021, Fortaleza, CE, Brasil

arXiv:2104.12820 [pdf, other]

Universal Off-Policy Evaluation

Authors: Yash Chandak, Scott Niekum, Bruno Castro da Silva, Erik Learned-Miller, Emma Brunskill, Philip S. Thomas

Abstract: When faced with sequential decision-making problems, it is often useful to be able to predict what would happen if decisions were made using a new policy. Those predictions must often be based on data collected under some previously used decision-making rule. Many previous methods enable such off-policy (or counterfactual) estimation of the expected value of a performance measure called the return. In this paper, we take the first steps towards a universal off-policy estimator (UnO) -- one that provides off-policy estimates and high-confidence bounds for any parameter of the return distribution. We use UnO for estimating and simultaneously bounding the mean, variance, quantiles/median, inter-quantile range, CVaR, and the entire cumulative distribution of returns. Finally, we also discuss UnO's applicability in various settings, including fully observable, partially observable (i.e., with unobserved confounders), Markovian, non-Markovian, stationary, smoothly non-stationary, and discrete distribution shifts.

Submitted 2 November, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

Comments: Accepted at Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021)

arXiv:2103.02383 [pdf, other]

doi 10.1016/j.ifacol.2021.10.328

Nonlinear MPC for Offset-Free Tracking of systems learned by GRU Neural Networks

Authors: Fabio Bonassi, C. F. Oliveira da Silva, Riccardo Scattolini

Abstract: The use of Recurrent Neural Networks (RNNs) for system identification has recently gathered increasing attention, thanks to their black-box modeling capabilities. Although RNNs have been fruitfully adopted in many applications, only a few works are devoted to providing rigorous theoretical foundations that justify their use for control purposes. The aim of this paper is to describe how stable Gated Recurrent Units (GRUs), a particular RNN architecture, can be trained and employed in a Nonlinear MPC framework to perform offset-free tracking of constant references with guaranteed closed-loop stability. The proposed approach is tested on a pH neutralization process benchmark, showing remarkable performance.

Submitted 25 January, 2022; v1 submitted 3 March, 2021; originally announced March 2021.

Comments: This work is the extended version of the article accepted at the Third IFAC Conference on Modelling, Identification and Control of Nonlinear Systems (MICNON 2021) for publication under a Creative Commons Licence CC-BY-NC-ND

arXiv:2103.00535 [pdf, other]

A multi-objective time series analysis of community mobility reduction comparing first and second COVID-19 waves

Authors: Gabriela Cavalcante da Silva, Fernanda Monteiro de Almeida, Sabrina Oliveira, Leonardo C. T. Bezerra, Elizabeth F. Wanner, Ricardo H. C. Takahashi

Abstract: With the logistical challenges faced by most countries for the production, distribution, and application of vaccines for the novel coronavirus disease (COVID-19), social distancing (SD) remains the most tangible approach to mitigate the spread of the virus. To assist SD monitoring, several tech companies have made publicly available anonymized mobility data. In this work, we conduct a multi-objective mobility reduction rate comparison between the first and second COVID-19 waves in several localities from America and Europe using Google community mobility reports (CMR) data. Through multi-dimensional visualization, we are able to compare in a Pareto-compliant way the reduction in mobility from the different lockdown periods for each locality selected, simultaneously considering all place categories provided in CMR. In addition, our analysis comprises a 56-day lockdown period for each locality and COVID-19 wave, which we analyze both as 56-day periods and as 14-day consecutive windows. Results vary considerably as a function of the locality considered, particularly when the temporal evolution of the mobility reduction is considered. We thus discuss each locality individually, relating social distancing measures and the reduction observed.

Submitted 28 February, 2021; originally announced March 2021.

arXiv:2102.12970 [pdf, other]

doi 10.1007/s11761-020-00306-w

A microservice-based framework for exploring data selection in cross-building knowledge transfer

Authors: Mouna Labiadh, Christian Obrecht, Catarina Ferreira da Silva, Parisa Ghodous

Abstract: Supervised deep learning has achieved remarkable success in various applications. Successful machine learning application, however, depends on the availability of a sufficiently large amount of data. In the absence of data from the target domain, representative data collection from multiple sources is often needed. However, a model trained on existing multi-source data might generalize poorly on the unseen target domain. This problem is referred to as domain shift. In this paper, we explore the suitability of multi-source training data selection to tackle the domain shift challenge in the context of domain generalization. We also propose a microservice-oriented methodology for supporting this solution. We perform our experimental study on the use case of building energy consumption prediction. Experimental results suggest that minimal building description is capable of improving cross-building generalization performance when used to select energy consumption data.

Submitted 23 February, 2021; originally announced February 2021.

Comments: Service Oriented Computing and Applications, Springer, 2020

arXiv:2102.00790 [pdf, other]

Using a Cyber Digital Twin for Continuous Automotive Security Requirements Verification

Authors: Ana Cristina Franco da Silva, Stefan Wagner, Eddie Lazebnik, Eyal Traitel

Abstract: A Digital Twin (DT) is a digital representation of a physical object used to simulate it before it is built or to predict failures after the object is deployed. In this article, we introduce our approach, which applies the concept of a Cyber Digital Twin (CDT) to automotive software for the purpose of security analysis. In our approach, automotive firmware is transformed into a CDT, which contains automatically extracted, security-relevant information from the firmware. Based on the CDT, we evaluate security requirements through automated analysis and requirements verification using policy enforcement checks and vulnerability detection. The evaluation of a CDT is conducted continuously, integrating new checks derived from new security requirements and from newly disclosed vulnerabilities. We applied our approach to about 100 automotive firmware images. On average, about 600 publicly disclosed vulnerabilities and 80 unknown weaknesses were detected per firmware image in the pre-production phase. Therefore, the use of a CDT enables efficient continuous verification of security requirements.

Submitted 30 September, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

Comments: 10 pages, 1 figure

ACM Class: D.2.5

arXiv:2011.13847 [pdf, other]

Autonomous learning of multiple, context-dependent tasks

Authors: Vieri Giuliano Santucci, Davide Montella, Bruno Castro da Silva, Gianluca Baldassarre

Abstract: When facing the problem of autonomously learning multiple tasks with reinforcement learning systems, researchers typically focus on solutions where just one parametrised policy per task is sufficient to solve them. However, in complex environments presenting different contexts, the same task might need a set of different skills to be solved. These situations pose two challenges: (a) to recognise the different contexts that need different policies; (b) to quickly learn the policies to accomplish the same tasks in the newly discovered contexts. These two challenges are even harder if faced within an open-ended learning framework where an agent has to autonomously discover the goals that it might accomplish in a given environment, and also to learn the motor skills to accomplish them. We propose a novel open-ended learning robot architecture, C-GRAIL, that solves the two challenges in an integrated fashion. In particular, the architecture is able to detect new relevant contexts, and ignore irrelevant ones, on the basis of the decrease of the expected performance for a given goal. Moreover, the architecture can quickly learn the policies for the new contexts by exploiting transfer learning, importing knowledge from already acquired policies. The architecture is tested in a simulated robotic environment involving a robot that autonomously learns to reach relevant target objects in the presence of multiple obstacles generating several different contexts. The proposed architecture outperforms other models not using the proposed autonomous context-discovery and transfer-learning mechanisms.

Submitted 27 November, 2020; originally announced November 2020.

arXiv:2009.10648 [pdf, other]

Google COVID-19 community mobility reports: insights from multi-criteria decision making

Authors: Gabriela Cavalcante da Silva, Sabrina Oliveira, Elizabeth F. Wanner, Leonardo C. T. Bezerra

Abstract: Social distancing (SD) has been critical in the fight against the novel coronavirus disease (COVID-19). To aid SD monitoring, many technology companies have made available mobility data, the most prominent example being the community mobility reports (CMR) provided by Google. Given the wide range of research fields that have been drawing insights from CMR data, there has been a rising concern for methodological discussion on how to use them. Indeed, Google recently released their own guidelines, concerning the nature of the place categories and the need for calibrating regional values. In this work, we discuss how measures developed in the field of multi-criteria decision making (MCDM) might benefit researchers analyzing this data. Concretely, we discuss how Pareto dominance and performance measures adopted in MCDM enable the mobility evaluation for (i) multiple categories for a given time period and (ii) multiple categories over multiple time periods. We empirically demonstrate these approaches conducting both a region- and country-level analysis, comparing some of the most relevant outbreak examples from different continents.

Submitted 17 September, 2020; originally announced September 2020.
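
As a rough illustration of the Pareto-dominance comparison mentioned in the abstract above, the sketch below finds the non-dominated localities among a few mobility-reduction vectors. Everything specific here is an assumption: the locality names and numbers are invented placeholders rather than CMR data, each vector holds one reduction percentage per place category, and a larger reduction is treated as better.

```python
def dominates(a, b):
    """True if vector a is at least as good as b everywhere and strictly better somewhere.

    Assumption: each entry is a mobility reduction (%), and larger is better.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Names of the localities that no other locality dominates."""
    return {
        name
        for name, p in points.items()
        if not any(dominates(q, p)
                   for other, q in points.items() if other != name)
    }


# Hypothetical reductions for two place categories (e.g. retail, transit).
reductions = {
    "locality_A": (30, 40),
    "locality_B": (20, 35),  # worse than A in both categories
    "locality_C": (35, 10),  # better than A in one category, worse in the other
}
print(sorted(pareto_front(reductions)))
```

With these placeholder values, `locality_B` is dominated by `locality_A`, while `locality_A` and `locality_C` are incomparable and both remain on the front.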

arXiv:2006.15401 [pdf, other]

You Shall not Pass: Avoiding Spurious Paths in Shortest-Path Based Centralities in Multidimensional Complex Networks

Authors: Klaus Wehmuth, Artur Ziviani, Leonardo Chinelate Costa, Ana Paula Couto da Silva, Alex Borges Vieira

Abstract: In complex network analysis, centralities based on shortest paths, such as betweenness and closeness, are widely used. More recently, many complex systems are being represented by time-varying, multilayer, and time-varying multilayer networks, i.e. multidimensional (or high order) networks. Nevertheless, it is well-known that the aggregation process may create spurious paths on the aggregated view of such multidimensional (high order) networks. Consequently, these spurious paths may then cause shortest-path based centrality metrics to produce incorrect results, thus undermining the network centrality analysis. In this context, we propose a method able to avoid taking into account spurious paths when computing centralities based on shortest paths in multidimensional (or high order) networks. Our method is based on MultiAspect Graphs (MAG) to represent the multidimensional networks and we show that well-known centrality algorithms can be straightforwardly adapted to the MAG environment. Moreover, we show that, by using this MAG representation, pitfalls usually associated with spurious paths resulting from aggregation in multidimensional networks can be avoided at the time of the aggregation process. As a result, shortest-path based centralities are assured to be computed correctly for multidimensional networks, without taking into account spurious paths that could otherwise lead to incorrect results. We also present a case study that shows the impact of spurious paths in the computing of shortest paths and consequently of shortest-path based centralities, such as betweenness and closeness, thus illustrating the importance of this contribution.

Submitted 19 August, 2020; v1 submitted 27 June, 2020; originally announced June 2020.

Comments: 17 pages, 6 figures

arXiv:2006.13432 [pdf, other]

Local-Search Based Heuristics for Advertisement Scheduling

Authors: Mauro R. C. da Silva, Rafael C. S. Schouery

Abstract: In the MAXSPACE problem, given a set of ads A, one wants to place a subset A' of A into K slots B_1, ..., B_K of size L. Each ad A_i in A has size s_i and frequency w_i. A schedule is feasible if the total size of ads in any slot is at most L, and each ad A_i in A' appears in exactly w_i slots. The goal is to find a feasible schedule that maximizes the space occupied in all slots. We introduce MAXSPACE-RDWV, a MAXSPACE generalization with release dates, deadlines, variable frequency, and generalized profit. In MAXSPACE-RDWV each ad A_i has a release date r_i >= 1, a deadline d_i >= r_i, a profit v_i that may be unrelated to s_i, and lower and upper bounds w^min_i and w^max_i for frequency. In this problem, an ad may only appear in a slot B_j with r_i <= j <= d_i, and the goal is to find a feasible schedule that maximizes the sum of values of scheduled ads. This paper presents some algorithms based on the meta-heuristics GRASP, VNS, Local Search, and Tabu Search for MAXSPACE and MAXSPACE-RDWV. We compare our proposed algorithms with Hybrid-GA proposed by Kumar et al. (2006). We also create a version of Hybrid-GA for MAXSPACE-RDWV and compare it with our meta-heuristics. Some meta-heuristics, such as VNS and GRASP+VNS, have better results than Hybrid-GA for both problems. In our heuristics, we apply a technique that alternates between maximizing and minimizing the fullness of slots to obtain better solutions. We also applied a data structure called BIT to the neighborhood computation in MAXSPACE-RDWV and showed that this enabled our algorithms to run more iterations.
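The feasibility condition and objective of basic MAXSPACE, as stated in the abstract above, can be illustrated with a minimal sketch. This is not the authors' code: the function names `is_feasible` and `occupied_space`, the representation of a schedule as a list of sets of ad indices, and the sample data are all assumptions made for illustration. Representing each slot as a set also assumes an ad appears at most once per slot.

```python
from collections import Counter

def is_feasible(schedule, sizes, freqs, L):
    """Check the MAXSPACE feasibility condition.

    schedule: list of K slots, each a set of ad indices (hypothetical layout).
    sizes[i], freqs[i]: size s_i and frequency w_i of ad i.
    L: capacity of every slot.
    """
    counts = Counter()
    for slot in schedule:
        # The total size of ads in any slot must be at most L.
        if sum(sizes[i] for i in slot) > L:
            return False
        counts.update(slot)
    # Every scheduled ad must appear in exactly w_i slots.
    return all(counts[i] == freqs[i] for i in counts)

def occupied_space(schedule, sizes):
    """Objective value: total space occupied over all slots."""
    return sum(sizes[i] for slot in schedule for i in slot)

# Illustrative instance (made-up numbers): 3 ads, K = 3 slots, L = 5.
sizes = [3, 2, 4]
freqs = [2, 1, 1]
L = 5
schedule = [{0, 1}, {0}, {2}]
```

Here `is_feasible(schedule, sizes, freqs, L)` checks both constraints in one pass over the slots, and `occupied_space(schedule, sizes)` gives the value a heuristic would try to maximize. The release dates, deadlines, and frequency bounds of MAXSPACE-RDWV are not modeled in this sketch.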

Submitted 16 September, 2022; v1 submitted 23 June, 2020; originally announced June 2020.

arXiv:2006.13430 [pdf, other]

Approximation algorithms for the MAXSPACE advertisement problem

Authors: Mauro R. C. da Silva, Lehilton L. C. Pedrosa, Rafael C. S. Schouery

Abstract: $\newcommand{\cala}{\mathcal{A}}$ In MAXSPACE, given a set of ads $\cala$, one wants to schedule a subset ${\cala'\subseteq\cala}$ into $K$ slots ${B_1, \dots, B_K}$ of size $L$. Each ad ${A_i \in \cala}$ has a size $s_i$ and a frequency $w_i$. A schedule is feasible if the total size of ads in any slot is at most $L$, and each ad ${A_i \in \cala'}$ appears in exactly $w_i$ slots and at most once per slot. The goal is to find a feasible schedule that maximizes the sum of the space occupied by all slots. We consider a generalization called MAXSPACE-R for which an ad $A_i$ also has a release date $r_i$ and may only appear in a slot $B_j$ if ${j \ge r_i}$. For this variant, we give a $1/9$-approximation algorithm. Furthermore, we consider MAXSPACE-RDV for which an ad $A_i$ also has a deadline $d_i$ (and may only appear in a slot $B_j$ with $r_i \le j \le d_i$), and a value $v_i$ that is the gain of each assigned copy of $A_i$ (which can be unrelated to $s_i$). We present a polynomial-time approximation scheme for this problem when $K$ is bounded by a constant. This is the best factor one can expect since MAXSPACE is strongly NP-hard, even if $K = 2$.

Submitted 8 May, 2023; v1 submitted 23 June, 2020; originally announced June 2020.

arXiv:2006.05514 [pdf, other]

A Machine Learning Early Warning System: Multicenter Validation in Brazilian Hospitals

Authors: Jhonatan Kobylarz, Henrique D. P. dos Santos, Felipe Barletta, Mateus Cichelero da Silva, Renata Vieira, Hugo M. P. Morales, Cristian da Costa Rocha

Abstract: Early recognition of clinical deterioration is one of the main steps for reducing inpatient morbidity and mortality. The challenging task of clinical deterioration identification in hospitals lies in the intense daily routines of healthcare practitioners, in the unconnected patient data stored in the Electronic Health Records (EHRs) and in the usage of low accuracy scores. Since hospital wards are given less attention compared to the Intensive Care Unit (ICU), we hypothesized that connecting a platform to a stream of EHR data would drastically improve awareness of dangerous situations and could thus assist the healthcare team. With the application of machine learning, the system is capable of considering the patient's full history, and through the use of high-performing predictive models, an intelligent early warning system is enabled. In this work we used 121,089 medical encounters from six different hospitals and 7,540,389 data points, and we compared popular ward protocols with six different scalable machine learning methods (three are classic machine learning models, logistic and probabilistic-based models, and three gradient boosted models). The results showed an advantage in AUC (Area Under the Receiver Operating Characteristic Curve) of 25 percentage points in the best Machine Learning model result compared to the current state-of-the-art protocols. This is shown by the generalization of the algorithm with leave-one-group-out (AUC of 0.949) and the robustness through cross-validation (AUC of 0.961). We also perform experiments to compare several window sizes to justify the use of five patient timestamps. A sample dataset, experiments, and code are available for replicability purposes.

Submitted 9 June, 2020; originally announced June 2020.

Comments: Paper accepted by IEEE 33rd International Symposium on Computer Based Medical Systems (CBMS) 2020

MSC Class: 68T42 ACM Class: I.2.1

Showing 1–50 of 83 results for author: da Silva, C