-
"A Nova Eletricidade": Aplicações, Riscos e Tendências da IA Moderna -- "The New Electricity": Applications, Risks, and Trends in Current AI
Authors:
Ana L. C. Bazzan,
Anderson R. Tavares,
André G. Pereira,
Cláudio R. Jung,
Jacob Scharcanski,
Joel Luis Carbonera,
Luís C. Lamb,
Mariana Recamonde-Mendoza,
Thiago L. T. da Silveira,
Viviane Moreira
Abstract:
The thought-provoking analogy between AI and electricity, made by computer scientist and entrepreneur Andrew Ng, summarizes the deep transformation that recent advances in Artificial Intelligence (AI) have triggered in the world. This chapter presents an overview of the ever-evolving landscape of AI, written in Portuguese. With no intent to exhaust the subject, we explore the AI applications that are redefining sectors of the economy, impacting society and humanity. We analyze the risks that may come along with rapid technological progress and future trends in AI, an area that is on the path to becoming a general-purpose technology, just like electricity, which revolutionized society in the 19th and 20th centuries.
Submitted 8 October, 2023;
originally announced October 2023.
-
Sample-Efficient Multi-Objective Learning via Generalized Policy Improvement Prioritization
Authors:
Lucas N. Alegre,
Ana L. C. Bazzan,
Diederik M. Roijers,
Ann Nowé,
Bruno C. da Silva
Abstract:
Multi-objective reinforcement learning (MORL) algorithms tackle sequential decision problems where agents may have different preferences over (possibly conflicting) reward functions. Such algorithms often learn a set of policies (each optimized for a particular agent preference) that can later be used to solve problems with novel preferences. We introduce a novel algorithm that uses Generalized Policy Improvement (GPI) to define principled, formally-derived prioritization schemes that improve sample-efficient learning. They implement active-learning strategies by which the agent can (i) identify the most promising preferences/objectives to train on at each moment, to more rapidly solve a given MORL problem; and (ii) identify which previous experiences are most relevant when learning a policy for a particular agent preference, via a novel Dyna-style MORL method. We prove our algorithm is guaranteed to always converge to an optimal solution in a finite number of steps, or an $ε$-optimal solution (for a bounded $ε$) if the agent is limited and can only identify possibly sub-optimal policies. We also prove that our method monotonically improves the quality of its partial solutions while learning. Finally, we introduce a bound that characterizes the maximum utility loss (with respect to the optimal solution) incurred by the partial solutions computed by our method throughout learning. We empirically show that our method outperforms state-of-the-art MORL algorithms in challenging multi-objective tasks, both with discrete and continuous state and action spaces.
Submitted 23 March, 2023; v1 submitted 18 January, 2023;
originally announced January 2023.
-
Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer
Authors:
Lucas N. Alegre,
Ana L. C. Bazzan,
Bruno C. da Silva
Abstract:
In many real-world applications, reinforcement learning (RL) agents might have to solve multiple tasks, each one typically modeled via a reward function. If reward functions are expressed linearly, and the agent has previously learned a set of policies for different tasks, successor features (SFs) can be exploited to combine such policies and identify reasonable solutions for new problems. However, the identified solutions are not guaranteed to be optimal. We introduce a novel algorithm that addresses this limitation. It allows RL agents to combine existing policies and directly identify optimal policies for arbitrary new problems, without requiring any further interactions with the environment. We first show (under mild assumptions) that the transfer learning problem tackled by SFs is equivalent to the problem of learning to optimize multiple objectives in RL. We then introduce an SF-based extension of the Optimistic Linear Support algorithm to learn a set of policies whose SFs form a convex coverage set. We prove that policies in this set can be combined via generalized policy improvement to construct optimal behaviors for any new linearly-expressible tasks, without requiring any additional training samples. We empirically show that our method outperforms state-of-the-art competing algorithms both in discrete and continuous domains under value function approximation.
Submitted 22 June, 2022;
originally announced June 2022.
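The generalized policy improvement (GPI) step described in the abstract above can be illustrated with a minimal sketch. It assumes tabular successor features stored in a hypothetical nested list `psi[policy][state][action]` (each entry a feature vector) and a linear reward weight vector `w`; these names and the data layout are illustrative assumptions, not the authors' implementation.

```python
def gpi_action(psi, w, state):
    """Pick the GPI action for `state` under linear reward weights `w`.

    psi[i][s][a] is assumed to hold the successor feature vector of
    policy i at state s and action a (hypothetical tabular layout).
    """
    def q_value(policy, action):
        # Q-value of a policy on the new task: dot product of SFs and w.
        return sum(f * wi for f, wi in zip(psi[policy][state][action], w))

    n_actions = len(psi[0][state])
    # For each action, take the best value over all stored policies.
    best = [max(q_value(i, a) for i in range(len(psi)))
            for a in range(n_actions)]
    # Act greedily with respect to that maximum.
    return max(range(n_actions), key=best.__getitem__)
```

Under these assumptions, a new task is specified only by `w`, and the action is chosen without further interaction with the environment.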
-
Improving Urban Mobility: using artificial intelligence and new technologies to connect supply and demand
Authors:
Ana L. C. Bazzan
Abstract:
As the demand for mobility in our society seems to increase, the various issues centered on urban mobility are among those that most worry city inhabitants on this planet. For instance, how can one go from A to B in an efficient (but also less stressful) way? These questions and concerns have not changed even during the COVID-19 pandemic; on the contrary, as things currently stand, people who are avoiding public transportation are only contributing to an increase in vehicular traffic. The area of intelligent transportation systems (ITS) aims at investigating how to apply information and communication technologies to problems related to transportation. This may mean monitoring and managing the infrastructure (e.g., roads and traffic signals). However, ITS currently also targets the management of demand. In this panorama, artificial intelligence plays an important role, especially with the advances in machine learning that translate into the use of computer vision, connected and autonomous vehicles, and agent-based simulation, among others. In the present work, a survey of several works developed by our group is discussed from a holistic perspective, i.e., they cover not only the supply side (as commonly found in ITS works), but also the demand side, and, from a novel perspective, the integration of both.
Submitted 18 March, 2022;
originally announced April 2022.
-
Minimum-Delay Adaptation in Non-Stationary Reinforcement Learning via Online High-Confidence Change-Point Detection
Authors:
Lucas N. Alegre,
Ana L. C. Bazzan,
Bruno C. da Silva
Abstract:
Non-stationary environments are challenging for reinforcement learning algorithms. If the state transition and/or reward functions change based on latent factors, the agent is effectively tasked with optimizing a behavior that maximizes performance over a possibly infinite random sequence of Markov Decision Processes (MDPs), each of which is drawn from some unknown distribution. We call each such MDP a context. Most related works make strong assumptions such as knowledge about the distribution over contexts, the existence of pre-training phases, or a priori knowledge about the number, sequence, or boundaries between contexts. We introduce an algorithm that efficiently learns policies in non-stationary environments. It analyzes a possibly infinite stream of data and computes, in real time, high-confidence change-point detection statistics that reflect whether novel, specialized policies need to be created and deployed to tackle novel contexts, or whether previously optimized ones might be reused. We show that (i) this algorithm minimizes the delay until unforeseen changes to a context are detected, thereby allowing for rapid responses; and (ii) it bounds the rate of false alarms, which is important in order to minimize regret. Our method constructs a mixture model composed of a (possibly infinite) ensemble of probabilistic dynamics predictors that model the different modes of the distribution over underlying latent MDPs. We evaluate our algorithm on high-dimensional continuous reinforcement learning problems and show that it outperforms state-of-the-art (model-free and model-based) RL algorithms, as well as state-of-the-art meta-learning methods specially designed to deal with non-stationarity.
Submitted 19 May, 2021;
originally announced May 2021.
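To give a feel for online change-point detection on a data stream, the sketch below uses a one-sided CUSUM detector. This is a textbook technique swapped in for illustration and is not the algorithm of the paper above; the function name and the parameters `target_mean`, `drift`, and `threshold` are assumptions chosen for this example.

```python
def cusum_detect(samples, target_mean, drift, threshold):
    """Return the index at which an upward mean shift is flagged, or None.

    One-sided CUSUM: accumulate deviations above `target_mean + drift`
    and raise an alarm once the running statistic exceeds `threshold`.
    """
    stat = 0.0
    for i, x in enumerate(samples):
        # The statistic resets to zero whenever the evidence turns negative.
        stat = max(0.0, stat + (x - target_mean - drift))
        if stat > threshold:
            return i
    return None
```

A larger `threshold` lowers the false-alarm rate at the cost of a longer detection delay, which is the trade-off the abstract refers to.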
-
Quantitatively Assessing the Benefits of Model-driven Development in Agent-based Modeling and Simulation
Authors:
Fernando Santos,
Ingrid Nunes,
Ana L. C. Bazzan
Abstract:
The agent-based modeling and simulation (ABMS) paradigm has been used to analyze, reproduce, and predict phenomena related to many application areas. Although there are many agent-based platforms that support simulation development, they rely on programming languages that require extensive programming knowledge. Model-driven development (MDD) has been explored to facilitate simulation modeling, by means of code generation and high-level modeling languages that provide reusable building blocks that hide computational complexity. However, there is still limited knowledge of how MDD approaches to ABMS contribute to increasing development productivity and quality. In this paper, we therefore present an empirical study that quantitatively compares the use of MDD and ABMS platforms, mainly in terms of effort and developer mistakes. Our evaluation was performed using MDD4ABMS (an MDD approach with a core and extensions to two application areas, one of which was developed for this study) and NetLogo, a widely used platform. The obtained results show that MDD4ABMS requires less effort to develop simulations with similar (sometimes better) design quality than NetLogo, giving evidence of the benefits that MDD can provide to ABMS.
Submitted 15 June, 2020;
originally announced June 2020.
-
Quantifying the Impact of Non-Stationarity in Reinforcement Learning-Based Traffic Signal Control
Authors:
Lucas N. Alegre,
Ana L. C. Bazzan,
Bruno C. da Silva
Abstract:
In reinforcement learning (RL), dealing with non-stationarity is a challenging issue. However, some domains such as traffic optimization are inherently non-stationary. Causes for and effects of this are manifold. In particular, when dealing with traffic signal controls, addressing non-stationarity is key since traffic conditions change over time and as a function of traffic control decisions taken in other parts of a network. In this paper we analyze the effects that different sources of non-stationarity have on a network of traffic signals, in which each signal is modeled as a learning agent. More precisely, we study both the effects of changing the context in which an agent learns (e.g., a change in flow rates experienced by it) and the effects of reducing agent observability of the true environment state. Partial observability may cause distinct states (in which distinct actions are optimal) to be seen as the same by the traffic signal agents. This, in turn, may lead to sub-optimal performance. We show that the lack of suitable sensors to provide a representative observation of the real state seems to affect the performance more drastically than the changes to the underlying traffic patterns.
Submitted 9 April, 2020;
originally announced April 2020.
-
I will be there for you: six friends in a clique
Authors:
Ana L. C. Bazzan
Abstract:
Network science has proved useful in analyzing the structure and dynamics of social networks in several areas. This paper aims at analyzing the relationships of characters in the sitcom Friends. In particular, two important aspects are investigated. First, how the communities are structured. Second, not only the static structure of the graphs and causality relationships are investigated, but also temporal aspects. Also, this sitcom is frequently associated with distinguishing facts such as: all six characters are equally prominent; it has no dominant storyline; and friendship serves as a surrogate family. This paper uses tools from network theory to check whether these and other assumptions can be quantified and proved correct. The main findings regarding the centrality and temporal aspects are: patterns in graphs representing different time slices of the show change; overall, degrees of the six friends are indeed nearly the same; however, in different situations (thus graphs), the magnitudes of degree centrality do change; betweenness centrality differs significantly for each character, thus some characters are better connectors than others; there is a large difference between the degrees of the six friends and those of the rest of the characters, which points to a centralized network; there are strong indications that the six friends are part of a surrogate family. As for the presence of groups within the network, methods of different natures were investigated, aiming at detecting groups (communities) in networks representing different time slices as well as in the network of all episodes. Such methods were compared (pairwise and also using various metrics, including plausibility). The multilevel method performs reasonably in general. Also, it stands out that those methods do not agree very much, resulting in groups that are very different from method to method.
Submitted 11 December, 2019; v1 submitted 12 April, 2018;
originally announced April 2018.
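The two centrality measures named in the abstract above can be computed as in the following minimal sketch. It assumes an unweighted, undirected graph given as a hypothetical adjacency dictionary `adj` (node to list of neighbours); betweenness uses Brandes' algorithm. The toy input format is an assumption for illustration and carries no data from the paper.

```python
from collections import deque

def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def betweenness_centrality(adj):
    """Unnormalized betweenness via Brandes' algorithm (undirected graph)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}
```

Nodes with similar degree can still differ widely in betweenness, which is the kind of contrast the abstract reports between the six main characters.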
-
Community Detection in the Network of German Princes in 1225: a Case Study
Authors:
Silvio R. Dahmen,
A. L. C. Bazzan,
R. Gramsch
Abstract:
Many social networks exhibit some underlying community structure. In particular, in the context of historical research, clustering of different groups into warring or friendly factions can lead to a better understanding of how conflicts may arise, and whether they could be avoided or not. In this work we study the crisis that started in 1225 when the Emperor of the Holy Roman Empire, Frederick II, and his son Henry VII entered into a conflict which almost led to the rupture and dissolution of the Empire. We use a spin-glass-based community detection algorithm to assess how well this method detects this rift and compare the results with an analysis performed by one of the authors (Gramsch) using standard social balance theory applied to History.
Submitted 5 January, 2017;
originally announced January 2017.
-
Temporal Network Analysis of Literary Texts
Authors:
Sandra D. Prado,
Silvio R. Dahmen,
Ana L. C. Bazzan,
Padraig Mac Carron,
Ralph Kenna
Abstract:
We study temporal networks of characters in literature focusing on "Alice's Adventures in Wonderland" (1865) by Lewis Carroll and the anonymous "La Chanson de Roland" (around 1100). The former, one of the most influential pieces of nonsense literature ever written, describes the adventures of Alice in a fantasy world with logic plays interspersed along the narrative. The latter, a song of heroic deeds, depicts the Battle of Roncevaux in 778 A.D. during Charlemagne's campaign on the Iberian Peninsula. We apply methods recently developed by Taylor and coworkers \cite{Taylor+2015} to find time-averaged eigenvector centralities, Freeman indices and vitalities of characters. We show that temporal networks are more appropriate than static ones for studying stories, as they capture features that the time-independent approaches fail to yield.
Submitted 22 February, 2016;
originally announced February 2016.