-
Effective Layer Pruning Through Similarity Metric Perspective
Authors:
Ian Pons,
Bruno Yamamoto,
Anna H. Reali Costa,
Artur Jordao
Abstract:
Deep neural networks have been the predominant paradigm in machine learning for solving cognitive tasks. Such models, however, are restricted by a high computational overhead, limiting their applicability and hindering advancements in the field. Extensive research demonstrated that pruning structures from these models is a straightforward approach to reducing network complexity. In this direction, most efforts focus on removing weights or filters. Studies have also been devoted to layer pruning as it promotes superior computational gains. However, layer pruning often hurts the network predictive ability (i.e., accuracy) at high compression rates. This work introduces an effective layer-pruning strategy that meets all underlying properties pursued by pruning methods. Our method estimates the relative importance of a layer using the Centered Kernel Alignment (CKA) metric, employed to measure the similarity between the representations of the unpruned model and a candidate layer for pruning. We confirm the effectiveness of our method on standard architectures and benchmarks, in which it outperforms existing layer-pruning strategies and other state-of-the-art pruning techniques. Particularly, we remove more than 75% of computation while improving predictive ability. At higher compression regimes, our method exhibits negligible accuracy drop, while other methods notably deteriorate model accuracy. Apart from these benefits, our pruned models exhibit robustness to adversarial and out-of-distribution samples.
Submitted 27 May, 2024;
originally announced May 2024.
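The similarity measurement this abstract describes can be illustrated with a minimal sketch of linear CKA between two representation matrices (rows are samples, columns are features). The abstract does not say which CKA variant or which representations the authors use, so the linear form, the pure-Python lists, and the helper names below are all assumptions made for illustration.

```python
import math


def center_columns(matrix):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - mu for v, mu in zip(row, means)] for row in matrix]


def cross_frobenius_sq(a, b):
    """Return ||a^T b||_F^2 for two matrices with the same number of rows."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(x * y for x, y in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data."""
    xc, yc = center_columns(x), center_columns(y)
    denominator = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    if denominator == 0.0:
        raise ValueError("CKA is undefined for constant representations")
    return cross_frobenius_sq(xc, yc) / denominator


# Hypothetical usage: compare the representations of an unpruned model with
# those obtained when a candidate layer is removed (values are made up).
full_model = [[1.0, 2.0], [3.0, 1.0], [0.0, 4.0], [2.0, 2.0]]
without_layer = [[1.1, 1.9], [2.8, 1.2], [0.1, 3.7], [2.0, 2.1]]
print(linear_cka(full_model, without_layer))
```

A score close to 1 means the two representations are nearly identical up to rotation and isotropic scaling, which is the intuition behind treating such a layer as a low-cost candidate for removal.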
-
From Random to Informed Data Selection: A Diversity-Based Approach to Optimize Human Annotation and Few-Shot Learning
Authors:
Alexandre Alcoforado,
Thomas Palmeira Ferraz,
Lucas Hideki Okamura,
Israel Campos Fama,
Arnold Moya Lavado,
Bárbara Dias Bueno,
Bruno Veloso,
Anna Helena Reali Costa
Abstract:
A major challenge in Natural Language Processing is obtaining annotated data for supervised learning. An option is the use of crowdsourcing platforms for data annotation. However, crowdsourcing introduces issues related to the annotator's experience, consistency, and biases. An alternative is to use zero-shot methods, which in turn have limitations compared to their few-shot or fully supervised counterparts. Recent advancements driven by large language models show potential, but struggle to adapt to specialized domains with severely limited data. The most common approaches therefore involve humans randomly annotating a set of datapoints to build initial datasets. But randomly sampling data to be annotated is often inefficient as it ignores the characteristics of the data and the specific needs of the model. The situation worsens when working with imbalanced datasets, as random sampling tends to heavily bias towards the majority classes, leading to excessive annotated data. To address these issues, this paper contributes an automatic and informed data selection architecture to build a small dataset for few-shot learning. Our proposal minimizes the quantity and maximizes diversity of data selected for human annotation, while improving model performance.
Submitted 23 January, 2024;
originally announced January 2024.
-
Benchmarks for Pirá 2.0, a Reading Comprehension Dataset about the Ocean, the Brazilian Coast, and Climate Change
Authors:
Paulo Pirozelli,
Marcos M. José,
Igor Silveira,
Flávio Nakasato,
Sarajane M. Peres,
Anarosa A. F. Brandão,
Anna H. R. Costa,
Fabio G. Cozman
Abstract:
Pirá is a reading comprehension dataset focused on the ocean, the Brazilian coast, and climate change, built from a collection of scientific abstracts and reports on these topics. This dataset represents a versatile language resource, particularly useful for testing the ability of current machine learning models to acquire expert scientific knowledge. Despite its potential, a detailed set of baselines has not yet been developed for Pirá. By creating these baselines, researchers can more easily utilize Pirá as a resource for testing machine learning models across a wide range of question answering tasks. In this paper, we define six benchmarks over the Pirá dataset, covering closed generative question answering, machine reading comprehension, information retrieval, open question answering, answer triggering, and multiple choice question answering. As part of this effort, we have also produced a curated version of the original dataset, where we fixed a number of grammar issues, repetitions, and other shortcomings. Furthermore, the dataset has been extended in several new directions, so as to face the aforementioned benchmarks: translation of supporting texts from English into Portuguese, classification labels for answerability, automatic paraphrases of questions and answers, and multiple choice candidates. The results described in this paper provide several points of reference for researchers interested in exploring the challenges provided by the Pirá dataset.
Submitted 19 September, 2023;
originally announced September 2023.
-
Augmenting a Physics-Informed Neural Network for the 2D Burgers Equation by Addition of Solution Data Points
Authors:
Marlon Sproesser Mathias,
Wesley Pereira de Almeida,
Marcel Rodrigues de Barros,
Jefferson Fialho Coelho,
Lucas Palmiro de Freitas,
Felipe Marino Moreno,
Caio Fabricio Deberaldini Netto,
Fabio Gagliardi Cozman,
Anna Helena Reali Costa,
Eduardo Aoun Tannuri,
Edson Satoshi Gomi,
Marcelo Dottori
Abstract:
We implement a Physics-Informed Neural Network (PINN) for solving the two-dimensional Burgers equations. This type of model can be trained with no previous knowledge of the solution; instead, it relies on evaluating the governing equations of the system in points of the physical domain. It is also possible to use points with a known solution during training. In this paper, we compare PINNs trained with different amounts of governing equation evaluation points and known solution points. Comparing models that were trained purely with known solution points to those that have also used the governing equations, we observe an improvement in the overall observance of the underlying physics in the latter. We also investigate how changing the number of each type of point affects the resulting models differently. Finally, we argue that the addition of the governing equations during training may provide a way to improve the overall performance of the model without relying on additional data, which is especially important for situations where the number of known solution points is limited.
Submitted 18 January, 2023;
originally announced January 2023.
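For reference, the two ingredients compared in this abstract (governing-equation evaluation points and known-solution points) can be written out as a worked formula. The equations below are the standard viscous 2D Burgers system for velocity components $u, v$ with viscosity $\nu$. The combined loss is a typical PINN formulation and is an assumption here, not the authors' confirmed objective; the symbols $N_f$, $N_d$, $\lambda$, $r_u$, $r_v$ and the hats denoting network outputs are introduced only for illustration.

$$
\begin{aligned}
r_u &= \hat{u}_t + \hat{u}\,\hat{u}_x + \hat{v}\,\hat{u}_y - \nu\,(\hat{u}_{xx} + \hat{u}_{yy}), \\
r_v &= \hat{v}_t + \hat{u}\,\hat{v}_x + \hat{v}\,\hat{v}_y - \nu\,(\hat{v}_{xx} + \hat{v}_{yy}), \\
\mathcal{L} &= \frac{1}{N_f}\sum_{i=1}^{N_f}\left(r_u^2 + r_v^2\right)\Big|_{(x_i,\,y_i,\,t_i)}
 + \frac{\lambda}{N_d}\sum_{j=1}^{N_d}\left[(\hat{u}_j - u_j)^2 + (\hat{v}_j - v_j)^2\right].
\end{aligned}
$$

The first sum is evaluated at $N_f$ governing-equation points, where no solution values are needed; the second sum uses $N_d$ points with a known solution $(u_j, v_j)$. Setting $N_f = 0$ corresponds to a purely data-driven model, and $N_d = 0$ to training without any solution data.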
-
A Physics-Informed Neural Network to Model Port Channels
Authors:
Marlon S. Mathias,
Marcel R. de Barros,
Jefferson F. Coelho,
Lucas P. de Freitas,
Felipe M. Moreno,
Caio F. D. Netto,
Fabio G. Cozman,
Anna H. R. Costa,
Eduardo A. Tannuri,
Edson S. Gomi,
Marcelo Dottori
Abstract:
We describe a Physics-Informed Neural Network (PINN) that simulates the flow induced by the astronomical tide in a synthetic port channel, with dimensions based on the Santos - São Vicente - Bertioga Estuarine System. PINN models aim to combine the knowledge of physical systems and data-driven machine learning models. This is done by training a neural network to minimize the residuals of the governing equations in sample points. In this work, our flow is governed by the Navier-Stokes equations with some approximations. There are two main novelties in this paper. First, we design our model to assume that the flow is periodic in time, which is not feasible in conventional simulation methods. Second, we evaluate the benefit of resampling the function evaluation points during training, which has a near zero computational cost and has been verified to improve the final model, especially for small batch sizes. Finally, we discuss some limitations of the approximations used in the Navier-Stokes equations regarding the modeling of turbulence and how it interacts with PINNs.
Submitted 20 December, 2022;
originally announced December 2022.
-
Reinforcement Learning Applied to Trading Systems: A Survey
Authors:
Leonardo Kanashiro Felizardo,
Francisco Caio Lima Paiva,
Anna Helena Reali Costa,
Emilio Del-Moral-Hernandez
Abstract:
Financial domain tasks, such as trading in market exchanges, are challenging and have long attracted researchers. The recent achievements and the consequent notoriety of Reinforcement Learning (RL) have also increased its adoption in trading tasks. RL uses a framework with well-established formal concepts, which raises its attractiveness in learning profitable trading strategies. However, RL use without due attention in the financial area can keep new researchers from following standards or adopting relevant conceptual guidelines. In this work, we embrace the seminal RL technical fundamentals, concepts, and recommendations to perform a unified, theoretically-grounded examination and comparison of previous research that could serve as a structuring guide for the field of study. A selection of twenty-nine articles was reviewed under our classification that considers RL's most common formulations and design patterns from a large volume of available studies. This classification allowed for precise inspection of the most relevant aspects regarding data input, preprocessing, state and action composition, adopted RL techniques, evaluation setups, and overall results. Our analysis approach organized around fundamental RL concepts allowed for a clear identification of current system design best practices, gaps that require further investigation, and promising research opportunities. Finally, this review attempts to promote the development of this field of study by facilitating researchers' commitment to standards adherence and helping them to avoid straying away from the RL constructs' firm ground.
Submitted 1 November, 2022;
originally announced December 2022.
-
The BLue Amazon Brain (BLAB): A Modular Architecture of Services about the Brazilian Maritime Territory
Authors:
Paulo Pirozelli,
Ais B. R. Castro,
Ana Luiza C. de Oliveira,
André S. Oliveira,
Flávio N. Cação,
Igor C. Silveira,
João G. M. Campos,
Laura C. Motheo,
Leticia F. Figueiredo,
Lucas F. A. O. Pellicer,
Marcelo A. José,
Marcos M. José,
Pedro de M. Ligabue,
Ricardo S. Grava,
Rodrigo M. Tavares,
Vinícius B. Matos,
Yan V. Sym,
Anna H. R. Costa,
Anarosa A. F. Brandão,
Denis D. Mauá,
Fabio G. Cozman,
Sarajane M. Peres
Abstract:
We describe the first steps in the development of an artificial agent focused on the Brazilian maritime territory, a large region within the South Atlantic also known as the Blue Amazon. The "BLue Amazon Brain" (BLAB) integrates a number of services aimed at disseminating information about this region and its importance, functioning as a tool for environmental awareness. The main service provided by BLAB is a conversational facility that deals with complex questions about the Blue Amazon, called BLAB-Chat; its central component is a controller that manages several task-oriented natural language processing modules (e.g., question answering and summarizer systems). These modules have access to an internal data lake as well as to third-party databases. A news reporter (BLAB-Reporter) and a purposely-developed wiki (BLAB-Wiki) are also part of the BLAB service architecture. In this paper, we describe our current version of BLAB's architecture (interface, backend, web services, NLP modules, and resources) and comment on the challenges we have faced so far, such as the lack of training data and the scattered state of domain information. Solving these issues presents a considerable challenge in the development of artificial intelligence for technical domains.
Submitted 6 September, 2022;
originally announced September 2022.
-
Enhancing Oceanic Variables Forecast in the Santos Channel by Estimating Model Error with Random Forests
Authors:
Felipe M. Moreno,
Caio F. D. Netto,
Marcel R. de Barros,
Jefferson F. Coelho,
Lucas P. de Freitas,
Marlon S. Mathias,
Luiz A. Schiaveto Neto,
Marcelo Dottori,
Fabio G. Cozman,
Anna H. R. Costa,
Edson S. Gomi,
Eduardo A. Tannuri
Abstract:
In this work we improve forecasting of Sea Surface Height (SSH) and current velocity (speed and direction) in oceanic scenarios. We do so by resorting to Random Forests so as to predict the error of a numerical forecasting system developed for the Santos Channel in Brazil. We have used the Santos Operational Forecasting System (SOFS) and data collected in situ between the years of 2019 and 2021. In previous studies we applied similar methods to current velocity at the channel entrance; in this work we expand the application to improve the SSH forecast and include four other stations in the channel. We have obtained an average reduction of 11.9% in forecasting Root-Mean-Square Error (RMSE) and 38.7% in bias with our approach. We also obtained an increase in the Index of Agreement (IOA) in 10 of the 14 combinations of forecasted variables and stations.
Submitted 22 July, 2022;
originally announced August 2022.
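The three evaluation quantities named in this abstract can be sketched in a few lines. This is an illustrative sketch only: it assumes that IOA refers to Willmott's index of agreement and that bias means the mean of forecast minus observation. The function names and sample values are hypothetical, and the abstract does not confirm the exact definitions the authors used.

```python
import math


def rmse(pred, obs):
    """Root-mean-square error between forecasts and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def bias(pred, obs):
    """Mean signed error (forecast minus observation)."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def index_of_agreement(pred, obs):
    """Willmott's index of agreement (assumed definition of IOA); 1 is a perfect match."""
    obs_mean = sum(obs) / len(obs)
    squared_error = sum((p - o) ** 2 for p, o in zip(pred, obs))
    potential_error = sum(
        (abs(p - obs_mean) + abs(o - obs_mean)) ** 2 for p, o in zip(pred, obs)
    )
    return 1.0 - squared_error / potential_error


# Hypothetical sea-surface-height values in metres, for illustration only.
observed = [0.0, 1.0, 2.0]
raw_forecast = [1.0, 2.0, 3.0]
predicted_error = [0.9, 1.1, 1.0]  # e.g. the output of an error model
corrected = [f - e for f, e in zip(raw_forecast, predicted_error)]

for name, series in (("raw", raw_forecast), ("corrected", corrected)):
    print(name, rmse(series, observed), bias(series, observed),
          index_of_agreement(series, observed))
```

Subtracting a predicted error from the raw forecast, as in the usage lines, mirrors the idea of estimating the numerical model's error and correcting for it; the error model itself is outside the scope of this sketch.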
-
Modeling Oceanic Variables with Dynamic Graph Neural Networks
Authors:
Caio F. D. Netto,
Marcel R. de Barros,
Jefferson F. Coelho,
Lucas P. de Freitas,
Felipe M. Moreno,
Marlon S. Mathias,
Marcelo Dottori,
Fábio G. Cozman,
Anna H. R. Costa,
Edson S. Gomi,
Eduardo A. Tannuri
Abstract:
Researchers typically resort to numerical methods to understand and predict ocean dynamics, a key task in mastering environmental phenomena. Such methods may not be suitable in scenarios where the topographic map is complex, knowledge about the underlying processes is incomplete, or the application is time critical. On the other hand, if ocean dynamics are observed, they can be exploited by recent machine learning methods. In this paper we describe a data-driven method to predict environmental variables such as current velocity and sea surface height in the region of the Santos-São Vicente-Bertioga Estuarine System on the southeastern coast of Brazil. Our model exploits both temporal and spatial inductive biases by joining state-of-the-art sequence models (LSTM and Transformers) and relational models (Graph Neural Networks) in an end-to-end framework that learns both the temporal features and the spatial relationship shared among observation sites. We compare our results with the Santos Operational Forecasting System (SOFS). Experiments show that better results are attained by our model, while maintaining flexibility and little domain knowledge dependency.
Submitted 25 June, 2022;
originally announced June 2022.
-
Tracking environmental policy changes in the Brazilian Federal Official Gazette
Authors:
Flávio Nakasato Cação,
Anna Helena Reali Costa,
Natalie Unterstell,
Liuca Yonaha,
Taciana Stec,
Fábio Ishisaki
Abstract:
Even though most of its energy generation comes from renewable sources, Brazil is one of the largest emitters of greenhouse gases in the world, due to intense farming and deforestation of biomes such as the Amazon Rainforest, whose preservation is essential for compliance with the Paris Agreement. Still, regardless of lobbies or prevailing political orientation, all government legal actions are published daily in the Brazilian Federal Official Gazette (BFOG, or "Diário Oficial da União" in Portuguese). However, with hundreds of decrees issued every day by the authorities, it is absolutely burdensome to manually analyze all these processes and find out which ones can pose serious environmental hazards. In this paper, we present a strategy to compose automated techniques and domain expert knowledge to process all the data from the BFOG. We also provide the Government Actions Tracker, a highly curated dataset, in Portuguese, annotated by domain experts, on federal government acts about the Brazilian environmental policies. Finally, we build and compare four different NLP models on the classification task in this dataset. Our best model achieved an F1-score of $0.714 \pm 0.031$. In the future, this system should serve to scale up the high-quality tracking of all official documents with a minimum of human supervision and contribute to increasing society's awareness of government actions.
Submitted 11 February, 2022;
originally announced February 2022.
-
Pirá: A Bilingual Portuguese-English Dataset for Question-Answering about the Ocean
Authors:
André F. A. Paschoal,
Paulo Pirozelli,
Valdinei Freire,
Karina V. Delgado,
Sarajane M. Peres,
Marcos M. José,
Flávio Nakasato,
André S. Oliveira,
Anarosa A. F. Brandão,
Anna H. R. Costa,
Fabio G. Cozman
Abstract:
Current research in natural language processing is highly dependent on carefully produced corpora. Most existing resources focus on English; some resources focus on languages such as Chinese and French; few resources deal with more than one language. This paper presents the Pirá dataset, a large set of questions and answers about the ocean and the Brazilian coast both in Portuguese and English. Pirá is, to the best of our knowledge, the first QA dataset with supporting texts in Portuguese, and, perhaps more importantly, the first bilingual QA dataset that includes this language. The Pirá dataset consists of 2261 properly curated question/answer (QA) sets in both languages. The QA sets were manually created based on two corpora: abstracts related to the Brazilian coast and excerpts of United Nations reports about the ocean. The QA sets were validated in a peer-review process with the dataset contributors. We discuss some of the advantages as well as limitations of Pirá, as this new resource can support a set of tasks in NLP such as question-answering, information retrieval, and machine translation.
Submitted 4 February, 2022;
originally announced February 2022.
-
ZeroBERTo: Leveraging Zero-Shot Text Classification by Topic Modeling
Authors:
Alexandre Alcoforado,
Thomas Palmeira Ferraz,
Rodrigo Gerber,
Enzo Bustos,
André Seidel Oliveira,
Bruno Miguel Veloso,
Fabio Levy Siqueira,
Anna Helena Reali Costa
Abstract:
Traditional text classification approaches often require a good amount of labeled data, which is difficult to obtain, especially in restricted domains or less widespread languages. This lack of labeled data has led to the rise of low-resource methods that assume low data availability in natural language processing. Among them, zero-shot learning stands out, which consists of learning a classifier without any previously labeled data. The best results reported with this approach use language models such as Transformers, but suffer from two problems: high execution time and inability to handle long texts as input. This paper proposes a new model, ZeroBERTo, which leverages an unsupervised clustering step to obtain a compressed data representation before the classification task. We show that ZeroBERTo has better performance for long inputs and shorter execution time, outperforming XLM-R by about 12% in the F1 score in the FolhaUOL dataset. Keywords: Low-Resource NLP, Unlabeled data, Zero-Shot Learning, Topic Modeling, Transformers.
Submitted 4 June, 2022; v1 submitted 4 January, 2022;
originally announced January 2022.
-
DEBACER: a method for slicing moderated debates
Authors:
Thomas Palmeira Ferraz,
Alexandre Alcoforado,
Enzo Bustos,
André Seidel Oliveira,
Rodrigo Gerber,
Naíde Müller,
André Corrêa d'Almeida,
Bruno Miguel Veloso,
Anna Helena Reali Costa
Abstract:
Subjects change frequently in moderated debates with several participants, such as in parliamentary sessions, electoral debates, and trials. Partitioning a debate into blocks with the same subject is essential for understanding. Often a moderator is responsible for defining when a new block begins so that the task of automatically partitioning a moderated debate can focus solely on the moderator's behavior. In this paper, we (i) propose a new algorithm, DEBACER, which partitions moderated debates; (ii) carry out a comparative study between conventional and BERTimbau pipelines; and (iii) validate DEBACER by applying it to the minutes of the Assembly of the Republic of Portugal. Our results show the effectiveness of DEBACER. Keywords: Natural Language Processing, Political Documents, Spoken Text Processing, Speech Split, Dialogue Partitioning.
Submitted 10 December, 2021;
originally announced December 2021.
-
Intelligent Trading Systems: A Sentiment-Aware Reinforcement Learning Approach
Authors:
Francisco Caio Lima Paiva,
Leonardo Kanashiro Felizardo,
Reinaldo Augusto da Costa Bianchi,
Anna Helena Reali Costa
Abstract:
The feasibility of making profitable trades on a single asset on stock exchanges based on pattern identification has long attracted researchers. Reinforcement Learning (RL) and Natural Language Processing have gained notoriety in these single-asset trading tasks, but only a few works have explored their combination. Moreover, some issues are still not addressed, such as extracting market sentiment momentum through the explicit capture of sentiment features that reflect the market condition over time and assessing the consistency and stability of RL results in different situations. Filling this gap, we propose the Sentiment-Aware RL (SentARL) intelligent trading system that improves profit stability by leveraging market mood through an adaptive amount of past sentiment features drawn from textual news. We evaluated SentARL across twenty assets, two transaction costs, and five different periods and initializations to show its consistent effectiveness against baselines. Subsequently, this thorough assessment allowed us to identify the boundary between news coverage and market sentiment regarding the correlation of price-time series above which SentARL's effectiveness is outstanding.
Submitted 14 November, 2021;
originally announced December 2021.
-
PLSUM: Generating PT-BR Wikipedia by Summarizing Multiple Websites
Authors:
André Seidel Oliveira,
Anna Helena Reali Costa
Abstract:
Wikipedia is an important free source of intelligible knowledge. Despite that, Brazilian Portuguese Wikipedia still lacks descriptions for many subjects. In an effort to expand the Brazilian Wikipedia, we contribute PLSum, a framework for generating wiki-like abstractive summaries from multiple descriptive websites. The framework has an extractive stage followed by an abstractive one. In particular, for the abstractive stage, we fine-tune and compare two recent variations of the Transformer neural network, PTT5 and Longformer. To fine-tune and evaluate the model, we created a dataset with thousands of examples, linking reference websites to Wikipedia. Our results show that it is possible to generate meaningful abstractive summaries from Brazilian Portuguese web content.
Submitted 2 December, 2021;
originally announced December 2021.
-
DEEPAGÉ: Answering Questions in Portuguese about the Brazilian Environment
Authors:
Flávio Nakasato Cação,
Marcos Menon José,
André Seidel Oliveira,
Stefano Spindola,
Anna Helena Reali Costa,
Fábio Gagliardi Cozman
Abstract:
The challenge of climate change and biome conservation is one of the most pressing issues of our time - particularly in Brazil, where key environmental reserves are located. Given the availability of large textual databases on ecological themes, it is natural to resort to question answering (QA) systems to increase social awareness and understanding about these topics. In this work, we introduce multiple QA systems that combine in novel ways the BM25 algorithm, a sparse retrieval technique, with PTT5, a pre-trained state-of-the-art language model. Our QA systems focus on the Portuguese language, thus offering resources not found elsewhere in the literature. As training data, we collected questions from open-domain datasets, as well as content from the Portuguese Wikipedia and news from the press. We thus contribute with innovative architectures and novel applications, attaining an F1-score of 36.2 with our best model.
Submitted 19 October, 2021;
originally announced October 2021.
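The sparse retrieval step mentioned in this abstract can be illustrated with a minimal BM25 scorer. This is a sketch under stated assumptions: it uses whitespace tokenization, the common defaults `k1=1.5` and `b=0.75`, and the smoothed IDF variant popularized by Lucene. None of these choices, nor the function name `bm25_scores`, come from the paper, and the sample passages are made up.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document in a non-empty collection against the query with BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            tf = term_freq[term]
            length_norm = k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1.0) / (tf + length_norm)
        scores.append(score)
    return scores


# Hypothetical passages; a retriever would hand the top-ranked ones to a reader model.
passages = [
    "the amazon rainforest is a large biome",
    "the ocean covers most of the brazilian coast",
    "climate change affects every biome",
]
scores = bm25_scores("ocean coast", passages)
best = max(range(len(passages)), key=scores.__getitem__)
print(best, scores)
```

In a retrieve-then-read pipeline of the kind the abstract outlines, the highest-scoring passages would be passed as context to the language model that generates the answer.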