-
A Chatbot for Asylum-Seeking Migrants in Europe
Authors:
Bettina Fazzinga,
Elena Palmieri,
Margherita Vestoso,
Luca Bolognini,
Andrea Galassi,
Filippo Furfaro,
Paolo Torroni
Abstract:
We present ACME: A Chatbot for asylum-seeking Migrants in Europe. ACME relies on computational argumentation and aims to help migrants identify the highest level of protection they can apply for. This would contribute to a more sustainable migration by reducing the load on territorial commissions, Courts, and humanitarian organizations supporting asylum applicants. We describe the context, system architectures, technologies, and the case study used to run the demonstration.
Submitted 12 July, 2024;
originally announced July 2024.
-
Promoting Fairness and Diversity in Speech Datasets for Mental Health and Neurological Disorders Research
Authors:
Eleonora Mancini,
Ana Tanevska,
Andrea Galassi,
Alessio Galatolo,
Federico Ruggeri,
Paolo Torroni
Abstract:
Current research in machine learning and artificial intelligence is largely centered on modeling and performance evaluation, less so on data collection. However, recent research demonstrated that limitations and biases in data may negatively impact trustworthiness and reliability. These aspects are particularly impactful on sensitive domains such as mental health and neurological disorders, where speech data are used to develop AI applications aimed at improving the health of patients and supporting healthcare providers. In this paper, we chart the landscape of available speech datasets for this domain, to highlight possible pitfalls and opportunities for improvement and promote fairness and diversity. We present a comprehensive list of desiderata for building speech datasets for mental health and neurological disorders and distill it into a checklist focused on ethical concerns to foster more responsible research.
Submitted 6 June, 2024;
originally announced June 2024.
-
ezBIDS: Guided standardization of neuroimaging data interoperable with major data archives and platforms
Authors:
Daniel Levitas,
Soichi Hayashi,
Sophia Vinci-Booher,
Anibal Heinsfeld,
Dheeraj Bhatia,
Nicholas Lee,
Anthony Galassi,
Guiomar Niso,
Franco Pestilli
Abstract:
Data standardization has become one of the leading methods neuroimaging researchers rely on for data sharing and reproducibility. Data standardization promotes a common framework through which researchers can utilize others' data. Yet, as of today, formatting datasets that adhere to community best practices requires technical expertise involving coding and considerable knowledge of file formats and standards. We describe ezBIDS, a tool for converting neuroimaging data and associated metadata to the Brain Imaging Data Structure (BIDS) standard. ezBIDS provides four unique features: (1) No installation or programming requirements. (2) Handling of both imaging and task events data and metadata. (3) Automated inference and guidance for adherence to BIDS. (4) Multiple data management options: download BIDS data to local system, or transfer to OpenNeuro.org or brainlife.io. In sum, ezBIDS requires neither coding proficiency nor knowledge of BIDS and is the first BIDS tool to offer guided standardization, support for task events conversion, and interoperability with OpenNeuro and brainlife.io.
Submitted 1 November, 2023;
originally announced November 2023.
-
The Past, Present, and Future of the Brain Imaging Data Structure (BIDS)
Authors:
Russell A. Poldrack,
Christopher J. Markiewicz,
Stefan Appelhoff,
Yoni K. Ashar,
Tibor Auer,
Sylvain Baillet,
Shashank Bansal,
Leandro Beltrachini,
Christian G. Benar,
Giacomo Bertazzoli,
Suyash Bhogawar,
Ross W. Blair,
Marta Bortoletto,
Mathieu Boudreau,
Teon L. Brooks,
Vince D. Calhoun,
Filippo Maria Castelli,
Patricia Clement,
Alexander L Cohen,
Julien Cohen-Adad,
Sasha D'Ambrosio,
Gilles de Hollander,
María de la iglesia-Vayá,
Alejandro de la Vega,
Arnaud Delorme
, et al. (89 additional authors not shown)
Abstract:
The Brain Imaging Data Structure (BIDS) is a community-driven standard for the organization of data and metadata from a growing range of neuroscience modalities. This paper is meant as a history of how the standard has developed and grown over time. We outline the principles behind the project, the mechanisms by which it has been extended, and some of the challenges being addressed as it evolves. We also discuss the lessons learned through the project, with the aim of enabling researchers in other domains to learn from the success of BIDS.
Submitted 8 January, 2024; v1 submitted 11 September, 2023;
originally announced September 2023.
-
A Corpus for Sentence-level Subjectivity Detection on English News Articles
Authors:
Francesco Antici,
Andrea Galassi,
Federico Ruggeri,
Katerina Korre,
Arianna Muti,
Alessandra Bardi,
Alice Fedotova,
Alberto Barrón-Cedeño
Abstract:
We develop novel annotation guidelines for sentence-level subjectivity detection, which are not limited to language-specific cues. We use our guidelines to collect NewsSD-ENG, a corpus of 638 objective and 411 subjective sentences extracted from English news articles on controversial topics. Our corpus paves the way for subjectivity detection in English and across other languages without relying on language-specific tools, such as lexicons or machine translation. We evaluate state-of-the-art multilingual transformer-based models on the task in mono-, multi-, and cross-language settings. For this purpose, we re-annotate an existing Italian corpus. We observe that models trained in the multilingual setting achieve the best performance on the task.
Submitted 24 May, 2024; v1 submitted 29 May, 2023;
originally announced May 2023.
-
LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain
Authors:
Joel Niklaus,
Veton Matoshi,
Pooja Rani,
Andrea Galassi,
Matthias Stürmer,
Ilias Chalkidis
Abstract:
Lately, propelled by the phenomenal advances around the transformer architecture, the legal NLP field has enjoyed spectacular growth. To measure progress, well curated and challenging benchmarks are crucial. However, most benchmarks are English only and in legal NLP specifically there is no multilingual benchmark available yet. Additionally, many benchmarks are saturated, with the best models clearly outperforming the best humans and achieving near perfect scores. We survey the legal NLP literature and select 11 datasets covering 24 languages, creating LEXTREME. To provide a fair comparison, we propose two aggregate scores, one based on the datasets and one on the languages. The best baseline (XLM-R large) achieves both a dataset aggregate score and a language aggregate score of 61.3. This indicates that LEXTREME is still very challenging and leaves ample room for improvement. To make it easy for researchers and practitioners to use, we release LEXTREME on huggingface together with all the code required to evaluate models and a public Weights and Biases project with all the runs.
Submitted 8 January, 2024; v1 submitted 30 January, 2023;
originally announced January 2023.
-
An Argumentative Dialogue System for COVID-19 Vaccine Information
Authors:
Bettina Fazzinga,
Andrea Galassi,
Paolo Torroni
Abstract:
Dialogue systems are widely used in AI to support timely and interactive communication with users. We propose a general-purpose dialogue system architecture that leverages computational argumentation to perform reasoning and provide consistent and explainable answers. We illustrate the system using a COVID-19 vaccine information case study.
Submitted 15 October, 2021; v1 submitted 26 July, 2021;
originally announced July 2021.
-
Multi-Task Attentive Residual Networks for Argument Mining
Authors:
Andrea Galassi,
Marco Lippi,
Paolo Torroni
Abstract:
We explore the use of residual networks and neural attention for multiple argument mining tasks. We propose a residual architecture that exploits attention, multi-task learning, and makes use of ensemble, without any assumption on document or argument structure. We present an extensive experimental evaluation on five different corpora of user-generated comments, scientific publications, and persuasive essays. Our results show that our approach is a strong competitor against state-of-the-art architectures with a higher computational footprint or corpus-specific design, representing an interesting compromise between generality, performance accuracy and reduced model size.
Submitted 25 May, 2023; v1 submitted 24 February, 2021;
originally announced February 2021.
-
An Upper Bound on the Complexity of Tablut
Authors:
Andrea Galassi
Abstract:
Tablut is a complete-knowledge, deterministic, and asymmetric board game, which has not been solved nor properly studied yet. In this work, its rules and characteristics are presented, then a study on its complexity is reported. An upper bound to its complexity is found eventually by dividing the state-space of the game into subspaces according to specific conditions. This upper bound is comparable to the one found for Draughts, therefore, it would seem that the open challenge of solving this game requires a considerable computational effort.
Submitted 28 January, 2021;
originally announced January 2021.
-
Neural-Symbolic Argumentation Mining: an Argument in Favor of Deep Learning and Reasoning
Authors:
Andrea Galassi,
Kristian Kersting,
Marco Lippi,
Xiaoting Shao,
Paolo Torroni
Abstract:
Deep learning is bringing remarkable contributions to the field of argumentation mining, but the existing approaches still need to fill the gap toward performing advanced reasoning tasks. In this position paper, we posit that neural-symbolic and statistical relational learning could play a crucial role in the integration of symbolic and sub-symbolic methods to achieve this goal.
Submitted 28 January, 2020; v1 submitted 22 May, 2019;
originally announced May 2019.
-
Attention in Natural Language Processing
Authors:
Andrea Galassi,
Marco Lippi,
Paolo Torroni
Abstract:
Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of the textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this exciting domain.
Submitted 11 October, 2021; v1 submitted 4 February, 2019;
originally announced February 2019.