A Chatbot for Asylum-Seeking Migrants in Europe

Authors: Bettina Fazzinga, Elena Palmieri, Margherita Vestoso, Luca Bolognini, Andrea Galassi, Filippo Furfaro, Paolo Torroni

Abstract: We present ACME: A Chatbot for asylum-seeking Migrants in Europe. ACME relies on computational argumentation and aims to help migrants identify the highest level of protection they can apply for. This would contribute to a more sustainable migration by reducing the load on territorial commissions, Courts, and humanitarian organizations supporting asylum applicants. We describe the context, system architectures, technologies, and the case study used to run the demonstration.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2406.04116 [pdf, ps, other]

Promoting Fairness and Diversity in Speech Datasets for Mental Health and Neurological Disorders Research

Authors: Eleonora Mancini, Ana Tanevska, Andrea Galassi, Alessio Galatolo, Federico Ruggeri, Paolo Torroni

Abstract: Current research in machine learning and artificial intelligence is largely centered on modeling and performance evaluation, less so on data collection. However, recent research demonstrated that limitations and biases in data may negatively impact trustworthiness and reliability. These aspects are particularly impactful on sensitive domains such as mental health and neurological disorders, where speech data are used to develop AI applications aimed at improving the health of patients and supporting healthcare providers. In this paper, we chart the landscape of available speech datasets for this domain, to highlight possible pitfalls and opportunities for improvement and promote fairness and diversity. We present a comprehensive list of desiderata for building speech datasets for mental health and neurological disorders and distill it into a checklist focused on ethical concerns to foster more responsible research.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 34 pages

arXiv:2311.04912 [pdf]

doi 10.1038/s41597-024-02959-0

ezBIDS: Guided standardization of neuroimaging data interoperable with major data archives and platforms

Authors: Daniel Levitas, Soichi Hayashi, Sophia Vinci-Booher, Anibal Heinsfeld, Dheeraj Bhatia, Nicholas Lee, Anthony Galassi, Guiomar Niso, Franco Pestilli

Abstract: Data standardization has become one of the leading methods neuroimaging researchers rely on for data sharing and reproducibility. Data standardization promotes a common framework through which researchers can utilize others' data. Yet, as of today, formatting datasets that adhere to community best practices requires technical expertise involving coding and considerable knowledge of file formats and standards. We describe ezBIDS, a tool for converting neuroimaging data and associated metadata to the Brain Imaging Data Structure (BIDS) standard. ezBIDS provides four unique features: (1) No installation or programming requirements. (2) Handling of both imaging and task events data and metadata. (3) Automated inference and guidance for adherence to BIDS. (4) Multiple data management options: download BIDS data to local system, or transfer to OpenNeuro.org or brainlife.io. In sum, ezBIDS requires neither coding proficiency nor knowledge of BIDS and is the first BIDS tool to offer guided standardization, support for task events conversion, and interoperability with OpenNeuro and brainlife.io.

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2309.05768 [pdf]

The Past, Present, and Future of the Brain Imaging Data Structure (BIDS)

Authors: Russell A. Poldrack, Christopher J. Markiewicz, Stefan Appelhoff, Yoni K. Ashar, Tibor Auer, Sylvain Baillet, Shashank Bansal, Leandro Beltrachini, Christian G. Benar, Giacomo Bertazzoli, Suyash Bhogawar, Ross W. Blair, Marta Bortoletto, Mathieu Boudreau, Teon L. Brooks, Vince D. Calhoun, Filippo Maria Castelli, Patricia Clement, Alexander L Cohen, Julien Cohen-Adad, Sasha D'Ambrosio, Gilles de Hollander, María de la iglesia-Vayá, Alejandro de la Vega, Arnaud Delorme, et al. (89 additional authors not shown)

Abstract: The Brain Imaging Data Structure (BIDS) is a community-driven standard for the organization of data and metadata from a growing range of neuroscience modalities. This paper is meant as a history of how the standard has developed and grown over time. We outline the principles behind the project, the mechanisms by which it has been extended, and some of the challenges being addressed as it evolves. We also discuss the lessons learned through the project, with the aim of enabling researchers in other domains to learn from the success of BIDS.

Submitted 8 January, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

arXiv:2305.18034 [pdf]

A Corpus for Sentence-level Subjectivity Detection on English News Articles

Authors: Francesco Antici, Andrea Galassi, Federico Ruggeri, Katerina Korre, Arianna Muti, Alessandra Bardi, Alice Fedotova, Alberto Barrón-Cedeño

Abstract: We develop novel annotation guidelines for sentence-level subjectivity detection, which are not limited to language-specific cues. We use our guidelines to collect NewsSD-ENG, a corpus of 638 objective and 411 subjective sentences extracted from English news articles on controversial topics. Our corpus paves the way for subjectivity detection in English and across other languages without relying on language-specific tools, such as lexicons or machine translation. We evaluate state-of-the-art multilingual transformer-based models on the task in mono-, multi-, and cross-language settings. For this purpose, we re-annotate an existing Italian corpus. We observe that models trained in the multilingual setting achieve the best performance on the task.

Submitted 24 May, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

Comments: LREC-COLING 2024, pages 273-285

arXiv:2301.13126 [pdf, other]

doi 10.18653/v1/2023.findings-emnlp.200

LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain

Authors: Joel Niklaus, Veton Matoshi, Pooja Rani, Andrea Galassi, Matthias Stürmer, Ilias Chalkidis

Abstract: Lately, propelled by the phenomenal advances around the transformer architecture, the legal NLP field has enjoyed spectacular growth. To measure progress, well curated and challenging benchmarks are crucial. However, most benchmarks are English only and in legal NLP specifically there is no multilingual benchmark available yet. Additionally, many benchmarks are saturated, with the best models clearly outperforming the best humans and achieving near perfect scores. We survey the legal NLP literature and select 11 datasets covering 24 languages, creating LEXTREME. To provide a fair comparison, we propose two aggregate scores, one based on the datasets and one on the languages. The best baseline (XLM-R large) achieves both a dataset aggregate score and a language aggregate score of 61.3. This indicates that LEXTREME is still very challenging and leaves ample room for improvement. To make it easy for researchers and practitioners to use, we release LEXTREME on huggingface together with all the code required to evaluate models and a public Weights and Biases project with all the runs.

Submitted 8 January, 2024; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: Published at EMNLP Findings 2023

MSC Class: 68T50 ACM Class: I.2

Journal ref: EMNLP Findings 2023

arXiv:2107.12079 [pdf, other]

doi 10.1007/978-3-030-89391-0_27

An Argumentative Dialogue System for COVID-19 Vaccine Information

Authors: Bettina Fazzinga, Andrea Galassi, Paolo Torroni

Abstract: Dialogue systems are widely used in AI to support timely and interactive communication with users. We propose a general-purpose dialogue system architecture that leverages computational argumentation to perform reasoning and provide consistent and explainable answers. We illustrate the system using a COVID-19 vaccine information case study.

Submitted 15 October, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

Comments: 9 pages, 2 figures, Accepted at CLAR 2021

ACM Class: I.2.1; I.2.4

Journal ref: Logic and Argumentation (2021). Lecture Notes in Computer Science, vol 13040

arXiv:2102.12227 [pdf]

doi 10.1109/TASLP.2023.3275040

Multi-Task Attentive Residual Networks for Argument Mining

Authors: Andrea Galassi, Marco Lippi, Paolo Torroni

Abstract: We explore the use of residual networks and neural attention for multiple argument mining tasks. We propose a residual architecture that exploits attention, multi-task learning, and makes use of ensemble, without any assumption on document or argument structure. We present an extensive experimental evaluation on five different corpora of user-generated comments, scientific publications, and persuasive essays. Our results show that our approach is a strong competitor against state-of-the-art architectures with a higher computational footprint or corpus-specific design, representing an interesting compromise between generality, performance accuracy and reduced model size.

Submitted 25 May, 2023; v1 submitted 24 February, 2021; originally announced February 2021.

Comments: 16 pages, 3 figures

Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol 31, pp 1877-1892, 2023

arXiv:2101.11934 [pdf, other]

An Upper Bound on the Complexity of Tablut

Authors: Andrea Galassi

Abstract: Tablut is a complete-knowledge, deterministic, and asymmetric board game, which has not been solved nor properly studied yet. In this work, its rules and characteristics are presented, then a study on its complexity is reported. An upper bound to its complexity is found eventually by dividing the state-space of the game into subspaces according to specific conditions. This upper bound is comparable to the one found for Draughts, therefore, it would seem that the open challenge of solving this game requires a considerable computational effort.

Submitted 28 January, 2021; originally announced January 2021.

Comments: 9 pages, 1 figure

arXiv:1905.09103 [pdf]

doi 10.3389/fdata.2019.00052

Neural-Symbolic Argumentation Mining: an Argument in Favor of Deep Learning and Reasoning

Authors: Andrea Galassi, Kristian Kersting, Marco Lippi, Xiaoting Shao, Paolo Torroni

Abstract: Deep learning is bringing remarkable contributions to the field of argumentation mining, but the existing approaches still need to fill the gap toward performing advanced reasoning tasks. In this position paper, we posit that neural-symbolic and statistical relational learning could play a crucial role in the integration of symbolic and sub-symbolic methods to achieve this goal.

Submitted 28 January, 2020; v1 submitted 22 May, 2019; originally announced May 2019.

Journal ref: Frontiers in Big Data 2 (2020) 52

arXiv:1902.02181 [pdf]

doi 10.1109/TNNLS.2020.3019893

Attention in Natural Language Processing

Authors: Andrea Galassi, Marco Lippi, Paolo Torroni

Abstract: Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of the textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present the examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this exciting domain.

Submitted 11 October, 2021; v1 submitted 4 February, 2019; originally announced February 2019.

Comments: 18 pages, 8 figures

MSC Class: 68T50; 68T05; 68T07 ACM Class: I.2; I.7

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, vol 32, n 10, pp 4291-4308, 2021

Showing 1–11 of 11 results for author: Galassi, A