Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation

Savoldi, Beatrice; Gaido, Marco; Bentivogli, Luisa; Negri, Matteo; Turchi, Marco

arXiv:2203.09866 (cs)

[Submitted on 18 Mar 2022]

Abstract: Gender bias is largely recognized as a problematic phenomenon affecting language technologies, with recent studies underscoring that it might surface differently across languages. However, most current evaluation practices adopt a word-level focus on a narrow set of occupational nouns under synthetic conditions. Such protocols overlook key features of grammatical gender languages, which are characterized by morphosyntactic chains of gender agreement, marked on a variety of lexical items and parts-of-speech (POS). To overcome this limitation, we enrich the natural, gender-sensitive MuST-SHE corpus (Bentivogli et al., 2020) with two new linguistic annotation layers (POS and agreement chains), and explore to what extent different lexical categories and agreement phenomena are impacted by gender skews. Focusing on speech translation, we conduct a multifaceted evaluation on three language directions (English-French/Italian/Spanish), with models trained on varying amounts of data and different word segmentation techniques. By shedding light on model behaviours, gender bias, and its detection at several levels of granularity, our findings emphasize the value of dedicated analyses beyond aggregated overall results.

Comments:	Accepted at ACL 2022
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2203.09866 [cs.CL]
	(or arXiv:2203.09866v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2203.09866

Submission history

From: Beatrice Savoldi
[v1] Fri, 18 Mar 2022 11:14:16 UTC (1,229 KB)
