INCLUSIFY: A benchmark and a model for gender-inclusive German

Abstract: Gender-inclusive language is important for achieving gender equality in languages with gender inflections, such as German. While stirring some controversy, it is increasingly adopted by companies and political institutions. A handful of tools have been developed to help people use gender-inclusive language by identifying instances of the generic masculine and providing suggestions for more inclusive reformulations. In this report, we define the underlying tasks in terms of natural language processing, and present a dataset and measures for benchmarking them. We also present a model that implements these tasks, by combining an inclusive language database with an elaborate sequence of processing steps via standard pre-trained models. Our model achieves a recall of 0.89 and a precision of 0.82 in our benchmark for identifying exclusive language; and one of its top five suggestions is chosen in real-world texts in 44% of cases. We sketch how the area could be further advanced by training end-to-end models and using large language models; and we urge the community to include more gender-inclusive texts in their training data in order to not present an obstacle to the adoption of gender-inclusive language. Through these efforts, we hope to contribute to restoring justice in language and, to a small extent, in reality.

Submitted 5 December, 2022; originally announced December 2022.

arXiv:2202.00383 [pdf, other]

Explainable AI through the Learning of Arguments

Authors: Jonas Bei, David Pomerenke, Lukas Schreiner, Sepideh Sharbaf, Pieter Collins, Nico Roos

Abstract: Learning arguments is highly relevant to the field of explainable artificial intelligence. It is a family of symbolic machine learning techniques that is particularly human-interpretable. These techniques learn a set of arguments as an intermediate representation. Arguments are small rules with exceptions that can be chained into larger arguments for making predictions or decisions. We investigate the learning of arguments, specifically the learning of arguments from a 'case model' proposed by Verheij [34]. The case model in Verheij's approach consists of cases or scenarios in a legal setting. The number of cases in a case model is relatively low. Here, we investigate whether Verheij's approach can be used for learning arguments from other types of data sets with a much larger number of instances. We compare the learning of arguments from a case model with the HeRO algorithm [15] and learning a decision tree.

Submitted 1 February, 2022; originally announced February 2022.

Comments: Presented at the 33rd BeNeLux AI Conference (BNAIC/BENELEARN 2021)

arXiv:1908.00500 [pdf, other]

doi 10.1109/VISUAL.2019.8933706

Slope-Dependent Rendering of Parallel Coordinates to Reduce Density Distortion and Ghost Clusters

Authors: David Pomerenke, Frederik L. Dennig, Daniel A. Keim, Johannes Fuchs, Michael Blumenschein

Abstract: Parallel coordinates are a popular technique to visualize multi-dimensional data. However, they face a significant problem influencing the perception and interpretation of patterns. The distance between two parallel lines differs based on their slope. Vertical lines are rendered longer and closer to each other than horizontal lines. This problem is inherent in the technique and has two main consequences: (1) clusters which have a steep slope between two axes are visually more prominent than horizontal clusters. (2) Noise and clutter can be perceived as clusters, as a few parallel vertical lines visually emerge as a ghost cluster. Our paper makes two contributions: First, we formalize the problem and show its impact. Second, we present a novel technique to reduce the effects by rendering the polylines of the parallel coordinates based on their slope: horizontal lines are rendered with the default width, lines with a steep slope with a thinner line. Our technique avoids density distortions of clusters, can be computed in linear time, and can be added on top of most parallel coordinate variations. To demonstrate the usefulness, we show examples and compare them to the classical rendering.

Submitted 23 April, 2020; v1 submitted 1 August, 2019; originally announced August 2019.

Comments: 5 pages, 5 figures, LaTeX; added DOI

Journal ref: 2019 IEEE Visualization Conference (VIS)
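
The slope effect described in the abstract above can be illustrated with a small geometric sketch. A segment between two axes that spans dx horizontally and dy vertically has length sqrt(dx^2 + dy^2), and two parallel segments offset vertically by d lie only d * cos(theta) apart, where theta is the angle to the horizontal. Scaling the stroke width by cos(theta) compensates for both effects. The cosine factor and the function name `slope_scaled_width` are assumptions made for illustration; the listing does not state the exact formula the paper uses.

```python
import math


def slope_scaled_width(dx: float, dy: float, base_width: float = 1.0) -> float:
    """Return a stroke width that shrinks as the segment gets steeper.

    dx, dy: horizontal and vertical extent of one polyline segment
            between two adjacent axes (same units, e.g. pixels).
    base_width: width used for a perfectly horizontal segment.

    Assumption (not taken from the paper): width = base_width * cos(theta),
    where theta is the angle between the segment and the horizontal.
    """
    length = math.hypot(dx, dy)
    if length == 0:
        # Degenerate segment: no slope to correct for.
        return base_width
    # cos(theta) = |dx| / length
    return base_width * abs(dx) / length


if __name__ == "__main__":
    for dy in (0.0, 1.0, 4.0):
        print(dy, slope_scaled_width(1.0, dy))
```

With this scaling, the drawn area of a segment (width times length) stays at base_width * |dx| regardless of slope, which is one simple way to keep steep clusters from appearing denser than flat ones.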

Showing 1–3 of 3 results for author: Pomerenke, D