Abstract
User acceptance of artificial intelligence agents might depend on their
ability to explain their reasoning, which requires adding an interpretability
layer that helps users understand their behavior. This paper focuses
on adding an interpretable layer on top of Semantic Textual Similarity (STS),
which measures the degree of semantic equivalence between two sentences. The
interpretability layer is formalized as the alignment between pairs of segments
across the two sentences, where the relation between the segments is labeled
with a relation type and a similarity score. We present a publicly available
dataset of sentence pairs annotated following the formalization. We then
develop a system trained on this dataset which, given a sentence pair, explains
what is similar and different, in the form of graded and typed segment
alignments. When evaluated on the dataset, the system performs better than an
informed baseline, showing that the dataset and task are well-defined and
feasible. Most importantly, two user studies show how the system output can be
used to automatically produce explanations in natural language. Users performed
better when they had access to the explanations, providing preliminary evidence
that our dataset and our method for automatically producing explanations are
useful in real applications.