Speaker identification in courtroom contexts - Part I: Individual listeners compared to forensic voice comparison based on automatic-speaker-recognition technology

Nabanita Basu; Agnes S Bali; Philip Weber; Claudia Rosas-Aguilar; Gary Edmond; Kristy A Martire; Geoffrey Stewart Morrison

doi:10.1016/j.forsciint.2022.111499

Forensic Sci Int. 2022 Dec:341:111499. doi: 10.1016/j.forsciint.2022.111499. Epub 2022 Oct 15.

Authors

Nabanita Basu¹, Agnes S Bali², Philip Weber¹, Claudia Rosas-Aguilar³, Gary Edmond⁴, Kristy A Martire², Geoffrey Stewart Morrison⁵

Affiliations

¹ Forensic Data Science Laboratory, Aston University, Birmingham, UK.
² School of Psychology, University of New South Wales, Sydney, New South Wales, Australia.
³ Forensic Data Science Laboratory, Aston University, Birmingham, UK; Instituto de Lingüística y Literatura, Universidad Austral de Chile, Valdivia, Chile.
⁴ School of Law, Society & Criminology, University of New South Wales, Sydney, New South Wales, Australia.
⁵ Forensic Data Science Laboratory, Aston University, Birmingham, UK; Forensic Evaluation Ltd, Birmingham, UK.

PMID: 36283276

Abstract

Under common law, expert testimony is admissible only if it has the potential to assist the trier of fact in making a decision that they would not be able to make unaided. The present paper addresses the question of whether speaker identification by an individual lay listener (such as a judge) would be more or less accurate than the output of a forensic-voice-comparison system based on state-of-the-art automatic-speaker-recognition technology. Listeners hear pairs of recordings that reflect the conditions of the questioned- and known-speaker recordings in an actual case, and make a probabilistic judgement on each pair. To reflect different courtroom contexts, listeners with different language backgrounds are tested: some are familiar with the language and accent spoken, some are familiar with the language but less familiar with the accent, and others are less familiar with the language. Two judgement conditions are also tested, again reflecting different courtroom contexts: in one, listeners make judgements based only on listening; in the other, listeners make judgements based on both listening to the recordings and considering the likelihood-ratio values output by the forensic-voice-comparison system.
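
As a reading aid for the likelihood-ratio values mentioned above, the following is a minimal sketch of two standard calculations. It is not the authors' code. The first function applies the odds form of Bayes' theorem, showing how a likelihood ratio combines with prior odds. The second computes the log-likelihood-ratio cost (Cllr), a metric commonly used to assess likelihood-ratio output; the abstract does not state which accuracy metric the study uses, so its appearance here is an assumption. All function names and numeric values are hypothetical.

```python
import math


def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Odds form of Bayes' theorem: posterior odds = prior odds x LR."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: float) -> float:
    """Convert odds in favour of same-speaker to a probability."""
    return odds / (1.0 + odds)


def cllr(same_speaker_lrs, different_speaker_lrs) -> float:
    """Log-likelihood-ratio cost (lower is better).

    Same-speaker pairs are penalised for small LRs and
    different-speaker pairs for large LRs. A system that always
    answers LR = 1 (uninformative) scores exactly 1.
    """
    ss_penalty = sum(math.log2(1.0 + 1.0 / lr) for lr in same_speaker_lrs)
    ds_penalty = sum(math.log2(1.0 + lr) for lr in different_speaker_lrs)
    return 0.5 * (ss_penalty / len(same_speaker_lrs)
                  + ds_penalty / len(different_speaker_lrs))


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    odds = posterior_odds(prior_odds=0.5, likelihood_ratio=4.0)
    print("posterior odds:", odds)
    print("posterior probability:", odds_to_probability(odds))
    print("Cllr:", cllr([20.0, 8.0, 0.5], [0.05, 0.2, 3.0]))
```

Scoring a listener's probabilistic judgements and a system's likelihood ratios with the same metric is one way to put the two on a common scale, which is the kind of comparison the abstract describes.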

Keywords: Admissibility; Forensic voice comparison; Likelihood ratio; Speaker identification; Validation; x-vector.

MeSH terms

Expert Testimony
Forensic Medicine
Recognition, Psychology
Technology
Voice*