Crowdsourcing as a tool in the clinical assessment of intelligibility in dysarthria: How to deal with excessive variation

J Commun Disord. 2021 Sep-Oct:93:106135. doi: 10.1016/j.jcomdis.2021.106135. Epub 2021 Jun 17.

Abstract

Purpose: Independent laypersons are essential in the assessment of intelligibility in persons with dysarthria (PWD), as they reflect intelligibility limitations in the most ecologically valid way, without being influenced by familiarity with the speaker. The present work investigated online crowdsourcing as a convenient method to involve lay people as listeners, with the objective of exploring how to constrain the expected variability of crowd-based judgements to make them applicable in clinical diagnostics.

Method: Intelligibility was assessed using a word transcription task administered via crowdsourcing. In study 1, speech samples of 23 PWD were each transcribed by 18 crowdworkers. Four methods of aggregating the intelligibility scores of randomly sampled panels of 4 to 14 listeners were compared for accuracy, i.e. the stability of the resulting intelligibility estimates across different panels, and for validity, i.e. the degree to which the estimates matched data obtained under controlled laboratory conditions ("gold standard"). In addition, we determined an economically acceptable number of crowdworkers per speaker needed to obtain accurate and valid intelligibility estimates. Study 2 examined the robustness of the chosen aggregation method against downward outliers due to spamming in a larger sample of 100 PWD.
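
The panel-resampling idea described in the Method can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the number of resampled panels, the use of the standard deviation as a stability measure, the mean absolute deviation from the gold standard as a validity measure, and all score values are assumptions made for illustration only.

```python
import random
from statistics import mean, pstdev

def panel_estimates(scores, panel_size, aggregate=mean, n_panels=1000, seed=0):
    """Aggregate the scores of randomly drawn listener panels (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [aggregate(rng.sample(scores, panel_size)) for _ in range(n_panels)]

def accuracy_and_validity(scores, gold, panel_size, aggregate=mean):
    """Return (spread across panels, mean absolute deviation from the gold standard).

    Assumption: a lower spread stands for higher accuracy (stability), and a
    lower deviation from the laboratory score stands for higher validity.
    """
    estimates = panel_estimates(scores, panel_size, aggregate)
    spread = pstdev(estimates)
    deviation = mean(abs(e - gold) for e in estimates)
    return spread, deviation

# Invented data: percent words correct from 18 crowdworkers for one speaker,
# plus an invented laboratory ("gold standard") score.
worker_scores = [82, 78, 85, 80, 5, 79, 83, 0, 81, 77, 84, 76, 80, 79, 30, 82, 81, 78]
gold_standard = 83

spread_small, dev_small = accuracy_and_validity(worker_scores, gold_standard, 4)
spread_large, dev_large = accuracy_and_validity(worker_scores, gold_standard, 14)
```

In this sketch, larger panels yield a smaller spread across resampled panels, which is the kind of trade-off against cost that a choice of panel size has to balance.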

Results: In study 1, an interworker aggregation method based on negative exponential weightings of the scores as a function of their distance from the "best" listener's score (exponentially weighted mean) outperformed three other methods (median value, arithmetic mean, maximum). Under cost-benefit considerations, an optimum panel size of 9 crowd listeners per examination was determined. Study 2 demonstrated the robustness of this aggregation method against spamming crowd listeners.
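
The four aggregation methods named in the Results can be sketched as follows. The abstract does not give the weighting formula, so the form used here, a weight of exp(-decay * distance) with the highest score taken as the "best" listener's score, and the `decay` value are assumptions for illustration, as are the panel scores.

```python
import math
from statistics import mean, median

def exp_weighted_mean(scores, decay=0.1):
    """Weighted mean with weights exp(-decay * distance from the best score).

    Assumptions: the "best" listener is the one with the highest score, and
    `decay` is a hypothetical tuning parameter not taken from the source.
    """
    best = max(scores)
    weights = [math.exp(-decay * (best - s)) for s in scores]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

def aggregate_all(scores, decay=0.1):
    """Apply the four inter-worker aggregation methods to one panel."""
    return {
        "median": median(scores),
        "mean": mean(scores),
        "maximum": max(scores),
        "exp_weighted_mean": exp_weighted_mean(scores, decay),
    }

# Invented panel of 9 crowd listeners (percent words correct);
# the scores 5 and 0 stand in for spamming workers.
panel = [82, 78, 85, 80, 5, 79, 83, 0, 81]
result = aggregate_all(panel)
```

With these invented scores, the two very low values pull the arithmetic mean down sharply, while they receive almost no weight in the exponentially weighted mean, which stays close to the scores of the better listeners without relying on a single listener as the maximum does.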

Conclusion: Though intelligibility data collected through online crowdsourcing are noisy, accurate and valid intelligibility estimates can be obtained by appropriate aggregation of the raw data. This makes crowdsourcing a suitable method for incorporating real-world perspectives into clinical dysarthria assessment.

Keywords: Crowdsourcing; Dysarthria; Intelligibility; Quality control; Validation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Crowdsourcing*
  • Dysarthria / diagnosis
  • Humans
  • Speech Intelligibility
  • Speech Perception*
  • Speech Production Measurement