Quantifying the intensity of animals' reaction to stimuli is notoriously difficult as classic unidimensional measures of responses such as latency or duration of looking can fail to capture the overall strength of behavioural responses. More holistic rating can be useful but have the inherent risks of subjective bias and lack of repeatability. Here, we explored whether crowdsourcing could be used to efficiently and reliably overcome these potential flaws. A total of 396 participants watched online videos of dogs reacting to auditory stimuli and provided 23,248 ratings of the strength of the dogs' responses from zero (default) to 100 using an online survey form. We found that raters achieved very high inter-rater reliability across multiple datasets (although their responses were affected by their sex, age, and attitude towards animals) and that as few as 10 raters could be used to achieve a reliable result. A linear mixed model applied to PCA components of behaviours discovered that the dogs' facial expressions and head orientation influenced the strength of behaviour ratings the most. Further linear mixed models showed that that strength of behaviour ratings was moderately correlated to the duration of dogs' reactions but not to dogs' reaction latency (from the stimulus onset). This suggests that observers' ratings captured consistent dimensions of animals' responses that are not fully represented by more classic unidimensional metrics. Finally, we report that overall participants strongly enjoyed the experience. Thus, we suggest that using crowdsourcing can offer a useful, repeatable tool to assess behavioural intensity in experimental or observational studies where unidimensional coding may miss nuance, or where coding multiple dimensions may be too time-consuming.
Keywords: Analysing behaviour; Animal behaviour metrics; Coding behaviour; Crowd-sourcing data analysis; Measuring behaviour; Rating behaviour.
© 2021. The Author(s).