Explainable human-centered traits from head motion and facial expression dynamics

PLoS One. 2025 Jan 17;20(1):e0313883. doi: 10.1371/journal.pone.0313883. eCollection 2025.

Abstract

We explore the efficacy of multimodal behavioral cues for explainable prediction of personality and interview-specific traits. We utilize elementary head-motion units named kinemes, atomic facial movements termed action units, and speech features to estimate these human-centered traits. Empirical results confirm that kinemes and action units enable the discovery of multiple trait-specific behaviors while also providing explainability in support of the predictions. For fusing cues, we explore decision-level and feature-level fusion, as well as an additive attention-based fusion strategy that quantifies the relative importance of the three modalities for trait prediction. Examining various long short-term memory (LSTM) architectures for classification and regression on the MIT Interview and First Impressions Candidate Screening (FICS) datasets, we note that: (1) multimodal approaches outperform unimodal counterparts, achieving the highest PCC of 0.98 for Excited-Friendly traits in MIT and 0.57 for Extraversion in FICS; (2) efficient trait predictions and plausible explanations are achieved with both unimodal and multimodal approaches; and (3) following the thin-slice approach, effective trait prediction is achieved even from two-second behavioral snippets. Our implementation code is available at: https://github.com/deepsurbhi8/Explainable_Human_Traits_Prediction.
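To illustrate the additive attention-based fusion described above, the following is a minimal PyTorch-style sketch: three LSTM encoders (one per modality) produce embeddings, an additive attention module assigns each modality a softmax weight that quantifies its relative importance, and the weighted sum feeds a trait-prediction head. All layer names, feature dimensions, and hyperparameters here are illustrative assumptions; for the authors' actual implementation, consult the linked repository.

```python
# Hypothetical sketch of additive attention-based fusion over three behavioral
# modalities (kinemes, facial action units, speech). Dimensions and module names
# are illustrative assumptions, not the paper's released code.
import torch
import torch.nn as nn


class AdditiveAttentionFusion(nn.Module):
    def __init__(self, feat_dims=(16, 17, 23), hidden_dim=32, num_traits=1):
        super().__init__()
        # One LSTM encoder per modality (kineme, action-unit, speech sequences).
        self.encoders = nn.ModuleList(
            [nn.LSTM(d, hidden_dim, batch_first=True) for d in feat_dims]
        )
        # Additive (Bahdanau-style) scorer: one scalar score per modality embedding.
        self.score = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        # Regression/classification head for the predicted trait score(s).
        self.head = nn.Linear(hidden_dim, num_traits)

    def forward(self, kineme_seq, au_seq, speech_seq):
        # Encode each modality sequence and keep its final hidden state.
        states = []
        for enc, seq in zip(self.encoders, (kineme_seq, au_seq, speech_seq)):
            _, (h, _) = enc(seq)          # h: (1, batch, hidden_dim)
            states.append(h.squeeze(0))   # (batch, hidden_dim)
        H = torch.stack(states, dim=1)    # (batch, 3, hidden_dim)

        # Softmax attention weights quantify each modality's relative importance.
        weights = torch.softmax(self.score(H), dim=1)   # (batch, 3, 1)
        fused = (weights * H).sum(dim=1)                # (batch, hidden_dim)
        return self.head(fused), weights.squeeze(-1)


# Example: a batch of 4 short behavioral snippets, 20 time steps per modality.
model = AdditiveAttentionFusion()
kinemes = torch.randn(4, 20, 16)   # (batch, time, kineme features)
aus = torch.randn(4, 20, 17)       # (batch, time, action-unit features)
speech = torch.randn(4, 20, 23)    # (batch, time, speech features)
trait_pred, modality_weights = model(kinemes, aus, speech)
print(trait_pred.shape, modality_weights.shape)  # torch.Size([4, 1]) torch.Size([4, 3])
```

Because the attention weights are produced per sample, they can be inspected directly to see which modality drove a given prediction, which is the explainability property highlighted in the abstract.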

MeSH terms

  • Facial Expression*
  • Female
  • Head Movements* / physiology
  • Humans
  • Male
  • Personality*