Validation of the Language ENvironment Analysis (LENA) Automated Speech Processing Algorithm Labels for Adult and Child Segments in a Sample of Families From India

J Speech Lang Hear Res. 2025 Jan 2;68(1):40-53. doi: 10.1044/2024_JSLHR-24-00099. Epub 2024 Dec 5.

Abstract

Purpose: The Language ENvironment Analysis (LENA) technology uses automated speech processing (ASP) algorithms to estimate counts such as total adult words and child vocalizations, which help researchers understand children's early language environment. This ASP has been validated in North American English and other languages in predominantly monolingual contexts but not in a multilingual context such as India. Thus, the current study aims to validate the classification accuracy of the LENA algorithm, focusing specifically on speaker recognition of adult segments (AdS) and child segments (ChS) in a sample of bi/multilingual families from India.

Method: Thirty neurotypical children between 6 and 24 months (M = 12.89, SD = 4.95) were recruited. Participants were growing up in a bi/multilingual environment, hearing a combination of Kannada, Tamil, Malayalam, Telugu, Hindi, and/or English. Daylong audio recordings were collected using LENA and processed using the ASP to automatically detect segments across speaker categories. Two human annotators manually annotated ~900 min (37,431 segments across speaker categories). Performance accuracy (recall and precision) was calculated for AdS and ChS.

Results: The recall and precision for AdS were 0.62 (95% confidence interval [CI] [0.61, 0.63]) and 0.83 (95% CI [0.8, 0.83]), respectively. This indicated that 62% of the segments identified as AdS by the human annotator were also identified as AdS by the LENA ASP algorithm and 83% of the segments labeled by the LENA ASP as AdS were also labeled by the human annotator as AdS. Similarly, the recall and precision for ChS were 0.65 (95% CI [0.64, 0.66]) and 0.55 (95% CI [0.54, 0.56]), respectively.
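
The definitions of recall and precision used above can be illustrated with a minimal sketch. It assumes each segment carries one human label and one LENA label; the function names, the toy labels, and the "Other" category are hypothetical and are not study data. The abstract does not state how the confidence intervals were computed, so the normal-approximation (Wald) interval shown here is only one possible choice.

```python
from math import sqrt


def recall_precision(human, machine, target):
    """Segment-level recall and precision for one speaker category.

    recall    = share of human-labeled `target` segments that the
                algorithm also labeled `target`
    precision = share of algorithm-labeled `target` segments that the
                human also labeled `target`
    """
    pairs = list(zip(human, machine))
    both = sum(1 for h, m in pairs if h == target and m == target)
    human_total = sum(1 for h, _ in pairs if h == target)
    machine_total = sum(1 for _, m in pairs if m == target)
    recall = both / human_total if human_total else float("nan")
    precision = both / machine_total if machine_total else float("nan")
    return recall, precision


def wald_ci(p, n, z=1.96):
    """Normal-approximation 95% interval for a proportion (an assumption,
    not necessarily the method used in the study)."""
    half_width = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Toy labels invented for illustration only.
human_labels = ["AdS", "AdS", "AdS", "ChS", "ChS", "Other"]
lena_labels = ["AdS", "AdS", "ChS", "ChS", "Other", "AdS"]

ads_recall, ads_precision = recall_precision(human_labels, lena_labels, "AdS")
chs_recall, chs_precision = recall_precision(human_labels, lena_labels, "ChS")
print(f"AdS recall={ads_recall:.2f}, precision={ads_precision:.2f}")
print(f"ChS recall={chs_recall:.2f}, precision={chs_precision:.2f}")
```

In this framing, recall is computed relative to the human annotator's segments and precision relative to the algorithm's segments, which is why the two values for the same category can differ.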

Conclusions: This study documents the performance of the ASP in correctly classifying speakers as adult or child in a sample of families from India, indicating recall and precision that are relatively low. This study lays the groundwork for future investigations aiming to refine the algorithm models, potentially facilitating more accurate performance in bi/multilingual societies like India.

Supplemental material: https://doi.org/10.23641/asha.27910710.

Publication types

  • Validation Study

MeSH terms

  • Adult
  • Algorithms*
  • Child Language
  • Child, Preschool
  • Family
  • Female
  • Humans
  • India
  • Infant
  • Male
  • Multilingualism
  • Reproducibility of Results
  • Speech
  • Speech Recognition Software