Using i-vectors from voice features to identify major depressive disorder

J Affect Disord. 2021 Jun 1:288:161-166. doi: 10.1016/j.jad.2021.04.004. Epub 2021 Apr 20.

Abstract

Background: Machine-learning methods that use acoustic features to diagnose major depressive disorder (MDD) lack sufficient evidence from large-scale samples and clinical trials. This study aimed to evaluate the effectiveness of the promising i-vector method in a large sample of women with clinically diagnosed recurrent MDD, examine its robustness, and provide an explicit acoustic explanation of the i-vectors.

Methods: We collected utterances edited from clinical interview speech records of 785 depressed and 1,023 healthy individuals. Then, we extracted Mel-frequency cepstral coefficient (MFCC) features and MFCC i-vectors from their utterances. To examine the effectiveness of i-vectors, we compared the performance of binary logistic regression between MFCC i-vectors and MFCC features and tested its robustness on different utterance durations. We also determined the correlation between MFCC features and MFCC i-vectors to analyze the acoustic meaning of i-vectors.
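
As an illustration of the comparison described in the Methods, the following minimal sketch trains a binary logistic regression on two feature sets and compares their AUC on held-out data. It is not the authors' implementation: the data are synthetic Gaussian stand-ins (not MFCC features or i-vectors), and the function names, sample sizes, learning rate, and class shifts are all assumptions chosen for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Binary logistic regression fitted with batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, X):
    """Predicted probability of the positive class for each row of X."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def make_data(n, shift, dim, rng):
    # Synthetic stand-in: class 1 is shifted by `shift` in every dimension.
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(shift * label, 1.0) for _ in range(dim)])
        y.append(label)
    return X, y


def evaluate(shift, rng):
    # Train on one synthetic sample, score AUC on a separate one.
    X_train, y_train = make_data(400, shift, 3, rng)
    X_test, y_test = make_data(200, shift, 3, rng)
    w, b = fit_logistic(X_train, y_train)
    return auc(y_test, predict(w, b, X_test))


rng = random.Random(0)
auc_weak = evaluate(0.2, rng)    # stand-in for a less separable feature set
auc_strong = evaluate(1.0, rng)  # stand-in for a more separable feature set
print(f"AUC (less separable features): {auc_weak:.3f}")
print(f"AUC (more separable features): {auc_strong:.3f}")
```

With real data, the two calls to `evaluate` would be replaced by the same train/test split scored once with utterance-level MFCC features and once with MFCC i-vectors, so that only the feature representation differs between the two AUC values.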

Results: The i-vectors improved the area under the curve (AUC) by 7% and 14% over MFCC features on different utterances. Classification results stabilized when the utterance duration exceeded 40 s. The i-vectors are consistently correlated, either positively or negatively, with the maximum, minimum, and deviations of MFCC features.
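
The correlation analysis can be sketched as follows: summarize each utterance's frame-level coefficients by their maximum, minimum, and standard deviation, then compute the Pearson correlation between one summary statistic and one i-vector component across utterances. This is a toy sketch under stated assumptions, not the study's procedure: the frame values and the `ivector_dim` values are invented for illustration, the function names are hypothetical, and the choice of Pearson correlation and population standard deviation is an assumption, since the abstract does not specify either.

```python
import math
import statistics


def summarize(frames):
    """Per-coefficient max, min, and standard deviation over one utterance."""
    columns = list(zip(*frames))  # one tuple per coefficient
    return {
        "max": [max(c) for c in columns],
        "min": [min(c) for c in columns],
        "std": [statistics.pstdev(c) for c in columns],
    }


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy data: three utterances, each a list of frames with two coefficients.
utterances = [
    [[1.0, 4.0], [3.0, 0.0]],
    [[2.0, 5.0], [6.0, 1.0]],
    [[0.0, 2.0], [9.0, 2.0]],
]
# Hypothetical values of a single i-vector component, one per utterance.
ivector_dim = [0.1, 0.4, 0.9]

stats = [summarize(u) for u in utterances]
max_c0 = [s["max"][0] for s in stats]  # maximum of coefficient 0 per utterance
r = pearson(ivector_dim, max_c0)
print(f"Correlation between i-vector component and max of coefficient 0: {r:.3f}")
```

Repeating the last three lines for every statistic, coefficient, and i-vector component would give a full correlation table, whose signs indicate whether each relationship is positive or negative.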

Limitations: This study included only women.

Conclusions: The i-vectors can improve the AUC by 14% on a large-scale clinical sample. The system is robust for utterance durations > 40 s. This study provides a foundation for exploring the clinical application of voice features in the diagnosis of MDD.

Keywords: Assessment/Diagnosis; Biological markers; Clinical trials; Computer/internet technology; Depression.

MeSH terms

  • Depressive Disorder, Major* / diagnosis
  • Female
  • Humans
  • Speech Acoustics
  • Voice Disorders*