Toward improved ecological validity in the acoustic measurement of overall voice quality: combining continuous speech and sustained vowels

J Voice. 2010 Sep;24(5):540-55. doi: 10.1016/j.jvoice.2008.12.014. Epub 2009 Nov 2.

Abstract

To improve ecological validity, perceptual and instrumental assessment of disordered voice, including overall voice quality, should ideally sample both sustained vowels and continuous speech. This investigation assessed the utility of combining both voice contexts for auditory-perceptual ratings as well as acoustic measurement of overall voice quality. Sustained vowel and continuous speech samples from 251 subjects with (n=229) or without (n=22) various voice disorders were concatenated and perceptually rated on overall voice quality by five experienced voice clinicians. After removing the nonvoiced segments within the continuous speech samples, the concatenated samples were analyzed using 13 acoustic measures based on fundamental frequency perturbation, amplitude perturbation, and spectral and cepstral analyses. Stepwise multiple regression analysis yielded a six-variable acoustic model for the multiparametric measurement of overall voice quality of the concatenated samples (with a cepstral measure as the main contributor to the prediction of overall voice quality). The correlation of this model with mean ratings of overall voice quality was r_s=0.78. A cross-validation approach involving iterated internal cross-correlations with 30 subgroups of 100, 50, and 10 samples confirmed a comparable degree of association. Furthermore, the ability of the model to distinguish voice-disordered from vocally normal participants was assessed using estimates of diagnostic precision, including receiver operating characteristic (ROC) curve analysis, sensitivity, and specificity, as well as likelihood ratios (LRs), which adjust for base-rate differences between the groups. The analyses revealed a high area under the ROC curve (0.895) as well as, depending on the cutoff criteria employed, respectable sensitivity, specificity, and LRs.
The results support the diagnostic utility of combining voice samples from both continuous speech and sustained vowels in acoustic and perceptual analysis of disordered voice. The findings are discussed in relation to the extant literature and the need for further refinement of the acoustic algorithm.
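
As an illustration of the diagnostic-precision estimates named in the abstract, the following is a minimal Python sketch of how sensitivity, specificity, likelihood ratios, and the area under the ROC curve can be computed from a set of model scores. It is not the authors' code. The function names, the convention that a higher score means a more disordered voice, the cutoff, and the toy scores are all assumptions made for illustration; none of the numbers come from the study.

```python
def diagnostic_metrics(scores, labels, cutoff):
    """Sensitivity, specificity, and likelihood ratios at one cutoff.

    Assumed convention (not taken from the paper): label 1 = voice-disordered,
    label 0 = vocally normal, and score >= cutoff counts as a positive test.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Likelihood ratios are built from sensitivity and specificity only,
    # so they do not depend on how many subjects are in each group.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


def roc_auc(scores, labels):
    """Area under the ROC curve as the probability that a randomly chosen
    disordered sample scores higher than a randomly chosen normal sample
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data for illustration only (not from the study).
toy_scores = [1.2, 2.0, 2.8, 3.4, 2.5, 3.9, 4.6, 5.8]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

print(diagnostic_metrics(toy_scores, toy_labels, cutoff=3.0))
print(roc_auc(toy_scores, toy_labels))
```

In this sketch, sensitivity, specificity, and the LRs change with the chosen cutoff, whereas the area under the ROC curve summarizes discrimination across all possible cutoffs.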

MeSH terms

  • Acoustic Stimulation
  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Belgium
  • Case-Control Studies
  • Child
  • Dysphonia / diagnosis*
  • Dysphonia / physiopathology
  • Dysphonia / psychology
  • Female
  • Humans
  • Logistic Models
  • Male
  • Middle Aged
  • Observer Variation
  • Phonetics*
  • Predictive Value of Tests
  • ROC Curve
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Severity of Illness Index
  • Signal Processing, Computer-Assisted
  • Sound Spectrography
  • Speech Acoustics*
  • Speech Perception*
  • Speech Production Measurement
  • Time Factors
  • Voice Quality*
  • Young Adult