Objective: The Hoarseness Diagram, a program for voice quality analysis used in German-speaking countries, was compared with an automatic speech recognition system that includes a module for prosodic analysis. The latter computed prosodic features from a recording of a read text. We examined whether voice analysis of sustained vowels and text analysis correlate in tracheoesophageal speakers.
Patients and methods: Test speakers were 24 male laryngectomees with tracheoesophageal substitute speech, aged 60.6 ± 8.9 years. Each speaker read the German version of the text 'The North Wind and the Sun'. Additionally, five sustained vowels were recorded from each speaker. The fundamental frequency (F(0)) detected by the two programs was compared for all vowels. The correlation between the measures obtained by the Hoarseness Diagram and the features from the prosody module was computed.
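The correlation step described above can be sketched as follows. This is a minimal illustration, assuming a Pearson correlation coefficient (the abstract does not state which coefficient was used). The function name `pearson_r` and the toy values are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy values for illustration only (NOT measurements from the study):
# one vowel-based measure and one text-based feature per speaker.
toy_vowel_measure = [1.0, 2.0, 3.0, 4.0]
toy_text_feature = [8.0, 6.0, 4.0, 2.0]

r = pearson_r(toy_vowel_measure, toy_text_feature)
print(f"r = {r:.2f}")
```

In a setting like the one described, each speaker would contribute one value per vowel-based measure and one value per text-based prosodic feature, and the coefficient would be computed for each pair of measure and feature.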
Results: Both programs had difficulty determining the F(0) of highly pathologic voices. Parameters such as jitter, shimmer, F(0), and irregularity, as computed by the Hoarseness Diagram from vowels, showed correlations of about -0.8 with prosodic features obtained from the text recordings.
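For readers unfamiliar with the perturbation measures named above, the sketch below shows one widely used pair of definitions: "local" jitter and shimmer, the mean absolute difference between consecutive cycles divided by the mean. The abstract does not specify how the Hoarseness Diagram defines these parameters, so the formulas and the function names `local_perturbation` and `mean_f0` are assumptions for illustration only.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean.

    Applied to glottal cycle durations this gives local jitter; applied to
    cycle peak amplitudes it gives local shimmer (both as fractions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_value = sum(values) / len(values)
    if mean_value == 0:
        raise ValueError("mean of values must be non-zero")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / mean_value


def mean_f0(periods_s):
    """Mean fundamental frequency in Hz from cycle durations in seconds."""
    return len(periods_s) / sum(periods_s)


# Toy cycle durations (seconds) and peak amplitudes; NOT data from the study.
toy_periods = [0.010, 0.012, 0.010]
toy_amplitudes = [1.0, 1.0, 1.0]

jitter = local_perturbation(toy_periods)      # irregular cycle lengths
shimmer = local_perturbation(toy_amplitudes)  # perfectly steady amplitude
f0 = mean_f0(toy_periods)
```

Both measures presuppose that individual cycles can be detected reliably, which is consistent with the reported difficulty of determining F(0) in highly pathologic voices.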
Conclusion: Voice properties can be evaluated reliably on the basis of both vowel and text recordings. Text analysis, however, also offers possibilities for the automatic evaluation of running speech, since a read text represents everyday speech more realistically.
Copyright 2009 S. Karger AG, Basel.