Stuttering is a neurodevelopmental speech disorder that disrupts the flow of speech through involuntary pauses and sound repetitions. It has profound psychological effects that hinder social interaction and professional advancement. Automatically detecting stuttering events in speech recordings could help speech therapists and speech pathologists track the fluency of people who stutter (PWS). It could also help improve existing speech recognition systems for PWS. In this paper, the SEP-28k dataset is used for a comparative analysis of machine learning models that classify five dysfluency types: Prolongation, Interjection, Word Repetition, Sound Repetition, and Block.
• The study focuses on automatically detecting stuttering events in speech recordings to support speech therapists and improve speech recognition systems for people who stutter (PWS).
• The SEP-28k dataset is used to perform a comparative analysis of different machine learning models.
• The research examines the impact of key acoustic features on model accuracy while addressing challenges such as class imbalance.
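The kind of comparison summarized above can be illustrated with a minimal sketch in scikit-learn. Everything specific here is an assumption made for illustration, not the authors' pipeline: the data are synthetic five-class feature vectors with skewed class frequencies (not SEP-28k), the hyperparameters are defaults, `class_weight="balanced"` is one possible way to handle class imbalance, and macro-averaged F1 is chosen only because it weights rare classes equally. The function name `compare_models` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(random_state=0):
    """Fit five classifiers on synthetic imbalanced data; return macro F1 per model."""
    # Synthetic stand-in for per-clip acoustic features (assumption, not SEP-28k).
    # The five classes play the role of the five dysfluency types.
    X, y = make_classification(
        n_samples=1000,
        n_features=20,
        n_informative=10,
        n_classes=5,
        n_clusters_per_class=1,
        weights=[0.40, 0.25, 0.15, 0.12, 0.08],  # deliberately imbalanced
        random_state=random_state,
    )
    # Stratify so each class keeps its proportion in both splits.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )

    # Scale features for the distance- and margin-based models.
    # class_weight="balanced" reweights classes inversely to their frequency;
    # KNeighborsClassifier has no such option, so it is left unweighted.
    models = {
        "Support Vector Classifier": make_pipeline(
            StandardScaler(), SVC(class_weight="balanced")
        ),
        "Random Forest": RandomForestClassifier(
            class_weight="balanced", random_state=random_state
        ),
        "Decision Tree": DecisionTreeClassifier(
            class_weight="balanced", random_state=random_state
        ),
        "K-Nearest Neighbors": make_pipeline(
            StandardScaler(), KNeighborsClassifier()
        ),
        "Logistic Regression": make_pipeline(
            StandardScaler(),
            LogisticRegression(class_weight="balanced", max_iter=1000),
        ),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        # Macro F1 averages per-class F1, so rare classes count as much as common ones.
        scores[name] = f1_score(y_test, predictions, average="macro")
    return scores


if __name__ == "__main__":
    for name, score in compare_models().items():
        print(f"{name}: macro F1 = {score:.3f}")
```

With real data, the synthetic matrix would be replaced by features extracted from the labeled clips; the scores printed here say nothing about performance on SEP-28k.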
Keywords: Automatic dysfluency detection; Comparative analysis; Machine learning; Speech disorder; Stuttering; Support Vector Classifier; Random Forest; Decision Tree; K-Nearest Neighbors; Logistic Regression.
© 2024 The Authors. Published by Elsevier B.V.