Effects of Statistical Practices for Longitudinal Group Comparison of the Penetration-Aspiration Scale on Power and Effect Size Estimation: A Monte Carlo Simulation Study

James C Borders; Alessandro A Grande; Carly E A Barbon; Katherine A Hutcheson; Michelle S Troche

doi:10.1007/s00455-024-10738-7

Dysphagia. 2024 Aug 17. doi: 10.1007/s00455-024-10738-7. Online ahead of print.

Authors

James C Borders¹, Alessandro A Grande², Carly E A Barbon³, Katherine A Hutcheson³,⁴, Michelle S Troche⁵

Affiliations

¹ Laboratory for the Study of Upper Airway Dysfunction, Department of Biobehavioral Sciences, Teachers College, Columbia University, New York, NY, USA.
² Department of Statistics, Columbia University, New York, NY, USA.
³ Department of Head and Neck Surgery, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA.
⁴ Department of Radiation Oncology, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA.
⁵ Laboratory for the Study of Upper Airway Dysfunction, Department of Biobehavioral Sciences, Teachers College, Columbia University, New York, NY, USA.

PMID: 39153045
DOI: 10.1007/s00455-024-10738-7

Abstract

Multiple bolus trials are administered during clinical and research swallowing assessments to comprehensively capture an individual's swallowing function. Despite the valuable information obtained from these boluses, it remains common practice to use a single bolus (e.g., the worst score) to describe the degree of dysfunction. Researchers also often collapse continuous or ordinal swallowing measures into categories, potentially exacerbating information loss. These practices may reduce statistical power to detect smaller, yet potentially meaningful, treatment effects and may bias estimates of those effects. This study sought to examine the impact of aggregating and categorizing penetration-aspiration scale (PAS) scores on statistical power and effect size estimates. We used a Monte Carlo approach to simulate three hypothetical within-subject treatment studies in Parkinson's disease (PD) and head and neck cancer (HNC) across a range of data characteristics (e.g., sample size, number of bolus trials, variability). Different statistical models (aggregated or multilevel) as well as various PAS reduction approaches (i.e., types of categorizations) were applied to examine their impact on power and the accuracy of effect size estimates. Across all scenarios, multilevel models demonstrated higher statistical power to detect group-level longitudinal change and more accurate estimates compared to aggregated (worst score) models. Categorizing PAS scores also reduced power and biased effect size estimates compared to an ordinal approach, though this depended on the type of categorization and baseline PAS distribution. Multilevel models should be considered as a more robust approach for the statistical analysis of multiple boluses administered in standardized swallowing protocols because of their high sensitivity and accuracy in comparing group-level changes in swallowing function. Importantly, this finding appears to be consistent across patient populations with distinct pathophysiology (i.e., PD and HNC) and patterns of airway invasion. The decision to categorize a continuous or ordinal outcome should be grounded in the clinical or research question, with recognition that scale reduction may negatively affect the quality of statistical inferences in certain scenarios.
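
To make the Monte Carlo idea concrete, the following is a minimal illustrative sketch, not the authors' simulation. It estimates power for a pre/post comparison in two ways: using only each subject's worst PAS score, and using all boluses. As a loud assumption, the "all boluses" analysis here is a per-subject mean tested with an exact sign test, which is only a crude stand-in for the multilevel ordinal models evaluated in the study. The cut points, effect size, variances, and function names (sign_test_p, to_pas, simulate_power) are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def sign_test_p(diffs):
    """Two-sided exact sign test p-value for paired differences (ties dropped)."""
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # Probability of a split at least this extreme under a fair coin, one tail.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def to_pas(latent, cuts=(-0.5, 0.0, 0.4, 0.8, 1.2, 1.6, 2.0)):
    """Map a latent severity to an ordinal 1-8 score (hypothetical cut points)."""
    return 1 + sum(latent > c for c in cuts)


def simulate_power(n_subjects=30, n_boluses=6, effect=0.4,
                   n_sims=500, alpha=0.05, seed=1):
    """Estimate power of worst-score vs. all-bolus (mean) pre/post comparisons."""
    rng = random.Random(seed)
    hits = {"worst": 0, "mean": 0}
    for _ in range(n_sims):
        d_worst, d_mean = [], []
        for _ in range(n_subjects):
            severity = rng.gauss(0.0, 1.0)  # subject-level random effect
            pre = [to_pas(severity + rng.gauss(0.0, 1.0))
                   for _ in range(n_boluses)]
            # Assumed treatment effect: a shift toward lower latent severity.
            post = [to_pas(severity - effect + rng.gauss(0.0, 1.0))
                    for _ in range(n_boluses)]
            d_worst.append(max(pre) - max(post))
            d_mean.append(statistics.mean(pre) - statistics.mean(post))
        hits["worst"] += sign_test_p(d_worst) < alpha
        hits["mean"] += sign_test_p(d_mean) < alpha
    # Power = proportion of simulated studies that reject the null.
    return {name: count / n_sims for name, count in hits.items()}


if __name__ == "__main__":
    print(simulate_power())
```

Calling simulate_power() with different values of n_subjects, n_boluses, or effect gives a rough sense of how the two summaries respond to the data characteristics the abstract mentions. The numbers it produces depend entirely on the assumed cut points and variances and should not be read as reproducing the published results.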

Keywords: Deglutition disorders; Dysphagia; Meta-science; Multilevel models; Statistics.

Grants and funding

Council of Academic Programs in Communication Sciences and Disorders