Collaborative Filtering for the Imputation of Patient Reported Outcomes

Database Expert Syst Appl (2024). 2024 Aug:14910:231-248. doi: 10.1007/978-3-031-68309-1_20. Epub 2024 Aug 18.

Abstract

This study addresses the prevalent problem of missing data in patient-reported outcome datasets, focusing on head and neck cancer patient symptom ratings from the MD Anderson Symptom Inventory. Because many data mining and machine learning algorithms require complete datasets, accurate imputation of missing data is a crucial first step. We propose, for the first time, the use of collaborative filtering to impute missing head and neck cancer patient symptom ratings. Two configurations of collaborative filtering, patient-based and symptom-based, use known ratings to infer the missing ones. We also compare collaborative filtering with alternative imputation methods: Multiple Imputation by Chained Equations, Nearest Neighbor Imputation, and Linear Interpolation. Performance is measured with Root Mean Squared Error and Mean Absolute Error. The findings indicate that collaborative filtering is a viable approach that performs better than the compared methods for imputing missing patient symptom data.

Keywords: Collaborative Filtering; Head and Neck Cancer; Imputation.
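
As a rough illustration of the two configurations named in the abstract, the sketch below imputes missing entries in a patient-by-symptom rating matrix with neighborhood-based collaborative filtering and scores the result with Root Mean Squared Error and Mean Absolute Error. It is not the authors' implementation. The function names (cf_impute, rmse, mae), the cosine similarity, the neighbor count k, the mean fallback, and the toy ratings are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
import numpy as np


def cf_impute(ratings, axis="patient", k=2):
    """Fill NaN entries with a similarity-weighted average of neighbor ratings.

    ratings: 2D array, rows = patients, columns = symptoms, NaN = missing.
    axis="patient": neighbors are other patients (rows).
    axis="symptom": neighbors are other symptoms (columns).
    Cosine similarity and the top-k rule are assumptions, not the paper's spec.
    """
    R = np.asarray(ratings, dtype=float)
    if axis == "symptom":
        # Symptom-based CF is patient-based CF on the transposed matrix.
        return cf_impute(R.T, axis="patient", k=k).T

    observed = ~np.isnan(R)
    n = R.shape[0]

    # Cosine similarity between rows, using only co-observed columns.
    sim = np.zeros((n, n))
    for a in range(n):
        for b in range(n):
            both = observed[a] & observed[b]
            if a == b or not both.any():
                continue
            u, v = R[a, both], R[b, both]
            denom = np.linalg.norm(u) * np.linalg.norm(v)
            sim[a, b] = (u @ v) / denom if denom > 0 else 0.0

    filled = R.copy()
    for i, j in zip(*np.where(~observed)):
        # Candidate neighbors: rows that rated column j and are similar to row i.
        cand = np.where(observed[:, j])[0]
        cand = cand[sim[i, cand] > 0]
        top = cand[np.argsort(sim[i, cand])[::-1][:k]]
        if top.size:
            w = sim[i, top]
            filled[i, j] = (w @ R[top, j]) / w.sum()
        else:
            # Fallback (assumption): mean of the row's own observed ratings,
            # or the global mean if the row has none.
            own = R[i, observed[i]]
            filled[i, j] = own.mean() if own.size else np.nanmean(R)
    return filled


def rmse(true, pred):
    """Root Mean Squared Error."""
    diff = np.asarray(true, dtype=float) - np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def mae(true, pred):
    """Mean Absolute Error."""
    diff = np.asarray(true, dtype=float) - np.asarray(pred, dtype=float)
    return float(np.mean(np.abs(diff)))


# Hypothetical 0-10 symptom ratings: 5 patients x 4 symptoms (not real data).
truth = np.array([
    [7, 6, 2, 1],
    [8, 7, 3, 1],
    [2, 1, 6, 7],
    [1, 2, 7, 8],
    [5, 5, 5, 4],
], dtype=float)

# Hide a few known ratings so the imputed values can be scored against them.
mask = np.zeros_like(truth, dtype=bool)
mask[0, 1] = mask[2, 3] = mask[4, 0] = True
incomplete = truth.copy()
incomplete[mask] = np.nan

for mode in ("patient", "symptom"):
    est = cf_impute(incomplete, axis=mode)
    print(mode, "RMSE:", rmse(truth[mask], est[mask]),
          "MAE:", mae(truth[mask], est[mask]))
```

Hiding known ratings and scoring only those entries is one common way to evaluate imputation; whether the study used this exact protocol is not stated in the abstract.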