Objective: To design and evaluate an interactive data quality (DQ) characterization tool focused on fitness-for-use completeness measures to support researchers' assessment of a dataset.
Materials and methods: Design requirements were identified through a conceptual framework on DQ, literature review, and interviews. A prototype of the tool was developed from these requirements and further refined by domain experts. The Fitness-for-Use Tool was evaluated through a within-subjects controlled experiment comparing it with a baseline tool (the Intrinsic DQ Tool) that provides information on missing data based on intrinsic DQ measures. The tools were evaluated on task performance and perceived usability.
Results: The Fitness-for-Use Tool allows users to define data completeness by customizing the measures and their thresholds to fit their research task, and it provides a data summary based on the customized definition. Using the Fitness-for-Use Tool, study participants were able to accurately complete the fitness-for-use assessment in less time than when using the Intrinsic DQ Tool. The study participants perceived the Fitness-for-Use Tool as more useful than the Intrinsic DQ Tool in determining the fitness-for-use of a dataset.
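The difference between an intrinsic missing-data measure and a task-specific completeness definition can be illustrated with a minimal sketch. This is not the tool's implementation: the `CompletenessRule` class, the wear-time measure, the thresholds, and the sample data are all hypothetical and chosen only to show how a user-defined threshold changes which records count as complete.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CompletenessRule:
    """Hypothetical user-defined completeness definition (an assumption)."""
    min_hours_per_day: float  # a day is valid at or above this wear time
    min_valid_days: int       # a participant is usable at or above this count


def intrinsic_missing_rate(hours: List[Optional[float]]) -> float:
    """Task-independent measure: share of days with no recorded value."""
    return sum(h is None for h in hours) / len(hours)


def valid_days(hours: List[Optional[float]], rule: CompletenessRule) -> int:
    """Count days that are present and meet the wear-time threshold."""
    return sum(h is not None and h >= rule.min_hours_per_day for h in hours)


def fit_for_use(data: Dict[str, List[Optional[float]]],
                rule: CompletenessRule) -> List[str]:
    """Return participants whose data satisfy the user-defined rule."""
    return [pid for pid, hours in data.items()
            if valid_days(hours, rule) >= rule.min_valid_days]


# Invented daily wear hours for one week; None marks a day with no data.
wear_hours = {
    "p1": [12.0, 11.5, None, 10.0, 13.0, 9.0, 12.5],
    "p2": [3.0, 2.5, 4.0, 1.0, 2.0, 3.5, 2.0],
    "p3": [None, None, 14.0, 12.0, None, 11.0, 10.5],
}

rule = CompletenessRule(min_hours_per_day=10.0, min_valid_days=4)
usable = fit_for_use(wear_hours, rule)
```

In this invented data, participant `p2` has no missing days and so looks complete under the intrinsic measure, yet has no day that meets the wear-time threshold. Participant `p3` has three missing days but still satisfies the rule. Tightening `min_valid_days` to 5 leaves only `p1`.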
Discussion: Incorporating fitness-for-use measures in a DQ characterization tool could provide a data summary that meets researchers' needs. The design features identified in this study have the potential to be applied to other biomedical data types.
Conclusion: A tool that summarizes a dataset in terms of fitness-for-use dimensions and measures specific to a research question supports dataset assessment better than a tool that only presents information on intrinsic DQ measures.
Keywords: data quality; fitness trackers; patient-generated health data; usability testing; user-centered design.
© The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: [email protected].