The incremental design of a machine learning framework for medical records processing

J Am Med Inform Assoc. 2024 Oct 1;31(10):2236-2245. doi: 10.1093/jamia/ocae194.

Abstract

Objectives: This work presents the development and evaluation of coordn8, a web-based application that streamlines fax processing in outpatient clinics using a "human-in-the-loop" machine learning framework. We demonstrate the effectiveness of the platform at reducing fax processing time and producing accurate machine learning inferences across the tasks of patient identification, document classification, spam classification, and duplicate document detection.

Methods: We deployed coordn8 in 11 outpatient clinics and conducted a time savings analysis by observing users and analyzing fax processing event logs. We used statistical methods to evaluate the machine learning components across different datasets to show generalizability. We conducted a time series analysis to show variations in model performance as new clinics were onboarded and to demonstrate our approach to mitigating model drift.

Results: Our observation analysis showed a mean reduction in individual fax processing time of 147.5 s, and our event log analysis of over 7000 faxes reinforced this finding. Document classification achieved an accuracy of 81.6%, patient identification an accuracy of 83.7%, spam classification an accuracy of 98.4%, and duplicate document detection a precision of 81.0%. Retraining document classification increased accuracy by 10.2%.
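
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how accuracy and precision are conventionally computed from reference labels and model predictions. It is not the authors' evaluation code; the function names and the toy labels are assumptions made for illustration only.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that equal the reference label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
              positive: Hashable = True) -> float:
    """True positives divided by all positive predictions: TP / (TP + FP)."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    # Reference labels for every item the model flagged as positive.
    flagged = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not flagged:
        raise ValueError("no positive predictions; precision is undefined")
    true_positives = sum(t == positive for t in flagged)
    return true_positives / len(flagged)


# Toy data (hypothetical): document-type labels and duplicate flags.
doc_acc = accuracy(["referral", "lab", "referral", "rx"],
                   ["referral", "lab", "rx", "rx"])
dup_prec = precision([True, False, True, False],
                     [True, True, True, False])
```

Precision is a natural choice for a task such as duplicate detection when the main cost is a false alarm: it measures how often a flagged item is truly positive and ignores positives the model missed.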

Discussion: coordn8 significantly decreased fax-processing time and produced accurate machine learning inferences. Our human-in-the-loop framework facilitated the collection of high-quality data necessary for model training. Expanding to new clinics correlated with performance decline, which was mitigated through model retraining.
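
One way to picture drift monitoring in a human-in-the-loop setting is a rolling accuracy computed over time-ordered user confirmations, with retraining considered when it falls below a threshold. The sketch below is an illustrative assumption, not the method described in the paper; the function name, window size, and threshold are hypothetical.

```python
from collections import deque
from typing import Iterable, List


def drift_alerts(outcomes: Iterable[bool], window: int = 100,
                 threshold: float = 0.8) -> List[int]:
    """Return the indices at which rolling accuracy drops below `threshold`.

    `outcomes` is a time-ordered stream where True means a user accepted
    the model's inference and False means the user corrected it
    (an assumed proxy for correctness).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent = deque(maxlen=window)  # the last `window` outcomes as 0/1
    correct = 0                    # running sum of `recent`
    alerts = []
    for i, ok in enumerate(outcomes):
        if len(recent) == window:
            correct -= recent[0]   # value about to be evicted by append
        recent.append(1 if ok else 0)
        correct += recent[-1]
        # Only evaluate once a full window of outcomes is available.
        if len(recent) == window and correct / window < threshold:
            alerts.append(i)
    return alerts


# Hypothetical stream: four accepted inferences, then two corrections.
alerts = drift_alerts([True] * 4 + [False] * 2, window=4, threshold=0.75)
```

In this toy stream the rolling accuracy is 1.0, then 0.75, then 0.5, so only the last event (index 5) falls strictly below the 0.75 threshold.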

Conclusion: Our framework for automating clinical tasks with machine learning offers a template for health systems looking to implement similar technologies.

Keywords: Natural Language Processing; clinical informatics; healthcare innovation; machine learning; medical records; practice management.

MeSH terms

  • Ambulatory Care Facilities
  • Electronic Health Records*
  • Humans
  • Machine Learning*