Machine learning to predict notes for chart review in the oncology setting: a proof of concept strategy for improving clinician note-writing

J Am Med Inform Assoc. 2024 Jun 20;31(7):1578-1582. doi: 10.1093/jamia/ocae092.

Abstract

Objective: Leverage electronic health record (EHR) audit logs to develop a machine learning (ML) model that predicts which notes a clinician wants to review when seeing oncology patients.

Materials and methods: We trained logistic regression models using note metadata and a term frequency-inverse document frequency (TF-IDF) text representation. We evaluated performance with precision, recall, F1 score, area under the curve (AUC), and a qualitative clinical assessment.
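
As a rough illustration of the feature setup described above, the sketch below computes TF-IDF weights in plain Python, merges them with note metadata, and scores each note with a logistic function. It is a minimal sketch under stated assumptions, not the study's implementation: the example notes, the metadata fields (`days_since_written`, `same_author`), and the weights are all hypothetical, and in practice the weights would be learned from audit-log labels rather than set by hand.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

def logistic_score(features, weights, bias=0.0):
    """Logistic regression prediction: sigmoid of the weighted feature sum."""
    z = bias + sum(weights.get(name, 0.0) * value
                   for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical notes and metadata fields, for illustration only.
notes = [
    {"text": "oncology progress note chemotherapy cycle",
     "days_since_written": 3, "same_author": 1},
    {"text": "radiology report chest ct",
     "days_since_written": 40, "same_author": 0},
    {"text": "nursing note vitals stable",
     "days_since_written": 1, "same_author": 0},
]

tfidf = tfidf_vectors([note["text"] for note in notes])

# Combine metadata and text features into one sparse feature dict per note.
feature_rows = []
for note, text_vec in zip(notes, tfidf):
    row = {
        "meta:days_since_written": note["days_since_written"],
        "meta:same_author": note["same_author"],
    }
    row.update({f"text:{term}": w for term, w in text_vec.items()})
    feature_rows.append(row)

# Hand-set weights (an assumption); a trained model would learn these.
weights = {
    "meta:days_since_written": -0.05,
    "meta:same_author": 1.0,
    "text:chemotherapy": 2.0,
}
scores = [logistic_score(row, weights) for row in feature_rows]
```

Dropping the `text:` entries from each row gives the metadata-only variant, which makes it easy to compare the two feature sets on the same notes.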

Results: The metadata-only model achieved an AUC of 0.930, and the model combining metadata with TF-IDF achieved an AUC of 0.937. Qualitative assessment revealed a need for better text representation and for further customization of predictions to the user.
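
To make the reported metrics concrete, the following sketch computes precision, recall, F1, and AUC on a small made-up example. It assumes that AUC refers to the area under the ROC curve, which the abstract does not state explicitly, and computes it as the fraction of (relevant, non-relevant) note pairs in which the relevant note receives the higher score. The labels, scores, and the 0.5 threshold are hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = note was reviewed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def roc_auc(y_true, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and model scores for six notes.
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]

# Threshold of 0.5 is an assumption chosen only for this example.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, y_score)
```

The pairwise loop is quadratic in the number of notes, which is fine for illustration; a rank-based computation would be preferable on large evaluation sets.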

Discussion: Our model effectively surfaces the top 10 notes a clinician wants to review when seeing an oncology patient. Further studies can characterize different types of clinician users and better tailor the task for different care settings.
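
Surfacing the top 10 notes amounts to ranking a patient's notes by predicted relevance and keeping the highest-scoring ones. The sketch below shows one way this could look, with a simple precision-at-k check against the notes a clinician actually opened. The note identifiers, scores, and the set of opened notes are hypothetical, and the helper names are illustrative rather than taken from the study.

```python
import heapq

def top_k_notes(scored_notes, k=10):
    """Return the k (note_id, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scored_notes, key=lambda item: item[1])

def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k ranked notes that the clinician actually reviewed."""
    top = ranked_ids[:k]
    if not top:
        return 0.0
    return sum(1 for note_id in top if note_id in relevant_ids) / len(top)

# Hypothetical model scores for twelve notes belonging to one patient.
raw_scores = [0.05, 0.91, 0.33, 0.78, 0.12, 0.64,
              0.49, 0.88, 0.27, 0.56, 0.71, 0.19]
scored_notes = [(f"note-{i}", s) for i, s in enumerate(raw_scores)]

top10 = top_k_notes(scored_notes, k=10)
top10_ids = [note_id for note_id, _ in top10]

# Hypothetical audit-log ground truth: notes the clinician opened.
opened = {"note-1", "note-7", "note-4"}
p_at_10 = precision_at_k(top10_ids, opened, k=10)
```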

Conclusion: EHR audit logs can provide important relevance data for training ML models that assist with note-writing in the oncology setting.

Keywords: electronic health record; machine learning; natural language processing; note writing.

MeSH terms

  • Electronic Health Records*
  • Humans
  • Logistic Models
  • Machine Learning*
  • Medical Audit
  • Medical Oncology*
  • Metadata
  • Proof of Concept Study