Auto-selection of DRG codes from discharge summaries by text mining in several hospitals: analysis of difference of discharge summaries

Stud Health Technol Inform. 2010;160(Pt 2):1020-4.

Abstract

Electronic medical record (EMR) systems have recently become popular in Japan, and a number of discharge summaries are now stored electronically, though they have not yet been reused. We performed text mining with the tf-idf method and morphological analysis on discharge summaries from three hospitals (Chiba University Hospital, St. Luke's International Hospital and Saga University Hospital). The style of the summaries differed between hospitals, while the rate of properly classified DPC (Diagnosis Procedure Combination) codes was almost the same. Despite the different styles of the discharge summaries, the text mining method could extract the proper DPC codes. Improvement was observed when model data integrated across the hospitals were used. This suggests that a large database containing data from many hospitals can improve the precision of text mining.
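The abstract does not describe the implementation, so the following is only a minimal sketch of how tf-idf weighting might be used to pick a code for a summary. It assumes summaries are already tokenized (in practice a Japanese morphological analyzer would produce the tokens), and it scores each code by summing the tf-idf weights of matching terms, which is an illustrative choice and not necessarily the authors' classifier. The function names, the tokens, and the labels `CODE_A` and `CODE_B` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights


def build_profiles(docs, codes):
    """Sum the tf-idf weights of all training summaries sharing a code."""
    profiles = defaultdict(lambda: defaultdict(float))
    for doc_weights, code in zip(tf_idf(docs), codes):
        for term, weight in doc_weights.items():
            profiles[code][term] += weight
    return profiles


def predict(profiles, tokens):
    """Pick the code whose profile gives the highest summed weight."""
    scores = {
        code: sum(profile.get(term, 0.0) for term in tokens)
        for code, profile in profiles.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data, not taken from the study.
train_docs = [
    ["pneumonia", "fever", "antibiotics"],
    ["pneumonia", "cough", "fever"],
    ["fracture", "femur", "surgery"],
    ["fracture", "surgery", "cast"],
]
train_codes = ["CODE_A", "CODE_A", "CODE_B", "CODE_B"]
profiles = build_profiles(train_docs, train_codes)
```

Under these assumptions, pooling summaries from several hospitals simply means passing a larger `train_docs` list to `build_profiles`, which is one way to picture the integrated model data mentioned in the abstract.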

MeSH terms

  • Data Mining / methods*
  • Databases, Factual
  • Diagnosis-Related Groups
  • Electronic Health Records
  • Hospitals
  • Humans
  • Patient Discharge*