Automated Classification of Pathology Reports

Stud Health Technol Inform. 2015:216:1040.

Abstract

This work develops an automated classifier of pathology reports that infers the topography and morphology classes of a tumor using codes from the International Classification of Diseases for Oncology (ICD-O). Data from 94,980 patients of the A.C. Camargo Cancer Center were used to train and validate Naive Bayes classifiers, which were evaluated by the F1-score. F1-scores greater than 74% in the topographic group and 61% in the morphologic group are reported. Our work provides a successful baseline for future research on the classification of medical documents written in Portuguese and in other domains.
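The approach summarized above can be illustrated with a minimal sketch of a Naive Bayes text classifier scored by F1. This is not the authors' implementation: the abstract does not state the Naive Bayes variant, the text preprocessing, or how F1 was averaged across classes. The sketch assumes a multinomial model with add-one smoothing, whitespace tokenization, and a per-class F1. The function names and the four toy reports are invented for illustration; only the use of ICD-O topography codes as labels (here C50 for breast and C61 for prostate) follows the abstract.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; the study's preprocessing is not given.
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy reports (not study data), labeled with ICD-O topography codes.
train_docs = [
    "carcinoma ductal invasivo de mama",
    "mama direita com carcinoma lobular",
    "adenocarcinoma acinar de prostata",
    "biopsia de prostata com adenocarcinoma",
]
train_labels = ["C50", "C50", "C61", "C61"]
model = train_naive_bayes(train_docs, train_labels)

test_docs = ["carcinoma de mama esquerda", "adenocarcinoma de prostata"]
test_labels = ["C50", "C61"]
predictions = [predict(model, doc) for doc in test_docs]
for code in ("C50", "C61"):
    print(code, f1_score(test_labels, predictions, code))
```

A morphology classifier could be sketched the same way by swapping the labels for ICD-O morphology codes. With many classes, per-class F1 values would need to be averaged, and the abstract does not say which averaging scheme was used.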

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Brazil / epidemiology
  • Data Mining / methods*
  • Decision Support Systems, Clinical
  • Diagnosis, Computer-Assisted / methods*
  • International Classification of Diseases
  • Natural Language Processing*
  • Neoplasms / diagnosis*
  • Neoplasms / epidemiology
  • Neoplasms / pathology*
  • Pathology / classification*
  • Pattern Recognition, Automated / methods
  • Prevalence
  • Reproducibility of Results
  • Sensitivity and Specificity