Identifying free-text features to improve automated classification of structured histopathology reports for feline small intestinal disease

J Vet Diagn Invest. 2018 Mar;30(2):211-217. doi: 10.1177/1040638717744002. Epub 2017 Nov 30.

Abstract

The histologic evaluation of gastrointestinal (GI) biopsies is the standard for diagnosis of a variety of GI diseases (e.g., inflammatory bowel disease [IBD] and alimentary lymphoma [ALA]). The World Small Animal Veterinary Association (WSAVA) Gastrointestinal International Standardization Group proposed a reporting standard for GI biopsies consisting of a defined set of microscopic features. We compared the machine classification accuracy of free-text microscopic findings with that of findings represented in the WSAVA format for diagnoses of IBD and ALA. Unstructured free-text duodenal biopsy pathology reports from cats (n = 60) with a diagnosis of IBD (n = 20), ALA (n = 20), or normal (n = 20) were identified. Biopsy samples from these cases were then scored following the WSAVA guidelines to create a set of structured reports. Three supervised machine-learning algorithms were trained using the structured and then the unstructured reports. Diagnosis classification accuracy for the 3 algorithms was compared using the structured and unstructured reports. Using naive Bayes and neural networks, unstructured information-based models achieved higher diagnostic accuracy (0.90 and 0.88, respectively) than the structured information-based models (0.74 and 0.72, respectively). Results suggest that discriminating diagnostic information was lost using current WSAVA microscopic guideline features. Addition of free-text features (number of plasma cells) increased WSAVA auto-classification performance. The methodologies reported in our study represent a way of identifying candidate microscopic features for use in structured histopathology reports.

Keywords: Histopathology report; machine learning; structured report; text mining.
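
To make the free-text classification step in the abstract concrete, the following is a minimal sketch of a bag-of-words multinomial naive Bayes classifier with add-one smoothing, written with the Python standard library only. It is not the authors' implementation: the abstract does not state which software, tokenization, or feature weighting was used. The report snippets in TOY_REPORTS are invented for illustration and are not data from the study, and the names tokenize, train_nb, predict_nb, and accuracy are hypothetical helpers.

```python
import math
from collections import Counter, defaultdict

# Invented toy snippets (NOT study data), labeled with the three classes
# named in the abstract: ALA, IBD, normal.
TOY_REPORTS = [
    ("monomorphic small lymphocytes efface the mucosa", "ALA"),
    ("epitheliotropic monomorphic lymphocytes infiltrate the villi", "ALA"),
    ("lymphocytes and plasma cells expand the lamina propria", "IBD"),
    ("moderate plasma cells and lymphocytes in the lamina propria", "IBD"),
    ("villi and crypts are within normal limits", "normal"),
    ("mucosa is within normal limits", "normal"),
]


def tokenize(text):
    """Lowercase whitespace tokenization (an assumed, simplistic choice)."""
    return text.lower().split()


def train_nb(labeled_docs):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        tokens = tokenize(text)
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior plus smoothed log likelihood of each known token.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            smoothed = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, labeled_docs):
    """Fraction of documents whose predicted label matches the given label."""
    hits = sum(predict_nb(model, text) == label for text, label in labeled_docs)
    return hits / len(labeled_docs)


model = train_nb(TOY_REPORTS)
print(predict_nb(model, "plasma cells in the lamina propria"))
```

The same functions could be applied to structured reports by encoding each WSAVA feature score as a token (an assumed encoding, for example a feature name joined to its score), which would allow the two representations to be compared with the accuracy helper on held-out reports. Accuracy measured on a handful of toy strings says nothing about performance on real pathology reports.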

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms
  • Animals
  • Bayes Theorem
  • Biopsy / veterinary
  • Cat Diseases / diagnosis*
  • Cat Diseases / pathology
  • Cats
  • Diagnostic Techniques and Procedures / veterinary
  • Duodenum / pathology
  • Female
  • Gastrointestinal Neoplasms / diagnosis
  • Gastrointestinal Neoplasms / veterinary*
  • Inflammatory Bowel Diseases / diagnosis
  • Inflammatory Bowel Diseases / veterinary
  • Lymphoma / diagnosis
  • Lymphoma / veterinary
  • Machine Learning
  • Male
  • Neural Networks, Computer