Utility of Machine Learning to Detect Cytomegalovirus in Digital Hematoxylin and Eosin-Stained Slides

Lab Invest. 2023 Oct;103(10):100225. doi: 10.1016/j.labinv.2023.100225. Epub 2023 Jul 30.

Abstract

Rapid and accurate cytomegalovirus (CMV) identification in immunosuppressed or immunocompromised patients presenting with diarrhea is essential for therapeutic management. Due to viral latency, however, the gold standard for CMV diagnosis remains the identification of viral cytopathic inclusions on routine hematoxylin and eosin (H&E)-stained tissue sections. Therefore, biopsies may be taken and "rushed" for pathology evaluation. Here, we propose the use of artificial intelligence to detect CMV inclusions on routine H&E-stained whole-slide images to aid pathologists in evaluating these cases. Fifty-eight representative H&E slides from 30 cases with CMV inclusions were identified and scanned. The resulting whole-slide images were manually annotated for CMV inclusions and tiled into 300 × 300 pixel patches. Patches containing annotations were labeled "positive," and these tiles were oversampled with image augmentation to account for class imbalance. The remaining patches were labeled "negative." Data were then divided into training, validation, and holdout sets. Multiple deep learning models were provided with training data, and their performance was analyzed. All tested models showed excellent performance. The highest performance was seen using the EfficientNetV2B0 model, which had a test (holdout) accuracy of 99.93%, precision of 100.0%, recall (sensitivity) of 99.85%, and area under the curve of 0.9998. Of 518,941 images in the holdout set, there were only 346 false negatives and 2 false positives. This shows proof of concept for the use of digital tools to assist pathologists in screening "rush" biopsies for CMV infection. Given the high precision, cases screened as "positive" can be quickly confirmed by a pathologist, reducing missed CMV inclusions and improving the confidence of preliminary results. Additionally, this may reduce the need for immunohistochemistry in limited tissue samples, reducing associated costs and turnaround time.
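
The tiling and labeling step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes annotations are available as axis-aligned bounding boxes in pixel coordinates, that tiles do not overlap, and that partial tiles at the right and bottom edges are discarded; none of these details are stated in the abstract. The function names (`tile_origins`, `overlaps`, `label_tiles`) and the example coordinates are hypothetical. Oversampling and image augmentation are not shown.

```python
TILE = 300  # patch edge length in pixels, as stated in the abstract


def tile_origins(width, height, tile=TILE):
    """Top-left corners of non-overlapping tiles, row by row.

    Assumption: partial tiles at the right and bottom edges are dropped.
    """
    return [
        (left, top)
        for top in range(0, height - tile + 1, tile)
        for left in range(0, width - tile + 1, tile)
    ]


def overlaps(left, top, tile, box):
    """True if the tile intersects an annotation box (x0, y0, x1, y1).

    Boxes are treated as half-open: x0 <= x < x1 and y0 <= y < y1.
    """
    x0, y0, x1, y1 = box
    return x0 < left + tile and x1 > left and y0 < top + tile and y1 > top


def label_tiles(width, height, boxes, tile=TILE):
    """Label each tile 1 ("positive") if it touches any annotation, else 0."""
    return [
        ((left, top), int(any(overlaps(left, top, tile, b) for b in boxes)))
        for left, top in tile_origins(width, height, tile)
    ]


# Toy example: a 900 x 600 pixel region with one hypothetical annotation.
labeled = label_tiles(900, 600, [(610, 310, 640, 330)])
for origin, label in labeled:
    print(origin, "positive" if label else "negative")
```

In this toy region, only the tile with origin (600, 300) contains the annotation, so it is the only one labeled "positive"; the other five are "negative."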

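As a consistency check, the reported holdout accuracy can be recomputed from the error counts given in the abstract (518,941 images, 346 false negatives, 2 false positives). The Python sketch below does this, with hypothetical function names. The abstract does not report the number of true positives, so precision and recall cannot be recomputed exactly from it; the counts passed to `precision` and `recall` here are toy values chosen only to show the definitions.

```python
def accuracy_from_errors(total, false_negatives, false_positives):
    """Accuracy = correctly classified images / all images."""
    return (total - false_negatives - false_positives) / total


def precision(true_positives, false_positives):
    """Fraction of predicted positives that are truly positive."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Fraction of true positives that were detected (sensitivity)."""
    return true_positives / (true_positives + false_negatives)


# Counts taken from the abstract.
holdout_accuracy = accuracy_from_errors(518_941, 346, 2)
print(f"holdout accuracy: {holdout_accuracy:.2%}")  # 99.93%

# Toy counts (not from the study) to illustrate the other two metrics.
print(precision(true_positives=90, false_positives=10))  # 0.9
print(recall(true_positives=90, false_negatives=30))     # 0.75
```

The recomputed accuracy rounds to 99.93%, which agrees with the value stated in the abstract.
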
Keywords: artificial intelligence; cytomegalovirus; digital pathology; gastrointestinal pathology; machine learning.

MeSH terms

  • Artificial Intelligence
  • Cytomegalovirus Infections* / diagnosis
  • Cytomegalovirus Infections* / pathology
  • Cytomegalovirus*
  • Eosine Yellowish-(YS)
  • Hematoxylin
  • Humans
  • Machine Learning

Substances

  • Hematoxylin
  • Eosine Yellowish-(YS)