Owing to their large size and the lack of fine-grained annotations, whole slide image (WSI) analysis is commonly framed as a Multiple Instance Learning (MIL) problem. However, previous methods learn only from the training data, in stark contrast to how human clinicians teach one another and reason about histopathologic entities and factors. Here, we present ConcepPath, a novel knowledge concept-based MIL framework that fills this gap. Specifically, ConcepPath uses GPT-4 to induce reliable, disease-specific human expert concepts from the medical literature and combines them with a group of purely learnable concepts that extract complementary knowledge from the training data. In ConcepPath, WSIs are aligned to these linguistic knowledge concepts using a pathology vision-language model as the basic building block. On lung cancer subtyping, breast cancer HER2 scoring, and gastric cancer immunotherapy-sensitive subtyping tasks, ConcepPath significantly outperformed previous state-of-the-art methods, which lack the guidance of human expert knowledge.
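For illustration only, the sketch below shows one way patch features could be aligned to expert-derived and learnable concept embeddings in a MIL setting, assuming both come from a shared vision-language embedding space. The class name, dimensions, and the similarity-based pooling are assumptions made for this example and are not taken from the paper's implementation.

```python
# Hypothetical sketch of concept-guided MIL aggregation (not the authors' code).
# Assumes patch embeddings and concept-prompt embeddings were already produced
# by a shared pathology vision-language model (e.g., a CLIP-style encoder).

import torch
import torch.nn as nn
import torch.nn.functional as F


class ConceptGuidedMIL(nn.Module):
    """Pools patch features into a slide-level prediction via similarity
    to expert-derived and learnable concept embeddings (illustrative only)."""

    def __init__(self, embed_dim: int, n_expert_concepts: int,
                 n_learnable_concepts: int, n_classes: int):
        super().__init__()
        # Purely learnable (data-driven) concepts complementing the expert concepts.
        self.learnable_concepts = nn.Parameter(
            torch.randn(n_learnable_concepts, embed_dim) * 0.02)
        n_concepts = n_expert_concepts + n_learnable_concepts
        # Slide-level classifier over the concatenated concept-pooled features.
        self.classifier = nn.Linear(n_concepts * embed_dim, n_classes)

    def forward(self, patch_feats: torch.Tensor,
                expert_concept_feats: torch.Tensor) -> torch.Tensor:
        # patch_feats: (n_patches, embed_dim) from the VLM image encoder.
        # expert_concept_feats: (n_expert_concepts, embed_dim) from the VLM text encoder.
        concepts = torch.cat([expert_concept_feats, self.learnable_concepts], dim=0)
        patches_n = F.normalize(patch_feats, dim=-1)
        concepts_n = F.normalize(concepts, dim=-1)
        # Patch-to-concept cosine similarity used as attention over patches.
        sim = patches_n @ concepts_n.t()          # (n_patches, n_concepts)
        attn = sim.softmax(dim=0)                 # normalize over patches per concept
        # One pooled slide feature per concept.
        pooled = attn.t() @ patch_feats           # (n_concepts, embed_dim)
        return self.classifier(pooled.flatten())  # (n_classes,)


# Toy usage with random stand-ins for the VLM embeddings.
model = ConceptGuidedMIL(embed_dim=512, n_expert_concepts=8,
                         n_learnable_concepts=4, n_classes=3)
logits = model(torch.randn(1000, 512), torch.randn(8, 512))
print(logits.shape)  # torch.Size([3])
```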