Self-supervised deep learning for highly efficient spatial immunophenotyping

Hanyun Zhang; Khalid AbdulJabbar; Tami Grunewald; Ayse U Akarca; Yeman Hagos; Faranak Sobhani; Catherine S Y Lecat; Dominic Patel; Lydia Lee; Manuel Rodriguez-Justo; Kwee Yong; Jonathan A Ledermann; John Le Quesne; E Shelley Hwang; Teresa Marafioti; Yinyin Yuan

doi:10.1016/j.ebiom.2023.104769

EBioMedicine. 2023 Sep:95:104769. doi: 10.1016/j.ebiom.2023.104769. Epub 2023 Sep 4.

Authors

Hanyun Zhang¹, Khalid AbdulJabbar¹, Tami Grunewald², Ayse U Akarca³, Yeman Hagos¹, Faranak Sobhani¹, Catherine S Y Lecat⁴, Dominic Patel⁴, Lydia Lee⁴, Manuel Rodriguez-Justo⁴, Kwee Yong⁴, Jonathan A Ledermann², John Le Quesne⁵, E Shelley Hwang⁶, Teresa Marafioti³, Yinyin Yuan⁷

Affiliations

¹ Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK; Division of Molecular Pathology, The Institute of Cancer Research, London, UK.
² Department of Oncology, UCL Cancer Institute, University College London, London, UK.
³ Department of Cellular Pathology, University College London Hospital, London, UK.
⁴ Research Department of Hematology, Cancer Institute, University College London, UK.
⁵ School of Cancer Sciences, University of Glasgow, Glasgow, UK; CRUK Beatson Institute, Garscube Estate, Glasgow, UK; Department of Histopathology, Queen Elizabeth University Hospital, Glasgow, UK.
⁶ Department of Surgery, Duke University Medical Center, Durham, NC, USA.
⁷ Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK; Division of Molecular Pathology, The Institute of Cancer Research, London, UK.

Abstract

Background: Efficient biomarker discovery and clinical translation depend on fast, accurate analytical output from key technologies such as multiplex imaging. However, reliable cell classification often requires extensive annotations. Label-efficient strategies are urgently needed to reveal diverse cell distributions and spatial interactions in large-scale multiplex datasets.

Methods: This study proposed Self-supervised Learning for Antigen Detection (SANDI) for accurate cell phenotyping while mitigating the annotation burden. The model first learns intrinsic pairwise similarities in unlabelled cell images, then applies a classification step that maps the learnt features to cell labels using a small set of annotated references. To train and test the model, we acquired four multiplex immunohistochemistry datasets and one imaging mass cytometry dataset, each comprising 2825 to 15,258 single-cell images.
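The second stage described above, mapping learnt features to labels through a small annotated reference set, can be illustrated with a minimal sketch. It assumes that each cell has already been converted to a feature vector by some self-supervised encoder, and that labels are assigned by nearest reference under cosine similarity. Both are assumptions for illustration: the abstract does not specify the classifier, and the names `cosine_similarity`, `classify_by_reference`, and the labels `type_A` to `type_C` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def classify_by_reference(embedding, references):
    """Return (label, similarity) of the most similar annotated reference.

    `references` is a list of (label, embedding) pairs standing in for the
    small annotated set; the nearest-reference rule is an assumption.
    """
    best_label, best_score = None, -math.inf
    for label, ref_embedding in references:
        score = cosine_similarity(embedding, ref_embedding)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Toy, made-up embeddings: three annotated references, two unlabelled cells.
references = [
    ("type_A", [1.0, 0.0, 0.1]),
    ("type_B", [0.0, 1.0, 0.1]),
    ("type_C", [0.1, 0.1, 1.0]),
]
unlabelled = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.9]]

predictions = [classify_by_reference(e, references)[0] for e in unlabelled]
print(predictions)
```

In this sketch the annotation cost scales with the number of references rather than with the number of cells, which is the label-efficiency argument the Methods paragraph makes.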

Findings: With 1% annotations (18-114 cells), SANDI achieved weighted F1-scores ranging from 0.82 to 0.98 across the five datasets, comparable to a fully supervised classifier trained on 1828-11,459 annotated cells (difference in averaged weighted F1-score of -0.002 to -0.053; Wilcoxon rank-sum test, P = 0.31). Leveraging the immune checkpoint markers stained in ovarian cancer slides, SANDI-based cell identification revealed spatial expulsion between PD1-expressing T helper cells and T regulatory cells, suggesting an interplay between PD1 expression and T regulatory cell-mediated immunosuppression.
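For readers unfamiliar with the metric reported above, the weighted F1-score averages the per-class F1 scores, weighting each class by its share of the true labels. The following is a minimal sketch using the standard definition; the function name `weighted_f1` and the toy labels are illustrative and are not taken from the study.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by class frequency
    return score


# Toy example: class A has F1 = 0.8 (support 3), class B has F1 = 2/3 (support 1).
y_true = ["A", "A", "A", "B"]
y_pred = ["A", "A", "B", "B"]
print(round(weighted_f1(y_true, y_pred), 4))  # 0.75 * 0.8 + 0.25 * 2/3 ≈ 0.7667
```

Because the weights follow class frequency, a weighted F1-score is dominated by the abundant cell types, so a rare phenotype can be classified poorly without moving the score much.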

Interpretation: By striking a fine balance between minimal expert guidance and the power of deep learning to learn similarity within abundant data, SANDI presents new opportunities for efficient, large-scale learning for histology multiplex imaging data.

Funding: This study was funded by the Royal Marsden/ICR National Institute of Health Research Biomedical Research Centre.

Keywords: Cell classification; Deep learning; Imaging mass cytometry; Multiplex imaging; Multiplex immunohistochemistry; Self-supervised learning.

MeSH terms

Biomedical Research*
Deep Learning*
Female
Humans
Immunophenotyping
Immunosuppression Therapy
Ovarian Neoplasms*
