Cytometry masked autoencoder: An accurate and interpretable automated immunophenotyper

Jaesik Kim; Matei Ionita; Matthew Lee; Michelle L McKeague; Ajinkya Pattekar; Mark M Painter; Joost Wagenaar; Van Truong; Dylan T Norton; Divij Mathew; Yonghyun Nam; Sokratis A Apostolidis; Cynthia Clendenin; Patryk Orzechowski; Sang-Hyuk Jung; Jakob Woerner; Caroline A G Ittner; Alexandra P Turner; Mika Esperanza; Thomas G Dunn; Nilam S Mangalmurti; John P Reilly; Nuala J Meyer; Carolyn S Calfee; Kathleen D Liu; Michael A Matthy; Lamorna Brown Swigart; Ellen L Burnham; Jeffrey McKeehan; Sheetal Gandotra; Derek W Russel; Kevin W Gibbs; Karl W Thomas; Harsh Barot; Allison R Greenplate; E John Wherry; Dokyoon Kim

doi:10.1016/j.xcrm.2024.101808

Cell Rep Med. 2024 Nov 4:101808. doi: 10.1016/j.xcrm.2024.101808. Online ahead of print.

Authors

Jaesik Kim¹, Matei Ionita², Matthew Lee³, Michelle L McKeague², Ajinkya Pattekar², Mark M Painter², Joost Wagenaar⁴, Van Truong³, Dylan T Norton², Divij Mathew², Yonghyun Nam³, Sokratis A Apostolidis⁵, Cynthia Clendenin⁶, Patryk Orzechowski⁷, Sang-Hyuk Jung³, Jakob Woerner³, Caroline A G Ittner⁸, Alexandra P Turner⁸, Mika Esperanza⁸, Thomas G Dunn⁹, Nilam S Mangalmurti¹⁰, John P Reilly¹⁰, Nuala J Meyer⁸, Carolyn S Calfee¹¹, Kathleen D Liu¹², Michael A Matthy¹³, Lamorna Brown Swigart¹⁴, Ellen L Burnham¹⁵, Jeffrey McKeehan¹⁵, Sheetal Gandotra¹⁶, Derek W Russel¹⁷, Kevin W Gibbs¹⁸, Karl W Thomas¹⁸, Harsh Barot¹⁹, Allison R Greenplate²⁰, E John Wherry²¹, Dokyoon Kim²²

Affiliations

¹ Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA; Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
² Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
³ Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
⁴ Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
⁵ Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Division of Rheumatology, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
⁶ Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
⁷ Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Automatics and Robotics, AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Kraków, Poland.
⁸ Division of Pulmonary and Critical Care Medicine, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
⁹ Division of Hematology/Oncology, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
¹⁰ Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Division of Pulmonary and Critical Care Medicine, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
¹¹ Department of Anesthesia and Perioperative Care, University of California, San Francisco, School of Medicine, San Francisco, CA 94143, USA; Division of Pulmonary, Critical Care, Allergy, and Sleep Medicine, University of California, San Francisco, School of Medicine, San Francisco, CA 94143, USA; Cardiovascular Research Institute, Department of Medicine, University of California, San Francisco, School of Medicine, San Francisco, CA 94158, USA.
¹² Division of Nephrology and Critical Care Medicine, University of California, San Francisco, School of Medicine, San Francisco, CA 94143, USA.
¹³ Cardiovascular Research Institute, Department of Medicine, University of California, San Francisco, School of Medicine, San Francisco, CA 94158, USA.
¹⁴ Department of Laboratory Medicine, University of California, San Francisco, School of Medicine, San Francisco, CA 94143, USA.
¹⁵ Division of Pulmonary Sciences and Critical Care Medicine, Department of Medicine, University of Colorado School of Medicine, Aurora, CO 80045, USA.
¹⁶ Division of Pulmonary, Allergy and Critical Care Medicine, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA.
¹⁷ Division of Pulmonary, Allergy and Critical Care Medicine, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA; Pulmonary Section, Birmingham Veteran's Affairs Medical Center, Birmingham, AL 35233, USA.
¹⁸ Section on Pulmonary and Critical Care, Allergy, and Immunology, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC 27157, USA.
¹⁹ Section on Hospital Medicine, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC 27157, USA.
²⁰ Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA. Electronic address: [email protected].
²¹ Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Parker Institute for Cancer Immunotherapy, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA. Electronic address: [email protected].
²² Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA; Institute for Immunology & Immune Health (I3H), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA. Electronic address: [email protected].

PMID: 39515318

Abstract

Single-cell cytometry data are crucial for understanding the role of the immune system in diseases and responses to treatment. However, traditional methods for annotating cytometry data face challenges in scalability, robustness, and accuracy. We propose a cytometry masked autoencoder (cyMAE), which automates immunophenotyping tasks including cell type annotation. The model upholds user-defined cell type definitions, facilitating interpretability and cross-study comparisons. The training of cyMAE has a self-supervised phase, which leverages large amounts of unlabeled data, followed by fine-tuning on specialized tasks using smaller amounts of annotated data. The cost of training a new model is amortized over repeated inferences on new datasets using the same panel. Through validation across multiple studies using the same panel, we demonstrate that cyMAE delivers accurate and interpretable cellular immunophenotyping and improves the prediction of subject-level metadata. This proof of concept marks a significant step forward for large-scale immunology studies.
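The self-supervised phase described in the abstract can be illustrated with a minimal sketch of a masked-reconstruction objective: some marker values in each cell are hidden, a model predicts them, and the error is scored only on the hidden entries. Everything specific here is an assumption for illustration and is not taken from the paper: the function names `mask_cells` and `masked_mse`, the 50% mask ratio, the toy data, and the constant-zero stand-in for the model. The actual cyMAE architecture and hyperparameters are not given in this record.

```python
import random


def mask_cells(cells, mask_ratio, rng):
    """Hide a random subset of marker values in every cell.

    Returns (masked_cells, mask), where mask[i][j] is True when marker j
    of cell i was hidden. Hidden values are replaced by 0.0.
    """
    n_markers = len(cells[0])
    n_hidden = max(1, round(mask_ratio * n_markers))
    masked, mask = [], []
    for cell in cells:
        hidden = set(rng.sample(range(n_markers), n_hidden))
        mask.append([j in hidden for j in range(n_markers)])
        masked.append([0.0 if j in hidden else v for j, v in enumerate(cell)])
    return masked, mask


def masked_mse(predicted, target, mask):
    """Mean squared error computed only over the hidden entries."""
    errors = [
        (p - t) ** 2
        for p_row, t_row, m_row in zip(predicted, target, mask)
        for p, t, m in zip(p_row, t_row, m_row)
        if m
    ]
    return sum(errors) / len(errors)


rng = random.Random(0)

# Hypothetical toy data: 4 cells x 6 markers of transformed intensities.
cells = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(4)]

# Assumed mask ratio of 0.5 (illustrative only, not from the paper).
masked_cells, mask = mask_cells(cells, mask_ratio=0.5, rng=rng)

# Stand-in "model" that predicts 0.0 for every marker. In practice an
# encoder-decoder network would take masked_cells as input here.
predicted = [[0.0] * 6 for _ in cells]
loss = masked_mse(predicted, cells, mask)
```

Because the loss needs no cell type labels, this kind of objective can use large unlabeled datasets; a labeled subset would then be used in a separate fine-tuning step, which this sketch does not cover.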

Keywords: automated gating; deep learning; high-dimensional cytometry; immunophenotyping; machine learning; mass cytometry; representation learning.