Motivation: Cytometry comprises powerful techniques for analyzing the cell heterogeneity of a biological sample by examining the expression of protein markers. These technologies impact especially the field of oncoimmunology, where cell identification is essential to analyze the tumor microenvironment. Several classification tools have been developed for the annotation of cytometry datasets, which include supervised tools that require a training set as a reference (i.e. reference-based) and semisupervised tools based on the manual definition of a marker table. The latter is closer to the traditional annotation of cytometry data based on manual gating. However, they require the manual definition of a marker table that cannot be extracted automatically in a reference-based fashion. Therefore, we are lacking methods that allow both classification approaches while maintaining the high biological interpretability given by the marker table.
Results: We present a new tool called GateMeClass (Gate Mining and Classification) which overcomes the limitation of the current methods of classification of cytometry data allowing both semisupervised and supervised annotation based on a marker table that can be defined manually or extracted from an external annotated dataset. We measured the accuracy of GateMeClass for annotating three well-established benchmark mass cytometry datasets and one flow cytometry dataset. The performance of GateMeClass is comparable to reference-based methods and marker table-based techniques, offering greater flexibility and rapid execution times.
Availability and implementation: GateMeClass is implemented in R language and is publicly available at https://github.com/simo1c/GateMeClass.
© The Author(s) 2024. Published by Oxford University Press.