Introduction: A decentralized and multi-platform-compatible molecular diagnostic tool for kidney transplant biopsies could improve the dissemination and exploitation of this technology, increasing its clinical impact. As a first step towards this molecular diagnostic tool, we developed and validated a classifier using the genes of the Banff-Human Organ Transplant (B-HOT) panel extracted from a historical Molecular Microscope® Diagnostic system microarray dataset. Furthermore, we evaluated the discriminative power of the B-HOT panel in a clinical scenario.
Materials and methods: Gene expression data from 1,181 kidney transplant biopsies were used as training data for three random forest models to predict kidney transplant biopsy Banff categories, including non-rejection (NR), antibody-mediated rejection (ABMR), and T-cell-mediated rejection (TCMR). Performance was evaluated using nested cross-validation. The three models used different sets of input features: the first model (B-HOT Model) was trained on only the genes included in the B-HOT panel, the second model (Feature Selection Model) was based on sequential forward feature selection from all available genes, and the third model (B-HOT+ Model) was based on the combination of the two models, i.e. B-HOT panel genes plus highly predictive genes from the sequential forward feature selection. After performance assessment on cross-validation, the best-performing model was validated on an external independent dataset based on a different microarray version.
Results: The best performances were achieved by the B-HOT+ Model, a multilabel random forest model trained on B-HOT panel genes with the addition of the 6 most predictive genes of the Feature Selection Model (ST7, KLRC4-KLRK1, TRBC1, TRBV6-5, TRBV19, and ZFX), with a mean accuracy of 92.1% during cross-validation. On the validation set, the same model achieved Area Under the ROC Curve (AUC) of 0.965 and 0.982 for NR and ABMR respectively.
Discussion: This kidney transplant biopsy classifier is one step closer to the development of a decentralized kidney transplant biopsy classifier that is effective on data derived from different gene expression platforms. The B-HOT panel proved to be a reliable highly-predictive panel for kidney transplant rejection classification. Furthermore, we propose to include the aforementioned 6 genes in the B-HOT panel for further optimization of this commercially available panel.
Keywords: bioinformatics; diagnosis; gene expression; graft rejection; kidney transplantation; machine learning; pathology; transcriptomics.
Copyright © 2022 van Baardwijk, Cristoferi, Ju, Varol, Minnee, Reinders, Li, Stubbs and Clahsen-van Groningen.