A StarGAN and transformer-based hybrid classification-regression model for multi-institution VMAT patient-specific quality assurance

Med Phys. 2024 Nov 1. doi: 10.1002/mp.17485. Online ahead of print.

Abstract

Background: Artificial intelligence (AI)-based patient-specific quality assurance (PSQA) for volumetric modulated arc therapy (VMAT) still lacks general models that work across institutions, because data are collected at many institutions and are heterogeneous in multiple variables. Building a general model that can handle diverse multi-institution data is critical for enabling large-scale integration and analysis.

Purpose: This study aims to develop a star generative adversarial network (StarGAN) and transformer-based hybrid classification-regression PSQA framework that unifies heterogeneous data from different institutions.

Methods: A StarGAN and transformer-based hybrid classification-regression model was developed as a general PSQA framework to predict gamma passing rates (GPRs) and classify quality assurance (QA) results as "Pass" or "Fail" at multiple institutions. A total of 1815 VMAT plans were collected from eight institutions to develop the general PSQA framework and to perform clinical commissioning and implementation. Among them, 20 independent clinical plans from each of the eight institutions (160 plans in total) were used for clinical commissioning, and 205 new clinical plans from the eight institutions were used for clinical implementation.
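As an illustration of what a hybrid classification-regression objective can look like, the sketch below combines a binary cross-entropy term on the "Pass"/"Fail" output with a squared-error term on the GPR output for a single plan. The abstract does not state the loss that was actually used, so the weighted-sum form, the function name `hybrid_loss`, and the `reg_weight` parameter are all assumptions made for this example.

```python
import math


def hybrid_loss(pass_prob, gpr_pred, is_pass, gpr_true, reg_weight=1.0):
    """Joint loss for one plan (hypothetical form, not taken from the paper).

    pass_prob  -- predicted probability of "Pass", between 0 and 1
    gpr_pred   -- predicted gamma passing rate, in percent
    is_pass    -- measured QA label, True for "Pass" and False for "Fail"
    gpr_true   -- measured gamma passing rate, in percent
    reg_weight -- assumed weight balancing the regression term
    """
    eps = 1e-12
    # Clamp the probability so that log() never receives 0.
    p = min(max(pass_prob, eps), 1.0 - eps)
    # Binary cross-entropy for the classification head.
    bce = -math.log(p) if is_pass else -math.log(1.0 - p)
    # Squared error for the regression head.
    squared_error = (gpr_pred - gpr_true) ** 2
    return bce + reg_weight * squared_error
```

In practice the two terms live on different scales (GPR is in percent), so a weight such as `reg_weight` would probably need tuning; that choice is not described in the abstract.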

Results: For the 3%/3, 3%/2, and 2%/2 mm gamma criteria, the sensitivity of the proposed PSQA framework with pretraining was 90.13%, 92.03%, and 95.84%, respectively, while the specificity was 76.01%, 76.12%, and 85.34%, respectively. The mean absolute errors (MAEs) of the proposed PSQA framework with pretraining were 1.36%, 2.37%, and 3.96%, respectively, while the root-mean-square errors (RMSEs) were 2.31%, 3.89%, and 5.17%, respectively. The results demonstrated visible improvement at multiple institutions. For clinical commissioning, the deviations between the predicted and measured results were all within 3% for the 3%/3 mm and 3%/2 mm criteria at all eight institutions. For clinical implementation, all failing plans were correctly identified by the proposed PSQA framework.
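The four reported metrics follow standard definitions, which the sketch below spells out for paired measured and predicted results. The function names are hypothetical, and because the abstract does not say whether "Pass" or "Fail" is treated as the positive class, the caller must pass that choice explicitly.

```python
import math


def classification_rates(true_labels, pred_labels, positive):
    """Return (sensitivity, specificity) for "Pass"/"Fail" labels.

    `positive` names the class counted as positive; this is the
    caller's assumption. Both classes must occur in `true_labels`.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, pred_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn)  # share of positives that were found
    specificity = tn / (tn + fp)  # share of negatives that were found
    return sensitivity, specificity


def regression_errors(measured_gpr, predicted_gpr):
    """Return (MAE, RMSE) between measured and predicted GPRs, in percent."""
    diffs = [p - m for m, p in zip(measured_gpr, predicted_gpr)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse
```

For example, `regression_errors([98.0, 95.0], [97.0, 97.0])` gives an MAE of 1.5; the RMSE is always at least as large as the MAE, which matches the pattern in the values reported above.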

Conclusions: The general PSQA framework handles diverse clinical data sources with enhanced model performance and generalizability, and offers a way to unify heterogeneous data from different institutions when constructing robust QA models. This approach can be clinically deployed for VMAT QA.

Keywords: VMAT patient‐specific QA; clinical implementation; hybrid classification‐regression model; multi‐institution modeling.