Objective: Homologous recombination deficiency (HRD) is the main mechanism of tumorigenesis in some cancers. HRD impairs double-strand break repair, leaving characteristic genomic scars. Some HRD scoring tests have been approved as companion diagnostics for poly(adenosine diphosphate-ribose) polymerase (PARP) inhibitor treatment. This study aimed to build an HRD prediction model using gene expression data from various cancer types.
Methods: Data from The Cancer Genome Atlas (TCGA) were used for HRD prediction modeling. A total of 10,567 cases across 33 cancer types were included, and expression data for 5,128 of 20,502 genes were used as predictors. Penalized logistic regression was chosen as the modeling technique.
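As an illustration of the modeling technique named above, the following minimal sketch fits a logistic regression with an L2 (ridge) penalty by batch gradient descent. It is not the study's implementation: the abstract does not state the penalty type, so the ridge penalty, the values of `lam`, `lr`, and `epochs`, the helper names, and the synthetic data (which stands in for expression values and HRD labels) are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_penalized_logistic(X, y, lam=0.1, lr=0.1, epochs=500):
    """Fit logistic regression with an L2 (ridge) penalty.

    Minimizes mean log-loss + (lam / 2) * ||w||^2 by batch gradient
    descent. The intercept b is not penalized. The penalty type and all
    hyperparameter values are assumptions, not taken from the study.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Gradient of the penalized objective: data term plus lam * w.
        w = [wj - lr * (gw / n + lam * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    # Predicted probability of the positive class for each row.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def make_synthetic(n, p, seed):
    # Hypothetical stand-in data: only the first two features carry signal.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(1.0 if (label == 1 and j < 2) else 0.0, 1.0)
               for j in range(p)]
        X.append(row)
        y.append(label)
    return X, y


X_train, y_train = make_synthetic(200, 5, seed=0)
w, b = fit_penalized_logistic(X_train, y_train, lam=0.1)
w_strong, b_strong = fit_penalized_logistic(X_train, y_train, lam=5.0)
probs = predict_proba(X_train, w, b)
```

Increasing `lam` shrinks the coefficients toward zero, which is what makes the approach usable when predictors number in the thousands. In practice, a library implementation with a cross-validated penalty strength would normally replace this hand-written loop.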
Results: For HRD status prediction, the area under the receiver operating characteristic curve (AUC) was 0.98 for the training set and 0.93 for the test set. The accuracy was 0.93 for the training set and 0.88 for the test set.
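For readers unfamiliar with the two metrics reported above, the sketch below shows one way they can be computed from predicted probabilities. The function names, the 0.5 decision threshold, and the six toy labels and scores are assumptions made for illustration; they are not data or settings from the study.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(y_true, scores, threshold=0.5):
    # Fraction of cases whose thresholded prediction matches the label.
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


# Toy example (hypothetical values): 1 = HRD-positive, 0 = HRD-negative.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

auc_value = roc_auc(y_true, scores)   # 8 of 9 positive/negative pairs ranked correctly
acc_value = accuracy(y_true, scores)  # 4 of 6 cases classified correctly
```

The AUC is threshold-free and reflects ranking quality, whereas accuracy depends on the chosen cutoff, which is why the two can differ for the same model.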
Conclusions: Our study suggests that the HRD prediction model based on penalized logistic regression using gene expression data can be used to select patients for treatment with PARP inhibitors.
Keywords: Clinical decision rule; homologous recombination deficiency; poly(ADP-ribose) polymerase inhibitor; probability learning; recombinational DNA repair; transcriptome.