Homologous Recombination Abnormalities Associated With BRCA1/2 Mutations as Predicted by Machine Learning of Targeted Next-Generation Sequencing Data

Breast Cancer (Auckl). 2023 Sep 30:17:11782234231198979. doi: 10.1177/11782234231198979. eCollection 2023.

Abstract

Background: Homologous recombination deficiency (HRD) is the hallmark of breast cancer gene 1/2 (BRCA1/2)-mutated tumors and the unique biomarker for predicting response to double-strand break (DSB)-inducing drugs. The demonstration of HRD in tumors with mutations in genes other than BRCA1/2 is considered the best biomarker of potential response to these DSB-inducing drugs.

Objectives: We explored the potential of developing a practical approach, using next-generation sequencing (NGS) along with machine learning (ML), to predict in any tumor the presence of HRD similar to that seen in tumors with BRCA1/2 mutations.

Design: We used copy number alteration (CNA) data generated from routine targeted NGS, along with a modified naïve Bayesian model, to predict the presence of HRD.

Methods: The CNA data from NGS of 434 targeted genes were analyzed using CNVkit software to calculate the log2 of CNA changes. The log2 values of the various sequencing bins were used in ML to train the system to predict tumors with BRCA1/2 mutations and tumors with abnormalities similar to those detected in tumors with BRCA1/2 mutations.
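As an illustration of the general idea in the Methods, the sketch below trains a plain Gaussian naïve Bayes classifier on per-bin log2 values. It is not the authors' modified naïve Bayesian model, whose details the abstract does not give. The data are synthetic, and the names `fit_gaussian_nb`, `predict`, and `simulate`, the number of bins, and the simulated effect size are all assumptions made for this example; only the training group sizes (31 and 84) echo the abstract.

```python
import math
import random
from statistics import mean, pvariance

def fit_gaussian_nb(X, y, var_floor=1e-3):
    """Estimate class priors and per-bin mean/variance of log2 values."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cols = list(zip(*rows))  # one tuple of values per bin
        model[label] = {
            "log_prior": math.log(len(rows) / len(X)),
            "mean": [mean(c) for c in cols],
            # A variance floor avoids division by zero for flat bins.
            "var": [max(pvariance(c), var_floor) for c in cols],
        }
    return model

def predict(model, x):
    """Return the label with the highest log posterior for one sample."""
    def log_posterior(params):
        total = params["log_prior"]
        for v, m, s2 in zip(x, params["mean"], params["var"]):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))

def simulate(n, shift, n_bins, rng):
    """Synthetic log2 values: alternating gains and losses of size `shift`."""
    return [
        [rng.gauss(shift if b % 2 == 0 else -shift, 0.3) for b in range(n_bins)]
        for _ in range(n)
    ]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_bins = 20             # hypothetical; a real panel yields far more bins

# Synthetic training set: 31 "HRD" and 84 "non-HRD" samples.
X_train = simulate(31, 0.5, n_bins, rng) + simulate(84, 0.0, n_bins, rng)
y_train = ["HRD"] * 31 + ["non-HRD"] * 84
model = fit_gaussian_nb(X_train, y_train)

# Synthetic held-out samples drawn from the same two distributions.
X_test = simulate(10, 0.5, n_bins, rng) + simulate(10, 0.0, n_bins, rng)
y_test = ["HRD"] * 10 + ["non-HRD"] * 10
predictions = [predict(model, x) for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

In practice the feature vectors would come from the per-bin log2 output of CNVkit rather than from a simulator, and the independence assumption of naïve Bayes is only an approximation for neighboring bins.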

Results: Using 31 breast or ovarian cancers with BRCA1/2 mutations and 84 tumors without mutations in any of 12 homologous recombination repair (HRR) genes, the ML model demonstrated high sensitivity (90%, 95% confidence interval [CI] = 73%-97.5%) and specificity (98%, 95% CI = 90%-100%). Testing of 114 tumors with mutations in HRR genes other than BRCA1/2 showed 39% positivity for HRD similar to that seen in BRCA1/2. Testing of 213 additional wild-type (WT) cancers showed HRD positivity similar to BRCA1/2 in 32% of cases. Correlation with proportional loss of heterozygosity (LOH), as determined using whole-exome sequencing of 51 samples, showed 90% (95% CI = 72%-97%) concordance. The approach was also validated in an independent set of 1312 consecutive tumor samples.
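For readers who want to see how sensitivity, specificity, and their confidence intervals are derived from a confusion matrix, a minimal sketch follows. The counts are hypothetical: they are chosen only to be consistent with the group sizes and rounded percentages above and are not taken from the paper. The Wilson score interval is used here as one common choice; the abstract does not state which interval method the authors used, so the bounds need not match the reported ones.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)

def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts (assumption): 31 mutated tumors, 84 tumors without
# HRR gene mutations, split to give roughly 90% and 98%.
tp, fn = 28, 3
tn, fp = 82, 2

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
sens_lo, sens_hi = wilson_interval(tp, tp + fn)
spec_lo, spec_hi = wilson_interval(tn, tn + fp)

print(f"sensitivity = {sens:.1%} (95% CI {sens_lo:.1%} to {sens_hi:.1%})")
print(f"specificity = {spec:.1%} (95% CI {spec_lo:.1%} to {spec_hi:.1%})")
```

With small groups such as these, the interval method matters: exact (Clopper-Pearson) intervals are typically somewhat wider than Wilson intervals.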

Conclusions: These data demonstrate that CNA, when combined with ML, can reliably predict the presence of BRCA1/2-level HRD with high specificity. Using BRCA1/2-mutant cases as the gold standard, this ML approach can be used to predict HRD in cancers with mutations in other HRR genes as well as in WT tumors.

Keywords: BRCA1; BRCA2; Homologous recombination deficiency; PARP inhibitors; copy number variation; double-strand break; machine learning; next-generation sequencing; prediction; response.