Machine learning prediction and explanatory models of serious infections in patients with rheumatoid arthritis treated with tofacitinib

Arthritis Res Ther. 2024 Aug 27;26(1):153. doi: 10.1186/s13075-024-03376-9.

Abstract

Background: Patients with rheumatoid arthritis (RA) have an increased risk of developing serious infections (SIs) compared with individuals without RA, and efforts to predict SIs in this patient group are ongoing. We assessed the ability of different machine learning modeling approaches to predict SIs using baseline data from the tofacitinib RA clinical trials program.

Methods: This analysis included data from 19 clinical trials (phase 2, n = 10; phase 3, n = 6; phase 3b/4, n = 3). Patients with RA receiving tofacitinib 5 or 10 mg twice daily (BID) were included in the analysis; patients receiving tofacitinib 11 mg once daily were analyzed as receiving tofacitinib 5 mg BID. All available patient-level baseline variables were extracted. Statistical and machine learning methods (logistic regression, support vector machines with linear kernel, random forest, extreme gradient boosting trees, and boosted trees) were implemented to assess the association of baseline variables with SI (logistic regression only) and to predict SI from selected baseline variables, with 5-fold cross-validation. Missing values were handled separately for each prediction model.
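
The 5-fold cross-validation step can be illustrated with a minimal sketch. It assumes stratified folds (each fold keeps roughly the same share of SI events), which the abstract does not state; the function name `stratified_k_fold` and the toy labels are hypothetical and are not trial data.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs, keeping class balance similar in each fold.

    Assumption: stratification is used here for illustration only; the
    abstract specifies 5-fold cross-validation but not how folds were formed.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a near-equal share of the class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical toy labels: 1 = serious infection, 0 = none.
cv_labels = [1] * 10 + [0] * 90
splits = list(stratified_k_fold(cv_labels, k=5))
for train_idx, test_idx in splits:
    events_in_test = sum(cv_labels[i] for i in test_idx)
    print(len(train_idx), len(test_idx), events_in_test)  # 80 20 2 for every fold
```

In each round a model would be fitted on `train_idx` and scored on `test_idx`, so every patient is used for evaluation exactly once.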

Results: A total of 8404 patients with RA treated with tofacitinib were eligible for inclusion (15,310 patient-years of total follow-up), of whom 473 reported SIs. Among other baseline factors, age, previous infection, and corticosteroid use were significantly associated with SI. When prediction modeling for SI was applied across data from all studies, the area under the receiver operating characteristic curve (AUROC) ranged from 0.656 to 0.739. AUROC values ranged from 0.599 to 0.730 in data from phase 3 and 3b/4 studies, and from 0.563 to 0.643 in data from ORAL Surveillance only.
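
For readers less familiar with the metric, AUROC can be read as the probability that a randomly chosen patient with an SI receives a higher predicted risk than a randomly chosen patient without one (0.5 is chance, 1.0 is perfect ranking). The sketch below computes it by direct pairwise comparison; the function name `auroc` and the toy scores are hypothetical, and this is not the authors' implementation.

```python
def auroc(labels, scores):
    """AUROC as the share of (case, non-case) pairs ranked correctly.

    Ties between a case and a non-case count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks for five patients (1 = SI, 0 = no SI).
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
toy_auc = auroc(toy_labels, toy_scores)
print(round(toy_auc, 3))  # 0.833: five of six case/non-case pairs are ordered correctly
```

The pairwise loop is quadratic in the number of patients; it is written this way for clarity, and a rank-based calculation would give the same value more efficiently on large data sets.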

Conclusions: Baseline factors associated with SIs in the tofacitinib RA clinical trial program were similar to established SI risk factors associated with advanced treatments for RA. Furthermore, while model performance in predicting SI was similar to that of other published models, it did not meet the threshold for accurate prediction (AUROC > 0.85). Thus, predicting the occurrence of SIs at baseline remains challenging and may be complicated by the changing disease course of RA over time. Inclusion of other patient-associated and healthcare delivery-related factors, and harmonization of the duration of the studies included in the models, may be required to improve prediction.

Trial registration: ClinicalTrials.gov: NCT00147498; NCT00413660; NCT00550446; NCT00603512; NCT00687193; NCT01164579; NCT00976599; NCT01059864; NCT01359150; NCT02147587; NCT00960440; NCT00847613; NCT00814307; NCT00856544; NCT00853385; NCT01039688; NCT02187055; NCT02831855; NCT02092467.

Keywords: Extreme gradient boosted trees; Infectious diseases; Janus kinase inhibitor; Machine learning; Prediction models; Random forest; Rheumatic diseases; Risk stratification; Support vector machines with linear kernel; Treatment safety.

MeSH terms

  • Adult
  • Aged
  • Antirheumatic Agents / adverse effects
  • Antirheumatic Agents / therapeutic use
  • Arthritis, Rheumatoid* / drug therapy
  • Clinical Trials as Topic
  • Female
  • Humans
  • Infections* / chemically induced
  • Infections* / epidemiology
  • Machine Learning*
  • Male
  • Middle Aged
  • Piperidines* / adverse effects
  • Piperidines* / therapeutic use
  • Protein Kinase Inhibitors / adverse effects
  • Protein Kinase Inhibitors / therapeutic use
  • Pyrimidines* / adverse effects
  • Pyrimidines* / therapeutic use
  • Pyrroles* / adverse effects
  • Pyrroles* / therapeutic use

Substances

  • Antirheumatic Agents
  • Piperidines
  • Protein Kinase Inhibitors
  • Pyrimidines
  • Pyrroles
  • tofacitinib
