Background: The course of disease for patients with idiopathic pulmonary fibrosis (IPF) is highly heterogeneous. Current prognostic models rely on demographic and clinical characteristics and are not reproducible. Integrating data from genomic analyses may identify novel prognostic models and provide mechanistic insights into IPF.
Methods: Total RNA from peripheral blood mononuclear cells was subjected to microarray profiling in a training cohort (45 IPF individuals) and two independent validation cohorts (21 IPF/10 controls, and 75 IPF individuals, respectively). To identify a gene set predictive of IPF prognosis, we integrated genomic, clinical, and outcome data from the training cohort. Predictor genes were selected if they met all of the following criteria: 1) present in a gene co-expression module from Weighted Gene Co-expression Network Analysis (WGCNA) that correlated with pulmonary function (p < 0.05); 2) differentially expressed between observed "good" vs. "poor" prognosis with fold change (FC) > 1.5 and false discovery rate (FDR) < 2%; and 3) predictive of mortality (p < 0.05) in univariate Cox regression analysis. "Survival risk group prediction" was adopted to construct a functional genomic model that used the IPF prognostic predictor gene set to derive a prognostic index (PI) classifying each patient as either high or low risk for survival outcomes. Prediction accuracy was assessed with a repeated 10-fold cross-validation algorithm and independently assessed in two validation cohorts through multivariate Cox regression survival analysis.
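As a rough illustration of the selection and scoring steps described above, the following Python sketch applies the three stated thresholds to hypothetical per-gene statistics, then computes a prognostic index as a coefficient-weighted sum of expression values and splits patients at the cohort median. The gene names, statistics, expression values, the weighted-sum form of the PI, and the median cutoff are all assumptions made for illustration; the abstract does not state how the PI was calculated or thresholded.

```python
from statistics import median

# Hypothetical per-gene summary statistics. In practice these would come from
# WGCNA (module_p), a differential-expression test (fold_change, fdr), and
# univariate Cox regression (cox_p, cox_coef). fold_change is assumed to be
# the magnitude of change, regardless of direction.
genes = {
    "GENE_A": {"module_p": 0.01, "fold_change": 1.8, "fdr": 0.010, "cox_p": 0.003, "cox_coef": 0.9},
    "GENE_B": {"module_p": 0.20, "fold_change": 2.1, "fdr": 0.005, "cox_p": 0.010, "cox_coef": 0.7},
    "GENE_C": {"module_p": 0.03, "fold_change": 1.2, "fdr": 0.015, "cox_p": 0.020, "cox_coef": 0.4},
    "GENE_D": {"module_p": 0.04, "fold_change": 1.6, "fdr": 0.019, "cox_p": 0.040, "cox_coef": -0.5},
}


def passes_all_criteria(stats):
    """Return True only if a gene meets all three criteria from the Methods."""
    return (
        stats["module_p"] < 0.05        # 1) module correlates with pulmonary function
        and stats["fold_change"] > 1.5  # 2) differentially expressed, FC > 1.5 ...
        and stats["fdr"] < 0.02         #    ... and FDR < 2%
        and stats["cox_p"] < 0.05       # 3) predictive of mortality (univariate Cox)
    )


predictor_genes = sorted(g for g, s in genes.items() if passes_all_criteria(s))

# Assumed PI form: sum of (Cox coefficient x expression) over predictor genes.
weights = {g: genes[g]["cox_coef"] for g in predictor_genes}


def prognostic_index(expression, gene_weights):
    """Weighted linear combination of a patient's expression values."""
    return sum(w * expression[g] for g, w in gene_weights.items())


# Hypothetical expression values for four patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 1.0, "GENE_D": 1.0},
    "P2": {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 1.0, "GENE_D": 2.0},
    "P3": {"GENE_A": 3.0, "GENE_B": 1.0, "GENE_C": 1.0, "GENE_D": 0.5},
    "P4": {"GENE_A": 0.5, "GENE_B": 1.0, "GENE_C": 1.0, "GENE_D": 3.0},
}

pi = {pid: prognostic_index(expr, weights) for pid, expr in patients.items()}

# Assumed cutoff: patients above the cohort median PI are labeled high risk.
cutoff = median(pi.values())
risk_group = {pid: ("high" if score > cutoff else "low") for pid, score in pi.items()}

print(predictor_genes)
print(risk_group)
```

In a real analysis the cutoff and weights would be estimated within each cross-validation fold on the training portion only, so that the held-out patients do not influence their own risk assignment.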
Results: A set of 118 IPF prognostic predictor genes was used to derive the functional genomic model and PI. In the training cohort, IPF patients classified as high risk by the PI had significantly shorter survival than those classified as low risk (log-rank p < 0.001). The prediction accuracy was further validated in two independent cohorts (log-rank p < 0.001 and 0.002). Functional pathway analysis revealed that the canonical pathways enriched with the IPF prognostic predictor gene set were involved in T-cell biology, including iCOS, T-cell receptor, and CD28 signaling.
Conclusions: Using supervised and unsupervised analyses, we identified a set of IPF prognostic predictor genes and derived a functional genomic model that distinguished high- and low-risk IPF patients with high accuracy. This genomic model may complement current prognostic tools to deliver more personalized care for IPF patients.