Supervised multi-view canonical correlation analysis (sMVCCA): integrating histologic and proteomic features for predicting recurrent prostate cancer

IEEE Trans Med Imaging. 2015 Jan;34(1):284-97. doi: 10.1109/TMI.2014.2355175. Epub 2014 Sep 5.

Abstract

In this work, we present a new methodology to facilitate prediction of recurrent prostate cancer (CaP) following radical prostatectomy (RP) via the integration of quantitative image features and protein expression in the excised prostate. Creating a fused predictor from high-dimensional data streams is challenging because the classifier must 1) account for the "curse of dimensionality" problem, which hinders classifier performance when the number of features exceeds the number of patient studies and 2) balance potential mismatches in the number of features across different channels to avoid classifier bias towards channels with more features. Our new data integration methodology, supervised Multi-view Canonical Correlation Analysis (sMVCCA), aims to integrate infinite views of highdimensional data to provide more amenable data representations for disease classification. Additionally, we demonstrate sMVCCA using Spearman's rank correlation which, unlike Pearson's correlation, can account for nonlinear correlations and outliers. Forty CaP patients with pathological Gleason scores 6-8 were considered for this study. 21 of these men revealed biochemical recurrence (BCR) following RP, while 19 did not. For each patient, 189 quantitative histomorphometric attributes and 650 protein expression levels were extracted from the primary tumor nodule. The fused histomorphometric/proteomic representation via sMVCCA combined with a random forest classifier predicted BCR with a mean AUC of 0.74 and a maximum AUC of 0.9286. We found sMVCCA to perform statistically significantly (p < 0.05) better than comparative state-of-the-art data fusion strategies for predicting BCR. Furthermore, Kaplan-Meier analysis demonstrated improved BCR-free survival prediction for the sMVCCA-fused classifier as compared to histology or proteomic features alone.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Biomarkers, Tumor / analysis
  • Biomarkers, Tumor / chemistry*
  • Computational Biology
  • Histocytochemistry / methods
  • Humans
  • Image Processing, Computer-Assisted
  • Kaplan-Meier Estimate
  • Male
  • Prognosis
  • Prostatic Neoplasms / diagnosis*
  • Prostatic Neoplasms / metabolism*
  • Prostatic Neoplasms / mortality
  • Prostatic Neoplasms / pathology
  • Proteome / analysis*
  • Proteome / chemistry
  • Proteomics / methods*

Substances

  • Biomarkers, Tumor
  • Proteome