Evaluate Cutpoints: Adaptable continuous data distribution system for determining survival in Kaplan-Meier estimator

Comput Methods Programs Biomed. 2019 Aug:177:133-139. doi: 10.1016/j.cmpb.2019.05.023. Epub 2019 May 23.

Abstract

Background and objective: Growing evidence of transcriptional and metabolomic differentiation has prompted many studies that analyze such differentiation in the context of disease progression, treatment outcome, or the influence of the many factors affecting cellular and tissue metabolism. In particular, cancer researchers are looking for new biomarkers that can serve as diagnostic or prognostic factors, and for their relationship to clinical effects. Because of the increasing interest in dichotomizing continuous variables in clinical or epidemiological data (gene expression, biomarkers, biochemical parameters, etc.), there is a large demand for cutoff point determination tools, yet software offering stratification of patients based on continuous and binary variables is lacking. Therefore, we developed the "Evaluate Cutpoints" application, which offers a wide set of statistical and graphical methods for cutpoint optimization and enables stratification of a population into two or three groups.

Methods: The application is based on the R language and includes algorithms from packages such as survival, survMisc, OptimalCutpoints, maxstat, Rolr, ggplot2, GGally and plotly, offering Kaplan-Meier plots and ROC curves with cutoff point determination.
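As an illustration of the Kaplan-Meier estimator behind such plots, the following is a minimal sketch in Python rather than the application's own R code. The function name kaplan_meier and the toy data are assumptions made for this example; in the application itself this step is presumably handled by the R survival package.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  -- follow-up durations, one per subject
    events -- 1 if the event was observed, 0 if the subject was censored
    (hypothetical helper for illustration, not part of Evaluate Cutpoints)
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # subjects leaving the risk set at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]     # events and censored subjects both leave
    return curve


# Toy data (made up): five subjects, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t={t}  S(t)={s:.3f}")
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is why the estimator can use incomplete follow-up.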

Results: All capabilities of Evaluate Cutpoints were illustrated with an example analysis of the estrogen receptor, progesterone receptor and human epidermal growth factor receptor 2 in a breast cancer cohort. Using ROC curves, cutoff points were established for expression of ESR1, PGR and ERBB2 in relation to their immunohistochemical status (cutoff: 1301.253, 243.35 and 11,434.438, respectively; sensitivity: 94%, 85% and 64%, respectively; specificity: 93%, 86% and 91%, respectively). In the disease-free survival analysis, we divided patients into two and three groups according to expression of ESR1, PGR and ERBB2. The example algorithm cutp showed that lower expression of ESR1 and ERBB2 was more favorable (HR = 2.07, p = 0.0412; HR = 2.79, p = 0.0777, respectively), whereas higher PGR expression was correlated with better prognosis (HR = 0.192, p = 0.0115).
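To make the ROC-based cutoff step concrete, here is a minimal Python sketch that picks the cutoff maximizing the Youden index (sensitivity + specificity - 1). The abstract does not state which optimality criterion was used, so the Youden index, the function name youden_cutoff, the rule that values at or above the cutoff count as positive, and the toy data are all assumptions for illustration.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    values -- continuous marker, e.g. an expression level
    labels -- 1 for reference-positive samples, 0 for reference-negative
    A sample is called positive when value >= cutoff (assumed direction).
    """
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need both positive and negative samples")
    best = None
    for c in sorted(set(values)):          # every observed value is a candidate
        tp = sum(1 for v, y in zip(values, labels) if y and v >= c)
        tn = sum(1 for v, y in zip(values, labels) if not y and v < c)
        sens, spec = tp / pos, tn / neg
        j = sens + spec - 1                # Youden index J
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1:]


# Toy data (made up): the marker separates the two classes perfectly.
cutoff, sens, spec = youden_cutoff([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
print(cutoff, sens, spec)
```

Other criteria, such as fixing a minimum sensitivity, would select a different point on the same ROC curve.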

Conclusions: This work presents the Evaluate Cutpoints application, which is freely available for download at http://wnbikp.umed.lodz.pl/Evaluate-Cutpoints/. Currently, several software tools, such as Cutoff Finder and X-Tile, are used to split continuous variables, each offering distinct algorithms. Unlike them, Evaluate Cutpoints allows not only dichotomization of populations into groups according to continuous and binary variables, but also stratification into three groups and manual selection of the cutoff point, thus preventing potential loss of information.
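Stratification with manually chosen cutoff points can be sketched as follows in Python. The function name stratify, the convention that a value equal to a cutoff falls into the upper group, and the example values are assumptions for illustration and are not taken from the application.

```python
from bisect import bisect_right


def stratify(values, cutpoints):
    """Assign each value a group index from 0 to len(cutpoints).

    One cutpoint gives two groups (dichotomization); two cutpoints give
    three groups. A value equal to a cutpoint goes to the upper group.
    """
    cuts = sorted(cutpoints)
    return [bisect_right(cuts, v) for v in values]


# Made-up marker values split at two manually chosen cutoffs.
print(stratify([5, 10, 15, 20, 25], [10, 20]))   # three groups
print(stratify([5, 15], [10]))                   # two groups
```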

Keywords: Disease-free survival; Prognosis; ROC curve; Software; Survival analysis.

MeSH terms

  • Algorithms
  • Breast Neoplasms / diagnosis*
  • Breast Neoplasms / metabolism*
  • Computational Biology / methods*
  • Data Interpretation, Statistical
  • Disease-Free Survival
  • Estrogen Receptor alpha / metabolism
  • Female
  • Humans
  • Immunohistochemistry
  • Kaplan-Meier Estimate
  • Models, Theoretical
  • Prognosis
  • Programming Languages
  • ROC Curve
  • Receptor, ErbB-2 / metabolism
  • Receptors, Progesterone / metabolism
  • Risk Assessment
  • Software
  • Survival Analysis*

Substances

  • ESR1 protein, human
  • Estrogen Receptor alpha
  • Receptors, Progesterone
  • ERBB2 protein, human
  • Receptor, ErbB-2