Comparison of logistic regression and neural net modeling for prediction of prostate cancer pathologic stage

Robert W Veltri; Manisha Chaudhari; M Craig Miller; Edward C Poole; Gerard J O'Dowd; Alan W Partin

Clin Chem. 2002 Oct;48(10):1828-34.

Authors

Robert W Veltri¹, Manisha Chaudhari, M Craig Miller, Edward C Poole, Gerard J O'Dowd, Alan W Partin

Affiliation

¹ Johns Hopkins Hospital, Department of Urology, 600 North Wolfe St., Baltimore, MD 21287, USA.

PMID: 12324513

Abstract

Background: Prostate cancer (PCa) pathologic staging remains a challenge for the physician using individual pretreatment variables. We have previously reported that UroScore, a logistic regression (LR)-derived algorithm, can correctly predict organ-confined (OC) disease state with >90% accuracy. This study compares statistical and neural network (NN) approaches to predict PCa stage.

Methods: A subset (756 of 817) of radical prostatectomy patients was assessed: 434 with OC disease, 173 with capsular penetration (NOC-CP), and 149 with metastases (NOC-AD) in the training sample. Additionally, an OC + NOC-CP (n = 607) vs NOC-AD (n = 149) two-outcome model was prepared. Validation sets included 120 or 397 cases not used for modeling. Input variables included clinical and several quantitative biopsy pathology variables. The classification accuracies achieved with a NN with an error back-propagation architecture were compared with those of LR statistical modeling.
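The relationship between the three-outcome and two-outcome training samples can be checked with a short sketch. The counts come from the Methods text; the label strings, the mapping, and the helper names `collapse` and `proportions` are illustrative assumptions, not part of the study's software.

```python
from collections import Counter

# Training-sample counts per pathologic stage, as reported in the Methods.
STAGE_COUNTS = {"OC": 434, "NOC-CP": 173, "NOC-AD": 149}

# Assumed label mapping for the two-outcome model: OC and NOC-CP are pooled.
TWO_OUTCOME = {"OC": "OC+NOC-CP", "NOC-CP": "OC+NOC-CP", "NOC-AD": "NOC-AD"}


def collapse(counts, mapping):
    """Pool per-stage counts into the coarser outcome groups."""
    pooled = Counter()
    for stage, n in counts.items():
        pooled[mapping[stage]] += n
    return dict(pooled)


def proportions(counts):
    """Return each group's share of the total sample."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


two_outcome_counts = collapse(STAGE_COUNTS, TWO_OUTCOME)
three_outcome_share = proportions(STAGE_COUNTS)
```

Pooling OC with NOC-CP yields the 607 vs 149 split stated above, which shows how unequal the case distribution becomes in the two-outcome setting.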

Results: We demonstrated >95% detection of OC PCa in three-outcome models, using both computational approaches. For training patient samples that were equally distributed for the three-outcome models, NNs gave a significantly higher overall classification accuracy than the LR approach (96% vs 40%, respectively). In the two-outcome models using either unequal or equal case distribution, the NNs had only a marginal advantage in classification accuracy over LR.
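The abstract does not say how the equally distributed training samples were obtained. As a minimal sketch, assuming random downsampling of each class to the size of the smallest class, the following shows one way to equalize a case distribution and to compute overall classification accuracy. The function names and the placeholder case identifiers are hypothetical; only the class sizes are taken from the Methods.

```python
import random
from collections import defaultdict


def balance_by_downsampling(cases, seed=0):
    """Randomly downsample every class to the size of the smallest class.

    `cases` is a list of (case_id, label) pairs. This is an assumed
    balancing strategy; the abstract does not describe the one used.
    """
    by_label = defaultdict(list)
    for case in cases:
        by_label[case[1]].append(case)
    n_min = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], n_min))
    return balanced


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of cases whose predicted stage equals the true stage."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Placeholder case list built only from the class sizes in the Methods.
cases = (
    [(f"oc-{i}", "OC") for i in range(434)]
    + [(f"cp-{i}", "NOC-CP") for i in range(173)]
    + [(f"ad-{i}", "NOC-AD") for i in range(149)]
)
balanced = balance_by_downsampling(cases)
```

Under this assumption each class contributes 149 cases, so a classifier that always predicts a single stage would score about 33% overall accuracy on the balanced sample.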

Conclusions: The strength of a mathematics-based disease-outcome model depends on the quality of the input variables, quantity of cases, case sample input distribution, and computational methods of data processing of inputs and outputs. We identified specific advantages for NNs, especially in the prediction of multiple-outcome models, related to the ability to pre- and postprocess inputs and outputs.

MeSH terms

Aged
Humans
Male
Middle Aged
Neural Networks, Computer*
Prostatic Neoplasms / pathology*
Regression Analysis
Sensitivity and Specificity