Objectives: Prostate cancer is the most frequent cancer among men in the US. Alongside clinical staging and serum PSA, histological grading is an important part of the diagnostic evaluation, and the most commonly used grading system is the one described by Gleason. From a prognostic point of view, it is of considerable interest to know how accurately the needle biopsy Gleason score predicts the final score of the radical prostatectomy specimen. From an outcomes research point of view, stratification of patients by Gleason score may prove correct in patients undergoing radical prostatectomy. In patients undergoing radiation or conservative management, however, some cancers graded as well differentiated could actually be moderately or poorly differentiated, and some graded as moderately differentiated could be poorly differentiated. Such undergrading would favor radical prostatectomy in a direct comparison of treatment efficacy. We aimed to determine (1) whether such undergrading exists, (2) what the magnitude of the bias is, and (3) whether it is common and similar in different institutions.
Materials and methods: We retrospectively reviewed the records of 415 patients who underwent radical prostatectomy in three Dallas-area hospitals, excluding patients who received neoadjuvant therapy prior to surgery. Gleason grades and scores were collected from both the needle biopsy and the radical prostatectomy specimen. The analysis was performed using three categorization schemes for well, moderate and poor differentiation, for each of the three hospitals individually and for the entire group.
Results: The most common Gleason score by both needle biopsy and prostatectomy was five. Of all patients, 37.2% had no change in score assignment, while 12.7% were 'overgraded' and 50.1% 'undergraded' by needle biopsy. The most common undergrading was by 1 or 2 score points. Only 23.7% of the category 'well' cancers remained so after surgery. Between 65.0 and 88.4% of the category 'moderate' cancers remained so after surgery. Kappa statistics were employed to determine the degree of agreement between the needle biopsy category and the surgical category. The kappa value ranged from 0.148 to 0.328 across all categories and classification schemes, indicating poor reproducibility. Serum prostate-specific antigen was not helpful in predicting Gleason score upgrading.
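As an illustration of the agreement measure used above, the following is a minimal sketch of unweighted Cohen's kappa computed from a cross-tabulation of biopsy category against prostatectomy category. The abstract does not state whether a weighted or unweighted kappa was used, so the unweighted form is an assumption here. The function name `cohens_kappa` and the counts in `example_table` are hypothetical and are not data from this study.

```python
from typing import Sequence


def cohens_kappa(table: Sequence[Sequence[int]]) -> float:
    """Unweighted Cohen's kappa for a square agreement table.

    Rows are the needle biopsy category, columns the prostatectomy
    category (e.g. well / moderate / poor), and each cell is a patient count.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed agreement: proportion of patients on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n

    # Chance agreement: product of the marginal proportions, summed by category.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)

    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only; NOT the study's data.
example_table = [
    [10, 25, 5],   # biopsy 'well'
    [5, 40, 15],   # biopsy 'moderate'
    [0, 5, 15],    # biopsy 'poor'
]

kappa = cohens_kappa(example_table)
print(f"kappa = {kappa:.3f}")
```

A kappa of 1 corresponds to perfect agreement and 0 to agreement no better than chance, which is why values in the range reported above are read as poor.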
Conclusions: Independent of the setting, about 50% of all Gleason score assignments made on needle biopsy specimens are revised in the direction of a worse score/category. Clinicians should be aware of this phenomenon when counseling patients about treatment choices if the grade is taken into consideration. For outcomes research purposes, it is important to recognize that this introduces a bias into direct comparisons between surgical and nonsurgical (radiation and watchful waiting) series. The bias favors the outcomes of surgical series, because the nonsurgical series suffer from a less favorable patient mix.