Objective: We have developed and validated an algorithm based on Piedmont hospital discharge abstracts for ascertainment of incident cases of breast, colorectal, and lung cancer.
Study design and setting: The algorithm training and validation sets were based on data from 2000 and 2001, respectively. The validation was carried out at an individual level by linkage of cases identified by the algorithm with cases in the Piedmont Cancer Registry diagnosed in 2001.
Results: The sensitivity of the algorithm was higher for lung cancer (80.8%) than for breast (76.7%) and colorectal (72.4%) cancers. The positive predictive values were 78.7%, 87.9%, and 92.6% for lung, colorectal, and breast cancer, respectively. The high positive predictive values for colorectal and breast cancers were due to the model's ability to distinguish prevalent from incident cases and to the accuracy of surgery claims for case identification.
Conclusions: Given its moderate sensitivity, this algorithm is not intended to replace cancer registration, but it is a valuable tool for investigating other aspects of cancer surveillance. This method provides a valid study base for timely monitoring of cancer practice and related outcomes, geographic and temporal variations, and costs.
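
Illustration: The individual-level validation reported above compares the cases flagged by the algorithm with the cases recorded in the registry. The sketch below shows how sensitivity and positive predictive value could be computed from such a linkage, treating the registry as the reference standard. The function name, the patient identifiers, and the toy data are hypothetical and are not taken from the study; the actual linkage procedure may well differ.

```python
def validation_metrics(algorithm_ids, registry_ids):
    """Return (sensitivity, ppv) for algorithm-identified cases,
    using registry cases as the reference standard.

    Both arguments are iterables of linked patient identifiers
    (hypothetical; the study's linkage keys are not described here).
    """
    algorithm_ids = set(algorithm_ids)
    registry_ids = set(registry_ids)

    # True positives: cases found by the algorithm and confirmed by the registry.
    true_positives = algorithm_ids & registry_ids

    # Sensitivity: share of registry cases that the algorithm identified.
    sensitivity = (
        len(true_positives) / len(registry_ids) if registry_ids else float("nan")
    )
    # Positive predictive value: share of algorithm cases confirmed by the registry.
    ppv = (
        len(true_positives) / len(algorithm_ids) if algorithm_ids else float("nan")
    )
    return sensitivity, ppv


# Toy identifiers for illustration only, not study data.
algorithm_cases = {"P01", "P02", "P03", "P04"}
registry_cases = {"P02", "P03", "P04", "P05", "P06"}

sens, ppv = validation_metrics(algorithm_cases, registry_cases)
print(f"sensitivity = {sens:.1%}, PPV = {ppv:.1%}")
```

In this toy example, three of the five registry cases are found by the algorithm, and three of the four algorithm cases are confirmed by the registry. Cases flagged by the algorithm but absent from the registry lower the positive predictive value; these would include prevalent cases misclassified as incident.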