Validation of Claims Algorithms for Progression to Metastatic Cancer in Patients with Breast, Non-small Cell Lung, and Colorectal Cancer

Front Oncol. 2016 Feb 1;6:18. doi: 10.3389/fonc.2016.00018. eCollection 2016.

Abstract

Background: Validated algorithms for identifying progression to metastatic cancer could permit the use of administrative claims databases for research in this area.

Objective: To identify simple algorithms that could accurately detect cancer progression to metastatic breast, non-small cell lung, and colorectal cancer (CRC) using medical and pharmacy claims data.

Methods: Adults with stage I-III breast cancer, non-small cell lung cancer (NSCLC), or CRC in the Geisinger Health System from 2004 to 2011 were selected. Evidence of progression was extracted via manual chart review as the reference standard. In addition to secondary malignancy diagnosis (ICD-9 code for metastases), diagnoses, procedures, and treatments were selected with clinician input as indicators of cancer progression. Random forest models provided variable importance scores. Beyond the secondary malignancy codes alone, several more complex algorithms were constructed, and performance measures were calculated for each.
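The simplest algorithm described above, the presence of a secondary malignancy code, can be sketched as a rule over a patient's diagnosis codes. This is a minimal illustration, not the authors' implementation: it assumes that "ICD-9 code for metastases" corresponds to ICD-9-CM categories 196-198 (secondary malignant neoplasms), and the function name, the `claims` structure, and the patient records are hypothetical.

```python
from typing import Iterable

# Assumption: ICD-9-CM categories 196-198 (secondary malignant neoplasms)
# stand in for the abstract's "ICD-9 code for metastases".
SECONDARY_MALIGNANCY_PREFIXES = ("196", "197", "198")


def has_secondary_malignancy(codes: Iterable[str]) -> bool:
    """Return True if any diagnosis code starts with 196, 197, or 198."""
    return any(
        code.strip().startswith(SECONDARY_MALIGNANCY_PREFIXES) for code in codes
    )


# Hypothetical claims data: patient id -> ICD-9 diagnosis codes on file.
claims = {
    "patient_a": ["174.9", "401.9"],  # primary breast cancer, hypertension
    "patient_b": ["162.9", "198.5"],  # primary lung cancer, secondary bone site
}

# Flag each patient whose claims contain at least one secondary malignancy code.
flagged = {pid: has_secondary_malignancy(codes) for pid, codes in claims.items()}
print(flagged)
```

Each flag would then be compared against the chart-review reference standard to obtain sensitivity, specificity, PPV, and NPV.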

Results: Among those with breast cancer [17/502 (3.4%) progressed], the performance of a secondary malignancy code was suboptimal [sensitivity: 64.7%; specificity: 86.0%; positive predictive value (PPV): 13.9%; negative predictive value (NPV): 98.6%]; requiring malignancy at another site or initiation of immunotherapy increased PPV and specificity but decreased sensitivity. For NSCLC [61/236 (25.8%) progressed], codes for secondary malignancy alone (PPV: 47.4%; NPV: 84.8%; sensitivity: 60.7%; specificity: 76.6%) performed similarly to or better than more complex algorithms. For CRC [33/276 (12.0%) progressed], secondary malignancy codes had good specificity (92.7%) and NPV (92.3%) but low sensitivity (42.4%) and PPV (43.8%); an algorithm with change in chemotherapy increased sensitivity but decreased other metrics.
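The low PPV for breast cancer despite 86.0% specificity follows from the low progression rate (17/502). A minimal sketch of that relationship, using only the sensitivity, specificity, and prevalence reported above; the function name `predictive_values` is illustrative, and the calculation is the standard Bayes relationship rather than anything specific to this study:

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Derive PPV and NPV from sensitivity, specificity, and prevalence."""
    # Expected share of the cohort in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Breast cancer cohort: values as reported in the Results.
ppv, npv = predictive_values(sensitivity=0.647, specificity=0.860, prevalence=17 / 502)
print(f"PPV: {ppv:.1%}, NPV: {npv:.1%}")  # PPV: 13.9%, NPV: 98.6%
```

Because the inputs are rounded percentages, applying the same calculation to the other cohorts may differ from the reported values in the last decimal place.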

Conclusion: Selected algorithms performed similarly to the presence of a secondary tumor diagnosis code, with low sensitivity/PPV and higher specificity/NPV. Accurate identification of cancer progression likely requires verification through chart review.

Keywords: cancer progression; claims algorithm; metastatic cancer; oncology; random forests.