Importance: Accurate, timely, and cost-effective methods for staging oropharyngeal cancers are crucial for patient prognosis and treatment decisions, but staging documentation is often inaccurate or incomplete. With the emergence of artificial intelligence in medicine, automated data abstraction may be associated with reduced costs and with increased efficiency and accuracy of cancer staging.
Objective: To evaluate an algorithm using an artificial intelligence engine capable of extracting essential information from medical records of patients with oropharyngeal cancer and assigning tumor, nodal, and metastatic stages according to American Joint Committee on Cancer eighth edition guidelines.
Design, setting, and participants: This retrospective diagnostic study was conducted among a convenience sample of 806 patients with oropharyngeal squamous cell carcinoma. Medical records of patients who presented to a single tertiary care center between January 1, 2010, and August 1, 2020, were reviewed. A ground truth cancer stage dataset and a comprehensive staging rule book consisting of 135 rules encompassing p16 status and tumor, nodal, and metastatic stages were developed. Subsequently, 4 distinct models were trained: model T (entity relationship extraction) for anatomical location and invasion state, model S (numerical extraction) for lesion size, model M (sequential classification) for metastasis detection, and a p16 model for p16 status. For validation, algorithm results were compared against ground truth established by expert reviewers, and accuracy was reported. Data were analyzed from March to November 2023.
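As an illustration of how extracted fields could feed a rule-based stage assignment, the following is a minimal sketch only. It does not reproduce the study's 135-rule book: the `Extraction` type, its field names, and `assign_t_stage` are hypothetical, the 2 cm and 4 cm cutoffs are a simplified reading of the size criteria, and all deep-invasion findings are collapsed into a single assumed flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Extraction:
    """Hypothetical container for per-patient model outputs."""
    lesion_size_cm: Optional[float]  # e.g. from a size-extraction model (model S)
    advanced_invasion: bool          # e.g. from an anatomy/invasion model (model T)


def assign_t_stage(x: Extraction) -> str:
    """Simplified, illustrative T-stage rule (assumption, not the study's rules)."""
    if x.advanced_invasion:
        return "T4"  # invasion of deep structures overrides lesion size
    if x.lesion_size_cm is None:
        return "TX"  # size not documented, so the stage cannot be assessed
    if x.lesion_size_cm <= 2.0:
        return "T1"
    if x.lesion_size_cm <= 4.0:
        return "T2"
    return "T3"


if __name__ == "__main__":
    print(assign_t_stage(Extraction(lesion_size_cm=3.1, advanced_invasion=False)))
```

Keeping extraction and rule application separate, as sketched here, means a wrong stage can be traced either to a missed field or to a misapplied rule.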
Main outcomes and measures: The main outcome was the accuracy of algorithm-assigned cancer stages relative to ground truth.
Results: Among 806 patients with oropharyngeal cancer (mean [SD] age, 63.6 [10.6] years; 651 males [80.8%]), 421 patients (52.2%) were positive for human papillomavirus. The artificial intelligence engine achieved accuracies of 55.9% (95% CI, 52.5%-59.3%) for tumor, 56.0% (95% CI, 52.5%-59.4%) for nodal, and 87.6% (95% CI, 85.1%-89.7%) for metastatic stages and 92.1% (95% CI, 88.5%-94.6%) for p16 status. Differentiation between localized (stages 1-2) and advanced (stages 3-4) cancers achieved 80.7% (95% CI, 77.8%-83.2%) accuracy.
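The reported accuracies and their 95% CIs can be illustrated with a short sketch. The abstract does not state how the intervals were computed; the Wilson score interval used here is an assumption, and the function names and the localized/advanced grouping helper are hypothetical.

```python
import math
from typing import Sequence, Tuple


def wilson_interval(correct: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (assumed method, not confirmed by the source)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half


def stage_group(stage: int) -> str:
    """Collapse overall stage into localized (1-2) versus advanced (3-4)."""
    return "localized" if stage <= 2 else "advanced"


def count_correct(predicted: Sequence, truth: Sequence) -> int:
    """Number of positions where the algorithm output matches ground truth."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    return sum(p == t for p, t in zip(predicted, truth))


if __name__ == "__main__":
    # Toy labels for illustration only; these are not study data.
    pred = [1, 2, 3, 4, 2, 3]
    true = [1, 1, 4, 4, 3, 3]
    exact = count_correct(pred, true)
    grouped = count_correct([stage_group(s) for s in pred],
                            [stage_group(s) for s in true])
    print(exact, grouped, wilson_interval(grouped, len(true)))
```

As a rough check, 706 correct of 806 (a count back-calculated here from the reported 87.6%, not given in the abstract) yields an interval of about 85.1% to 89.7% under this method.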
Conclusion and relevance: This study found that algorithm accuracy was fair to good for tumor and nodal stages and excellent for metastatic stage and p16 status, with clinical relevance in assigning optimal treatment and reducing exposure to toxic effects. Further model refinement and external validation with electronic health records at different institutions are necessary to improve algorithm accuracy and clinical applicability.