Purpose: To evaluate the clinical performance of a Protocol Recommendation System (PRS) for the automatic protocolling of chest CT imaging requests.

Materials and Methods: 322 387 consecutive historical imaging requests for chest CT between 2017 and 2022 were extracted from a radiology information system (RIS) database containing 16 associated patient information fields. Records with missing fields and protocols with <100 occurrences were removed, leaving 18 protocols for training. After free-text pre-processing and the application of CLEVER terminology word replacements, the features of a bag-of-words model were used to train a multinomial logistic regression classifier. Four readers protocolled 300 cases with known clinically executed protocols (CEPs), based on all clinically available information. After their selections were made, the PRS and CEP were unblinded, and the readers scored their agreement (1 = severe error, 2 = moderate error, 3 = disagreement but acceptable, 4 = agreement). The ground truth was established by the readers' majority selection; a judge broke ties. The accuracy and clinical acceptability (scores 3 and 4) were calculated for both the PRS and the CEP. The readers' protocolling reliability was measured using Fleiss' kappa.

Results: All four readers agreed on 203/300 cases, three readers on 82/300, and in 15 cases a judge was needed. PRS errors were found by the four readers in 1.0%, 2.7%, 1.0%, and 0.7% of cases, respectively. The accuracy/clinical acceptability of the PRS and the CEP were 84.3%/98.6% and 83.0%/99.3%, respectively. Fleiss' kappa across all readers and all protocols was 0.805.

Conclusion: The PRS achieved accuracy similar to human performance and may help radiologists manage the ever-increasing workload.
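The bag-of-words plus multinomial logistic regression pipeline described in the Materials and Methods could be sketched as follows. This is a minimal illustration assuming scikit-learn; the imaging requests and protocol labels are invented toy examples, and a small dictionary substitution stands in for the CLEVER terminology word replacements (it is not the actual CLEVER lexicon).

```python
import re

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical stand-in for CLEVER terminology word replacements:
# maps clinical shorthand to normalized terms before feature extraction.
REPLACEMENTS = {"ca": "cancer", "pe": "pulmonary embolism"}

def preprocess(text: str) -> str:
    """Lowercase, tokenize, and apply the toy terminology replacements."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(REPLACEMENTS.get(t, t) for t in tokens)

# Invented example requests and target protocols (not from the study data).
requests = [
    "rule out PE, chest pain",
    "staging lung ca, follow up",
    "interstitial lung disease, HRCT",
    "suspected PE, shortness of breath",
    "lung ca restaging",
    "HRCT fibrosis follow up",
]
protocols = ["CTPA", "Chest staging", "HRCT", "CTPA", "Chest staging", "HRCT"]

# Bag-of-words features feeding a multinomial logistic regression classifier.
model = make_pipeline(
    CountVectorizer(preprocessor=preprocess),
    LogisticRegression(max_iter=1000),
)
model.fit(requests, protocols)

print(model.predict(["suspected PE with pleuritic chest pain"])[0])
```

In the study itself, records with missing fields and rare protocols (<100 occurrences) were removed before training; the sketch omits that filtering step for brevity.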
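Reader reliability in the study is summarized with Fleiss' kappa. A minimal self-contained implementation of the statistic is shown below; the count matrix is an invented toy example (four readers, five cases, three protocol categories), not the study's data.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a subjects-by-categories count matrix.

    ratings[i][j] = number of raters assigning subject i to category j;
    every row must sum to the same number of raters n.
    """
    N = len(ratings)          # number of subjects (cases)
    n = sum(ratings[0])       # raters per subject
    k = len(ratings[0])       # number of categories
    # Mean observed per-subject agreement.
    P_bar = sum(
        (sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings
    ) / N
    # Expected chance agreement from overall category proportions.
    p = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    P_e = sum(pj * pj for pj in p)
    return (P_bar - P_e) / (1 - P_e)

# Invented example: 4 readers rate 5 cases into 3 protocol categories.
counts = [
    [4, 0, 0],
    [0, 4, 0],
    [3, 1, 0],
    [0, 0, 4],
    [2, 2, 0],
]
print(round(fleiss_kappa(counts), 3))
```

Perfect agreement on every case yields a kappa of 1.0; the study's observed value of 0.805 indicates substantial but imperfect agreement among the four readers.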
Keywords: chest; computed tomography; natural language processing; protocols.