Selecting a radiology examination protocol is a repetitive and time-consuming process. In this paper, we present a deep learning approach that automatically assigns protocols to computed tomography examinations by pre-training a domain-specific BERT model (BERTrad). To handle the severe class imbalance across exam protocols, we used a knowledge distillation approach that up-sampled the minority classes through data augmentation. We compared the classification performance of this approach with n-gram models using Support Vector Machine (SVM), Gradient Boosting Machine (GBM), and Random Forest (RF) classifiers, as well as with the BERTbase model. SVM, GBM, and RF achieved macro-averaged F1 scores of 0.45, 0.45, and 0.6, respectively, while BERTbase and BERTrad achieved 0.61 and 0.63. Knowledge distillation boosted performance on the minority classes and achieved an F1 score of 0.66.
©2021 AMIA - All rights reserved.
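
The macro-averaged F1 score reported above weights every protocol class equally, so it is sensitive to performance on minority classes in a way that accuracy is not. The following is a minimal illustrative sketch of that metric, not the evaluation code used in the paper; the function name `macro_f1` and the protocol labels in the example are hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative sketch)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced example: a classifier that always predicts the
# majority protocol is right 8 times out of 10, yet never finds the minority class.
y_true = ["chest"] * 8 + ["abdomen"] * 2
y_pred = ["chest"] * 10
print(round(macro_f1(y_true, y_pred), 3))
```

In this toy case the majority class scores well while the minority class scores zero, and the macro average is pulled down accordingly, which is why the metric suits an imbalanced protocol distribution.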