Background: Supervised deep learning in radiology suffers from well-known inherent limitations: (1) it requires large, hand-annotated data sets; (2) it generalizes poorly beyond its training data; and (3) it lacks explainability and intuition. It has recently been proposed that reinforcement learning addresses all three of these limitations. Notable prior work applied deep reinforcement learning to localize brain tumors using radiologist eye-tracking points, which constrains the state-action space. Here, we generalize Deep Q Learning to a gridworld-based environment so that only the images and image masks are required.
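To make the gridworld formulation concrete, the following is a minimal sketch of such an environment, assuming a coarse grid overlaid on a 2D image slice, an agent that steps toward the grid cell containing the lesion-mask centroid, and a simple step-penalty/goal-bonus reward. The class name, state encoding, grid size, and reward values are illustrative assumptions, not the paper's exact design.

```python
# Hypothetical gridworld environment for lesion localization (assumed design;
# the exact state and reward definitions are not given in this abstract).
import numpy as np

class LesionGridworld:
    """Agent moves over a coarse grid laid on a 2D image slice; the goal cell
    is the grid cell containing the lesion-mask centroid."""

    ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right

    def __init__(self, image, mask, grid_size=16):
        self.image = image                      # 2D slice, kept for richer state encodings
        self.grid_size = grid_size
        ys, xs = np.nonzero(mask)               # lesion pixels from the image mask
        cy, cx = ys.mean(), xs.mean()           # lesion centroid in pixel coordinates
        h, w = mask.shape
        self.goal = (int(cy / h * grid_size), int(cx / w * grid_size))
        self.pos = (grid_size // 2, grid_size // 2)  # start at the image center

    def reset(self):
        self.pos = (self.grid_size // 2, self.grid_size // 2)
        return self._state()

    def step(self, action):
        dr, dc = self.ACTIONS[action]
        r = min(max(self.pos[0] + dr, 0), self.grid_size - 1)
        c = min(max(self.pos[1] + dc, 0), self.grid_size - 1)
        self.pos = (r, c)
        done = self.pos == self.goal
        reward = 1.0 if done else -0.05         # small step penalty, bonus at the goal
        return self._state(), reward, done

    def _state(self):
        # Minimal state: the agent's normalized grid position.
        # (The paper may condition on image content as well.)
        return np.array(self.pos, dtype=np.float32) / self.grid_size
```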
Methods: We trained a Deep Q network on 30 two-dimensional image slices from the BraTS brain tumor database, each containing one lesion. We then tested the trained Deep Q network on a separate set of 30 test images. For comparison, we also trained and tested a keypoint-detection supervised deep learning network on the same training/testing images.
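As a hedged illustration of this training procedure, the sketch below pairs the gridworld environment above with a small PyTorch Q-network trained by a one-step Bellman update with epsilon-greedy exploration. The framework, network architecture, and hyperparameters are assumptions, since the abstract does not specify them.

```python
# Illustrative Deep Q training loop over a set of per-slice gridworld environments.
import random
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, state_dim=2, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

def train(envs, episodes=500, gamma=0.99, eps=0.1, lr=1e-3):
    qnet = QNet()
    opt = torch.optim.Adam(qnet.parameters(), lr=lr)
    for _ in range(episodes):
        env = random.choice(envs)               # e.g. one of the 30 training slices
        state, done, steps = env.reset(), False, 0
        while not done and steps < 100:
            s = torch.as_tensor(state).unsqueeze(0)
            # Epsilon-greedy action selection
            if random.random() < eps:
                action = random.randrange(4)
            else:
                action = int(qnet(s).argmax(dim=1))
            next_state, reward, done = env.step(action)
            ns = torch.as_tensor(next_state).unsqueeze(0)
            # One-step Bellman target: r + gamma * max_a' Q(s', a')
            with torch.no_grad():
                target = reward + (0.0 if done else gamma * qnet(ns).max().item())
            pred = qnet(s)[0, action]
            loss = (pred - target) ** 2
            opt.zero_grad()
            loss.backward()
            opt.step()
            state, steps = next_state, steps + 1
    return qnet
```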
Results: Whereas the supervised approach quickly overfit the training data and predictably performed poorly on the testing set (11% accuracy), the Deep Q learning approach showed progressively improving generalization to the testing set over training time, reaching 70% accuracy.
Conclusion: We have successfully applied reinforcement learning to localize brain tumors on 2D contrast-enhanced MRI brain images. This generalizes recent work to a gridworld setting naturally suited to analyzing medical images. We have shown that reinforcement learning does not overfit small training sets and can generalize to a separate testing set.
Keywords: Brain tumors; Deep reinforcement learning; Gridworld; Localization; Regression; Reinforcement learning.