An important need exists for strategies to perform rigorous objective clinical-task-based evaluation of artificial intelligence (AI) algorithms for nuclear medicine. To address this need, we propose a 4-class framework to evaluate AI algorithms for promise, technical task-specific efficacy, clinical decision making, and postdeployment efficacy. We provide best practices to evaluate AI algorithms for each of these classes. Each class of evaluation yields a claim that provides a descriptive performance of the AI algorithm. Key best practices are tabulated as the RELAINCE (Recommendations for EvaLuation of AI for NuClear medicinE) guidelines. The report was prepared by the Society of Nuclear Medicine and Molecular Imaging AI Task Force Evaluation team, which consisted of nuclear-medicine physicians, physicists, computational imaging scientists, and representatives from industry and regulatory agencies.
Keywords: PET; SPECT; artificial intelligence; best practices; clinical decision making; clinical task; evaluation; generalizability; postdeployment; technical efficacy.
© 2022 by the Society of Nuclear Medicine and Molecular Imaging.