Imaging evaluation of a proposed 3D generative model for MRI to CT translation in the lumbar spine

Spine J. 2023 Nov;23(11):1602-1612. doi: 10.1016/j.spinee.2023.06.399. Epub 2023 Jul 20.

Abstract

Background context: Computed tomography (CT) and magnetic resonance imaging (MRI) are used routinely in the radiologic evaluation and surgical planning of patients with lumbar spine pathology, with the two modalities being complementary. We have developed a deep learning algorithm that can produce 3D lumbar spine CT images from MRI data alone. This has the potential to reduce radiation exposure to the patient as well as the burden on the health care system.

Purpose: The purpose of this study is to evaluate the accuracy of the synthetic lumbar spine CT images produced using our deep learning model.

Study design: A training set of 400 unpaired CTs and 400 unpaired MRI scans of the lumbar spine was used to train a supervised 3D CycleGAN model. Evaluators performed a set of clinically relevant measurements on 20 matched synthetic CTs and true CTs. These measurements were then compared to assess the accuracy of the synthetic CTs.

Patient sample: The evaluation data set consisted of 20 patients who had CT and MRI scans performed within a 30-day period of each other. All patient data was deidentified. Notable exclusions included artefact from patient motion, metallic implants, or any intervention performed in the intervening 30-day period.

Outcome measures: The outcome measured was the mean difference between measurements performed by the group of evaluators on real CTs and on synthetic CTs, expressed as absolute and relative error.
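
As an illustration of the outcome measure, the following is a minimal sketch of how mean absolute and mean relative error could be computed for paired measurements. The abstract does not give the exact formula, so the definition of relative error used here (absolute difference divided by the real CT measurement) is an assumption, as are the function names and the sample values, which are hypothetical and not taken from the study.

```python
from statistics import mean


def absolute_error(real_mm, synthetic_mm):
    """Absolute difference between a real-CT and a synthetic-CT measurement."""
    return abs(synthetic_mm - real_mm)


def relative_error(real_mm, synthetic_mm):
    """Absolute difference as a fraction of the real-CT measurement (assumed definition)."""
    return abs(synthetic_mm - real_mm) / abs(real_mm)


def mean_errors(pairs):
    """Return (mean absolute error, mean relative error) over (real, synthetic) pairs."""
    abs_errs = [absolute_error(r, s) for r, s in pairs]
    rel_errs = [relative_error(r, s) for r, s in pairs]
    return mean(abs_errs), mean(rel_errs)


# Hypothetical measurements in millimetres; NOT data from the study.
pairs = [(10.0, 11.0), (20.0, 19.0), (40.0, 40.0)]
mean_abs, mean_rel = mean_errors(pairs)
print(f"mean absolute error: {mean_abs:.2f} mm")
print(f"mean relative error: {mean_rel:.1%}")
```

Under this assumed definition, a relative error of 0.05 corresponds to the 5% threshold mentioned in the results.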

Methods: Data from the 20 MRI scans was supplied to our deep learning model, which produced 20 "synthetic CT" scans. This formed the evaluation data set. Four clinical evaluators, consisting of neurosurgeons and radiologists, performed a set of 24 clinically relevant measurements on matched synthetic CTs and true CTs in 20 patients. A test set of measurements was performed prior to commencing data collection to identify any significant interobserver variation in measurement technique.

Results: The measurements performed in the sagittal plane were all within 10% relative error with the majority within 5% relative error. The pedicle measurements performed in the axial plane were considerably less accurate with a relative error of up to 34%.

Conclusions: The computer-generated synthetic CTs demonstrated a high level of accuracy for the measurements performed in-plane to the original MRIs used for synthesis. The measurements performed on the axial reconstructed images were less accurate, attributable to the images being synthesized from nonvolumetric routine sagittal T1-weighted MRI sequences. It is hypothesized that if axial sequences or volumetric data were input into the algorithm, these measurements would have improved accuracy.

Keywords: Artificial intelligence; Classification; Computed tomography; Deep learning; Generative adversarial network; Radiology; Spine.