MRgRT real-time target localization using foundation models for contour point tracking and promptable mask refinement

Phys Med Biol. 2024 Dec 24;70(1). doi: 10.1088/1361-6560/ad9dad.

Abstract

Objective. This study aimed to evaluate two real-time target tracking approaches for magnetic resonance imaging (MRI) guided radiotherapy (MRgRT) based on foundation artificial intelligence models. Approach. The first approach used a point-tracking model that propagates points from a reference contour. The second approach used a video-object-segmentation model, based on segment anything model 2 (SAM2). Both approaches were evaluated and compared against each other, inter-observer variability, and a transformer-based image registration model, TransMorph, with and without patient-specific (PS) fine-tuning. The evaluation was carried out on 2D cine MRI datasets from two institutions, containing scans from 33 patients with 8060 labeled frames, with annotations from 2 to 5 observers per frame, totaling 29179 ground truth segmentations. The segmentations produced were assessed using the Dice similarity coefficient (DSC), 50% and 95% Hausdorff distances (HD50 / HD95), and the Euclidean center distance (ECD). Main results. The results showed that the contour tracking (median DSC 0.92 ± 0.04 and ECD 1.9 ± 1.0 mm) and SAM2-based (median DSC 0.93 ± 0.03 and ECD 1.6 ± 1.1 mm) approaches produced target segmentations comparable or superior to TransMorph w/o PS fine-tuning (median DSC 0.91 ± 0.07 and ECD 2.6 ± 1.4 mm) and slightly inferior to TransMorph w/ PS fine-tuning (median DSC 0.94 ± 0.03 and ECD 1.4 ± 0.8 mm). Between the two novel approaches, the one based on SAM2 performed marginally better at a higher computational cost (inference times 92 ms for contour tracking and 109 ms for SAM2). Both approaches and TransMorph w/ PS fine-tuning exceeded inter-observer variability (median DSC 0.90 ± 0.06 and ECD 1.7 ± 0.7 mm). Significance. This study demonstrates the potential of foundation models to achieve high-quality real-time target tracking in MRgRT, offering performance that matches state-of-the-art methods without requiring PS fine-tuning.
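As a rough illustration of the evaluation metrics named above, the following NumPy sketch computes DSC, ECD, and a percentile Hausdorff distance for a pair of 2D binary masks. It is not the authors' implementation: the function names, the `spacing_mm` parameter, the 4-neighbour boundary extraction, and the choice to take the larger of the two directed percentiles are all assumptions made for illustration, and conventions for HD50 / HD95 vary between toolkits. Both masks are assumed to be non-empty.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient of two binary masks."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    denom = a.sum() + b.sum()
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom


def center_distance(a, b, spacing_mm=(1.0, 1.0)):
    """Euclidean distance between mask centroids, in mm (non-empty masks)."""
    ca = np.argwhere(np.asarray(a, bool)).mean(axis=0)
    cb = np.argwhere(np.asarray(b, bool)).mean(axis=0)
    return float(np.linalg.norm((ca - cb) * np.asarray(spacing_mm)))


def boundary_points(mask):
    """Pixels of the mask with at least one 4-neighbour outside the mask."""
    m = np.pad(np.asarray(mask, bool), 1)
    core = m[1:-1, 1:-1]
    interior = core & m[:-2, 1:-1] & m[2:, 1:-1] & m[1:-1, :-2] & m[1:-1, 2:]
    return np.argwhere(core & ~interior)


def hausdorff_percentile(a, b, q, spacing_mm=(1.0, 1.0)):
    """q-th percentile Hausdorff distance between mask boundaries, in mm.

    Assumed convention: the larger of the two directed percentile distances.
    """
    pa = boundary_points(a) * np.asarray(spacing_mm)
    pb = boundary_points(b) * np.asarray(spacing_mm)
    # Pairwise distances between all boundary points (fine for small 2D contours).
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    a_to_b = d.min(axis=1)  # each point of a to its nearest point of b
    b_to_a = d.min(axis=0)  # each point of b to its nearest point of a
    return float(max(np.percentile(a_to_b, q), np.percentile(b_to_a, q)))


if __name__ == "__main__":
    # Toy example: a 10x10 square and the same square shifted by 2 pixels.
    ref = np.zeros((20, 20), bool)
    ref[5:15, 5:15] = True
    pred = np.zeros((20, 20), bool)
    pred[5:15, 7:17] = True
    print("DSC :", dice(ref, pred))
    print("ECD :", center_distance(ref, pred))
    print("HD50:", hausdorff_percentile(ref, pred, 50))
    print("HD95:", hausdorff_percentile(ref, pred, 95))
```

In a per-frame evaluation, these functions would be applied to each predicted mask against each observer's segmentation, with `spacing_mm` set to the in-plane pixel spacing of the cine MRI, before summarizing the results as medians.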

Keywords: MRI-guidance; MRI-linac; deep learning; motion management; respiratory motion.

MeSH terms

  • Humans
  • Image Processing, Computer-Assisted* / methods
  • Magnetic Resonance Imaging*
  • Radiotherapy, Image-Guided* / methods
  • Time Factors