An approach to comparing accuracies of two FLAIR MR sequences in the detection of multiple sclerosis lesions in the brain in the absence of gold standard

Acad Radiol. 2010 Jun;17(6):686-95. doi: 10.1016/j.acra.2010.01.019.

Abstract

Rationale and objectives: The purpose of this study was to present a new methodology for comparing the accuracies of two fluid-attenuated inversion recovery (FLAIR) magnetic resonance imaging sequences in the detection of multiple sclerosis (MS) lesions in the brain in the absence of ground truth, and to determine whether the two sequences, which differed only in echo time (TE), have the same accuracy.

Materials and methods: We acquired FLAIR images at TE(1) = 90 ms and TE(2) = 155 ms from 46 patients with MS (24-69 years old, mean 45.8, 15 males) and 11 healthy volunteers (23-54 years old, mean 37.1, 6 males). Seven experienced neuroradiologists segmented lesions manually on randomly presented corresponding TE(1) and TE(2) images. For every image pair, a "surrogate ground truth" for each TE was generated by applying probability thresholds, ranging from 0.3 to 0.5, to the weighted average of experts' segmentations. Jackknife alternative free-response receiver operating characteristic (JAFROC) analysis was used to compare experts' performance on TE(1) and TE(2) images, using the TE(1)-based and the TE(2)-based ground truths in turn.
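
The thresholding step described above can be illustrated with a minimal sketch. It assumes binary expert masks flattened to lists of 0/1 values, uniform expert weights by default, and a "greater than or equal" comparison against the threshold; the abstract does not state how the weights were chosen or how ties were handled, so the function name `surrogate_ground_truth`, its parameters, and the toy data are hypothetical.

```python
def surrogate_ground_truth(masks, weights=None, threshold=0.5):
    """Build a surrogate ground truth from several experts' binary masks.

    masks     : list of equally sized flat lists of 0/1 values, one per expert
    weights   : per-expert weights (assumption: uniform when not given)
    threshold : probability cut-off, e.g. a value between 0.3 and 0.5
    """
    if weights is None:
        weights = [1.0] * len(masks)
    total_weight = sum(weights)
    n_voxels = len(masks[0])

    # Weighted average of the experts' segmentations at every voxel.
    probability = [
        sum(w * mask[i] for w, mask in zip(weights, masks)) / total_weight
        for i in range(n_voxels)
    ]

    # Assumption: a voxel is labeled as lesion when its probability
    # reaches the threshold (>=).
    truth = [1 if p >= threshold else 0 for p in probability]
    return probability, truth


# Toy data: seven experts, four voxels (7, 3, 2, and 0 votes respectively).
masks = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

_, truth_low = surrogate_ground_truth(masks, threshold=0.3)
_, truth_high = surrogate_ground_truth(masks, threshold=0.5)
print(truth_low)   # voxel with 3 of 7 votes is kept at threshold 0.3
print(truth_high)  # only the unanimous voxel survives at threshold 0.5
```

In this toy case, lowering the threshold from 0.5 to 0.3 admits the voxel marked by three of seven experts, which shows why results may be reported over a range of thresholds rather than at a single value.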

Results: Supratentorially, there were significant differences in relative accuracy between the two sequences, ranging from 8.4% to 12.1%. In addition, we found a higher ratio of false positives to true positives for the TE(2) sequence using the TE(2) ground truth, compared to the TE(1) equivalent. Infratentorially, differences in the relative accuracy did not reach statistical significance.
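
As a simplified illustration of the ratio of false positives to true positives mentioned above, the following sketch counts both against a reference mask. It works per location on flat 0/1 lists, whereas the study scored readers through free-response analysis; the function name `fp_to_tp_ratio` and the toy data are hypothetical and are not results from the study.

```python
def fp_to_tp_ratio(reader_mask, truth_mask):
    """Return false positives divided by true positives, or None if TP is 0.

    reader_mask : flat list of 0/1 marks made by a reader
    truth_mask  : flat list of 0/1 values from the reference (surrogate truth)
    """
    pairs = list(zip(reader_mask, truth_mask))
    true_positives = sum(1 for r, t in pairs if r and t)
    false_positives = sum(1 for r, t in pairs if r and not t)
    if true_positives == 0:
        return None  # ratio is undefined without any true positive
    return false_positives / true_positives


# Toy example: the same reference, two hypothetical sets of reader marks.
truth = [1, 1, 0, 0, 0]
marks_a = [1, 1, 0, 0, 1]  # 2 true positives, 1 false positive
marks_b = [1, 1, 1, 0, 1]  # 2 true positives, 2 false positives

print(fp_to_tp_ratio(marks_a, truth))
print(fp_to_tp_ratio(marks_b, truth))
```

A higher value means that more of the marked locations fall outside the reference for each correctly marked one.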

Conclusion: The presented methodology may be useful in assessing the value of new clinical imaging protocols or techniques in the context of replacing existing ones, when the absolute ground truth is not available, and in determining changes in disease progression in follow-up studies. Our results suggest that the sequence with shorter TE should be preferred because it generates relatively fewer false positives. The finding is consistent with results of previous computer simulation studies.

Publication types

  • Comparative Study
  • Evaluation Study

MeSH terms

  • Adult
  • Aged
  • Algorithms*
  • Brain / pathology*
  • Humans
  • Image Enhancement / methods*
  • Image Interpretation, Computer-Assisted / methods*
  • Magnetic Resonance Imaging / methods*
  • Male
  • Multiple Sclerosis / pathology*
  • Reproducibility of Results
  • Sensitivity and Specificity