Comparison of interobserver agreement for different scoring systems for reflux esophagitis: Impact of level of experience

Gastrointest Endosc. 2004 Jul;60(1):44-9. doi: 10.1016/s0016-5107(04)01289-1.

Abstract

Background: The Savary-Miller, the Los Angeles, and the MUSE (metaplasia, ulcer, stricture, erosion) scoring systems have been developed to assess esophageal lesions related to GERD. Interobserver agreement for these systems was compared, with particular reference to the experience of the endoscopist.

Methods: Videotapes of the gastroesophageal junction were recorded with videoendoscopes in 60 patients who presented with symptoms suggestive of GERD. Nine endoscopists, subgrouped by level of experience (3 levels, 3 endoscopists per level), scored all video clips with the Savary-Miller, the Los Angeles, and the MUSE systems. Agreement was assessed by using weighted kappa statistics (kappa).
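For readers unfamiliar with the statistic, the following is a minimal sketch of a weighted kappa for two raters who grade the same clips on an ordinal scale. The abstract does not state the weighting scheme or how agreement among the 3 raters in a subgroup was pooled, so the linear weights (with quadratic as an option) and the mean of pairwise kappas used here are assumptions. The function names and the example grades are hypothetical and are not study data.

```python
from collections import Counter
from itertools import combinations


def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Weighted Cohen's kappa for two raters on an ordinal scale.

    `categories` lists the grades in ascending order of severity.
    `weights` is "linear" or "quadratic" (assumed; the source does not say).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    index = {grade: i for i, grade in enumerate(categories)}
    n = len(rater_a)
    power = 1 if weights == "linear" else 2

    def disagreement(i, j):
        # 0 for identical grades, 1 for grades at opposite ends of the scale.
        return (abs(i - j) / (k - 1)) ** power

    pairs = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    # Observed weighted disagreement across the scored items.
    observed = sum(disagreement(i, j) * c for (i, j), c in pairs.items()) / n
    # Weighted disagreement expected by chance from each rater's marginals.
    expected = sum(
        disagreement(i, j) * marg_a[i] * marg_b[j]
        for i in range(k)
        for j in range(k)
    ) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single grade")
    return 1.0 - observed / expected


def mean_pairwise_kappa(ratings_by_rater, categories, weights="linear"):
    """Average weighted kappa over all rater pairs (an assumed pooling rule)."""
    pairs = list(combinations(ratings_by_rater, 2))
    if not pairs:
        raise ValueError("at least two raters are required")
    return sum(weighted_kappa(a, b, categories, weights) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical grades for 6 clips from 3 raters; not data from the study.
    grades = ["none", "A", "B", "C", "D"]
    raters = [
        ["none", "A", "B", "B", "C", "D"],
        ["none", "A", "A", "B", "C", "C"],
        ["A", "A", "B", "C", "C", "D"],
    ]
    print(round(mean_pairwise_kappa(raters, grades), 2))
```

With weighting, a disagreement between adjacent grades is penalized less than one between distant grades, which is why the statistic suits ordinal severity scores.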

Results: With the Savary-Miller scoring system, agreement was moderate in the experienced group (kappa=0.41) but poor among inexperienced raters (kappa=0.16). The Los Angeles system was the most reproducible in all subgroups, irrespective of the level of experience (kappa=0.49 to 0.65). The MUSE scoring system was highly similar to the Los Angeles scoring system with respect to erosions and, in addition, allowed assessment of complications of GERD.

Conclusions: The Los Angeles and the MUSE scoring systems are most reliable for the assessment of erosions caused by GERD. Because of low reliability, use of the Savary-Miller scoring system is not recommended. For all scoring systems, interobserver agreement varies with the level of experience in the performance of upper endoscopy.

MeSH terms

  • Endoscopy, Gastrointestinal
  • Esophagitis, Peptic / classification*
  • Esophagitis, Peptic / etiology
  • Esophagitis, Peptic / pathology*
  • Gastroesophageal Reflux / classification
  • Gastroesophageal Reflux / complications
  • Gastroesophageal Reflux / pathology
  • Humans
  • Observer Variation
  • Prospective Studies
  • Reproducibility of Results