Aim: To test the interobserver reproducibility, between two histopathologists, of grading the features of Helicobacter pylori gastritis using the updated Sydney classification.
Methods: 290 dyspeptic Dutch patients with biopsy-proven H pylori infection were enrolled in the study. Gastric antral mucosal biopsy specimens were analysed before and after H pylori eradication treatment. The biopsies were scored semi-quantitatively by two histopathologists according to the updated Sydney classification system. Variables analysed included the density of H pylori infection, the degree of chronic inflammation, inflammatory activity, atrophy, intestinal metaplasia, and surface epithelial damage. Before grading biopsy specimens, both pathologists reached a consensus on the scoring of gastritis through interactive sessions using a multiheaded microscope. Subsequently, all biopsy specimens were graded. Interobserver variability was analysed using weighted kappa scores.
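For readers unfamiliar with the statistic, the following is a minimal sketch of how a weighted kappa can be computed for two observers grading the same specimens on an ordinal scale. It is illustrative only and is not the authors' analysis code: the function name `weighted_kappa`, the four-grade coding (0 to 3), and the choice between linear and quadratic weights are assumptions, since the abstract does not state which weighting scheme was used.

```python
def weighted_kappa(rater_a, rater_b, n_categories=4, weights="linear"):
    """Weighted kappa for two raters scoring the same items.

    Assumptions (not taken from the abstract): grades are integers
    0 .. n_categories - 1, and disagreement weights are either linear
    or quadratic in the distance between the two grades.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    if n_categories < 2:
        raise ValueError("at least two categories are required")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")

    n = len(rater_a)
    k = n_categories

    # Observed joint proportions: observed[i][j] is the share of items
    # graded i by rater A and j by rater B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # 0 on the diagonal, 1 for the most distant pair of grades.
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    observed_dis = sum(disagreement(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    expected_dis = sum(disagreement(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))

    if expected_dis == 0:
        # Both raters used a single identical grade throughout;
        # kappa is undefined in this case.
        raise ValueError("kappa is undefined when no disagreement is expected")

    return 1.0 - observed_dis / expected_dis


# Hypothetical grades for eight specimens (made-up data, for illustration only).
pathologist_1 = [0, 1, 1, 2, 3, 2, 0, 1]
pathologist_2 = [0, 1, 2, 2, 3, 1, 0, 1]
kappa = weighted_kappa(pathologist_1, pathologist_2)
```

Unlike unweighted kappa, this form penalises a disagreement of one grade less than a disagreement of two or three grades, which suits ordinal scores such as those used here.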
Results: Overall, interobserver agreement on scoring the various gastritis features was high. Agreement on grading of atrophy was the lowest; however, moderate to good reproducibility was achieved, with weighted kappa values of 0.49 in the pretreatment biopsies and 0.52 in the post-treatment biopsies. Disagreement was most common in biopsy specimens with lesser degrees of atrophy. A high degree of agreement was obtained for intestinal metaplasia, with weighted kappa values of 0.72 in the pretreatment biopsies and 0.73 in the post-treatment biopsies. The best agreement was reached in the assessment of the density of H pylori, with excellent weighted kappa values of 0.76 before and 0.95 after H pylori eradication treatment. Reproducibility of grading inflammatory activity, surface epithelial damage, and chronic inflammation was high, with weighted kappa values ranging from 0.60 to 0.76 before eradication and from 0.62 to 0.83 after eradication.
Conclusions: Reproducibility of grading H pylori-related gastritis is high when the updated Sydney system is used. Despite the novel criteria for scoring atrophy, agreement on this feature between two independent histopathologists was imperfect.