Objectives: Determining inter-observer variability is an important step before assessing diagnostic accuracy, and it requires specific methodology and statistical tests. The aim of this study was to report the results, characteristics and methodological quality of agreement studies performed in hepatology.
Methods: A literature search yielded 42 published studies that could be used in this evaluation: three clinical studies, 11 in endoscopy, 12 in histopathology and 16 in radiology. Each study was described using a 28-item grid and evaluated with a 22-item quality score (QUAS; maximum, 35).
Results: Agreement levels ranked as follows: intra-observer > inter-observer > inter-centre. The following signs showed good agreement: in endoscopy, size and red signs of oesophageal varices; in histopathology, cirrhosis, fibrosis and steatosis; in Doppler, mean portal vein and superior mesenteric artery velocities, hepatic artery area and perfusion indexes. Methodological weaknesses were frequent. Real agreement (i.e., agreement excluding chance, such as the kappa index), the prevalence of signs, and biases were rarely assessed. Standardized observations (67% of the studies), blind assessment (48%), simultaneous observations (7%), and the recording technique were not frequently used. The mean QUAS was 13 ± 6 overall, with 17 ± 4 in histopathology versus 11 ± 6 in radiology (P < 0.05). In multiple regression, four variables independently predicted the QUAS (R² = 0.77): adapted tests, multiple observations, intra-class correlation coefficient and agreement proportion.
Conclusions: Methodology was often insufficient, and agreement was often measured under biased conditions. Some areas, e.g., biology and ultrasound, were rarely or never studied. Agreement and QUAS were often poor, suggesting the need for agreement studies with improved observation conditions and methodological quality.
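
The Results distinguish the raw agreement proportion from real agreement that excludes chance (such as the kappa index), and note that the prevalence of signs was rarely assessed. The following is a minimal sketch of that distinction for two observers, using Cohen's kappa. The ratings are hypothetical and invented purely for illustration (they are not data from the reviewed studies), and the function names are our own. It shows how a sign with low prevalence can yield a high agreement proportion but only a modest kappa.

```python
from collections import Counter


def agreement_proportion(ratings_a, ratings_b):
    """Fraction of cases on which the two observers give the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(ratings_a)
    p_observed = agreement_proportion(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    # Chance agreement expected from each observer's marginal frequencies.
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        # Both observers used one single category: kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: 20 patients, sign present (1) or absent (0).
# Both say present: 1, A only: 2, B only: 1, both say absent: 16.
observer_a = [1] * 1 + [1] * 2 + [0] * 1 + [0] * 16
observer_b = [1] * 1 + [0] * 2 + [1] * 1 + [0] * 16

print(f"agreement proportion = {agreement_proportion(observer_a, observer_b):.2f}")  # 0.85
print(f"Cohen's kappa        = {cohen_kappa(observer_a, observer_b):.2f}")  # 0.32
```

In this invented example the observers agree on 85% of cases, yet most of that agreement is expected by chance because the sign is rare (expected agreement 0.78), so kappa is only about 0.32.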