Background: Gap and stepoff values in the treatment of acetabular fractures are correlated with clinical outcomes. However, the interobserver and intraobserver variability of gap and stepoff measurements for all imaging modalities in the preoperative, intraoperative, and postoperative phase of treatment is unknown. Recently, a standardized CT-based measurement method was introduced, which provided the opportunity to assess the level of variability.
Questions/purposes: (1) In patients with acetabular fractures, what is the interobserver variability in measurements of the fracture gap and articular stepoff that each observer judges to be the largest in the weightbearing dome, as measured on pre- and postoperative pelvic radiographs, intraoperative fluoroscopy, and pre- and postoperative CT scans? (2) What is the intraobserver variability in these measurements?
Methods: Sixty patients with a complete subset of pre-, intra- and postoperative high-quality images (CT slices of < 2 mm), representing a variety of fracture types with small and large gaps and/or stepoffs, were included. A total of 196 patients with nonoperative treatment (n = 117), inadequate available imaging (n = 60), skeletal immaturity (n = 16), bilateral fractures (n = 2) or a primary THA (n = 1) were excluded. The maximum gap and stepoff values in the weightbearing dome were digitally measured on pelvic radiographs and CT images by five independent observers. Observers were free to decide which gap and/or stepoff they considered the maximum and then measure these before and after surgery. The observers were two trauma surgeons with more than 5 years of experience in pelvic surgery, two trauma surgeons with less than 5 years of experience in pelvic surgery, and one surgical resident. Additionally, the final intraoperative fluoroscopy images were assessed for the presence of a gap or stepoff in the weightbearing dome. All observers used the same standardized measurement technique and each observer measured the first five patients together with the responsible researcher. For 10 randomly selected patients, all measurements were repeated by all observers, at least 2 weeks after the initial measurements. The intraclass correlation coefficient (ICC) for pelvic radiographs and CT images and the kappa value for intraoperative fluoroscopy measurements were calculated to determine the inter- and intraobserver variability. Interobserver variability was defined as the difference in the measurements between observers. Intraobserver variability was defined as the difference in repeated measurements by the same observer.
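As an illustration of the ICC calculation named in the Methods, the following is a minimal sketch of one common form, ICC(2,1) (two-way random effects, absolute agreement, single rater). The abstract does not state which ICC model was used, so this choice is an assumption, and the function name `icc_2_1` and the input layout (one row per patient, one column per observer) are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per patient; each row holds one
    measurement (for example, a gap in mm) per observer.
    Assumption: this ICC form is illustrative, not confirmed by the source.
    """
    n = len(ratings)       # number of patients
    k = len(ratings[0])    # number of observers
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 patients and >= 2 observers")

    grand = sum(x for row in ratings for x in row) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (patients = rows, observers = columns).
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Systematic offsets between observers (ms_cols) lower the coefficient,
    # because absolute agreement is being assessed.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

With five observers and 60 patients, `ratings` would be a 60 x 5 table per imaging modality and per measurement (gap or stepoff). For intraobserver variability, the same function could be applied to a two-column table holding one observer's first and repeated measurements.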
Results: Preoperatively, the interobserver ICC was 0.4 (gap and stepoff) on radiographs and 0.4 (gap) and 0.3 (stepoff) on CT images. The observers agreed on the indication for surgery in 40% (gap) and 30% (stepoff) of pelvic radiographs. For CT scans, the observers agreed in 95% (gap) and 70% (stepoff) of images. Postoperatively, the interobserver ICC was 0.4 (gap) and 0.2 (stepoff) on radiographs. The observers agreed on whether the reduction was acceptable in 60% (gap) and 40% (stepoff). On CT images, the ICC was 0.3 (gap) and 0.4 (stepoff). The observers agreed on whether the reduction was acceptable in 35% (gap) and 38% (stepoff). The preoperative intraobserver ICC was 0.6 (gap and stepoff) on pelvic radiographs and 0.4 (gap) and 0.6 (stepoff) on CT scans. Postoperatively, the intraobserver ICC was 0.7 (gap) and 0.1 (stepoff) on pelvic radiographs. On CT, the intraobserver ICC was 0.5 (gap) and 0.3 (stepoff). There was no agreement between the observers on the presence of a gap or stepoff on intraoperative fluoroscopy images (kappa -0.1 to 0.2).
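To show how a kappa near or below zero can arise for a present/absent judgment such as the fluoroscopy assessment, here is a minimal sketch of Fleiss' kappa for several raters. The abstract does not say which kappa variant was used, so Fleiss' kappa is an assumption, and the function name `fleiss_kappa` and the input layout are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for categorical ratings by a fixed number of raters.

    `ratings` is a list of rows, one per patient; each row holds one
    label per observer (for example, True if a gap is judged present).
    Assumption: the kappa variant is illustrative, not confirmed by the source.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("need the same number (>= 2) of raters for every patient")

    counts = [Counter(row) for row in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over patients.
    p_obs = sum(
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ) / n_subjects

    # Chance agreement from the overall proportion of each label.
    totals = Counter()
    for cnt in counts:
        totals.update(cnt)
    p_exp = sum((t / (n_subjects * n_raters)) ** 2 for t in totals.values())

    if p_exp == 1.0:
        raise ValueError("kappa is undefined when every rating is the same label")
    return (p_obs - p_exp) / (1 - p_exp)
```

A value of 1 indicates complete agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.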
Conclusions: We found that interobserver and intraobserver agreement on gap and stepoff measurements was insufficient to support clinical decisions in acetabular fracture surgery. If observers cannot agree on the size of the gap and stepoff, it will be challenging to decide when to perform surgery and to study the results of acetabular fracture surgery.
Level of evidence: Level III, diagnostic study.