Objective: Various statistical methods have been developed to estimate hazard ratios (HRs) from published Kaplan-Meier (KM) curves for the purpose of performing meta-analyses. The objective of this study was to determine the reliability, accuracy, and precision of four commonly used methods: those of Guyot, Williamson, Parmar, and Hoyle and Henley.
Design: Pivotal randomized controlled trials (RCTs) in oncology were identified from the pan-Canadian Oncology Drug Review (pCODR) database (primary analysis) and the Food and Drug Administration's (FDA) drug approvals page (secondary analysis) between January 2012 and May 2016. Two reviewers independently reconstructed HRs using each method on KM curves extracted from each trial and compared them with the reported HRs (gold standard). Bland-Altman plots were constructed and summary statistics were calculated to assess the accuracy and precision of each method. Interrater reliability was assessed using the intraclass correlation coefficient (ICC). The four methods were also applied to KM curves of different structures (ie, flat versus steep curves).
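The comparison of reconstructed with reported HRs can be illustrated with a minimal sketch of a Bland-Altman style summary. It is not the authors' code. It assumes that error is defined as reconstructed HR minus reported HR on the HR scale, and that the 95% CI for the mean error uses a normal approximation; the abstract specifies neither. The function name and the example values are hypothetical.

```python
import math
import statistics


def bland_altman_summary(reconstructed, reported):
    """Summarize agreement between reconstructed and reported HRs.

    Assumption (not stated in the abstract): error is the simple
    difference reconstructed - reported on the HR scale.
    """
    if len(reconstructed) != len(reported) or len(reported) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")

    diffs = [rec - rep for rec, rep in zip(reconstructed, reported)]
    n = len(diffs)

    bias = statistics.mean(diffs)   # mean error (accuracy)
    sd = statistics.stdev(diffs)    # sample SD of the errors (precision)

    # Normal-approximation 95% CI for the mean error (an assumption).
    half_width = 1.96 * sd / math.sqrt(n)

    return {
        "bias": bias,
        "sd": sd,
        "ci95": (bias - half_width, bias + half_width),
        # Bland-Altman 95% limits of agreement: bias +/- 1.96 SD.
        "loa": (bias - 1.96 * sd, bias + 1.96 * sd),
    }


# Hypothetical HRs for illustration only; not data from the study.
example = bland_altman_summary(
    reconstructed=[0.70, 0.82, 0.55],
    reported=[0.68, 0.80, 0.56],
)
```

Under these assumptions, a mean error near zero indicates high accuracy, and narrow limits of agreement indicate high precision.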
Results: A total of 118 KM curves (55 RCTs) and 77 KM curves (46 RCTs) were extracted from pCODR and FDA, respectively. In the primary analysis, the Guyot method was the most accurate, with the lowest mean error (0.0094; 95% CI, -0.0012 to 0.020). All four methods had excellent interrater reliability. The Guyot method showed the smallest bias and greatest precision on the Bland-Altman plots. The Guyot method was also consistently superior in the secondary analysis and in all sensitivity analyses.
Conclusion: In the absence of reported HRs, we recommend that researchers consider the Guyot method to reconstruct HRs from KM curves when performing aggregate data meta-analyses.
Keywords: Kaplan-Meier survival curves; hazard ratios; meta-analysis; validation study.
© 2019 John Wiley & Sons, Ltd.