Objective: We examined the accuracy of a set of automated deduplication tools in identifying duplicate records during the eligibility process of systematic reviews.
Study design and setting: A planned search strategy was carried out on seven electronic databases until May 31, 2021. Using manual search as the reference standard, we assessed sensitivity, specificity, negative predictive value, and positive predictive value (PPV).
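The four measures named above follow from a 2x2 comparison of each tool's decisions against the manual reference standard. The snippet below is a minimal sketch of those standard definitions; the class and function names and the example counts are hypothetical illustrations, not the study's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of tool decisions versus the manual reference standard."""
    tp: int  # true duplicates the tool flagged as duplicates
    fp: int  # unique records the tool wrongly flagged as duplicates
    fn: int  # true duplicates the tool missed
    tn: int  # unique records the tool correctly left unflagged


def accuracy_measures(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV, and NPV as proportions.

    Assumes every denominator is non-zero, i.e. the reference standard
    contains both duplicates and unique records, and the tool flags
    some records and leaves others unflagged.
    """
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true duplicates found
        "specificity": c.tn / (c.tn + c.fp),  # share of unique records kept
        "ppv": c.tp / (c.tp + c.fp),          # share of flagged records that are duplicates
        "npv": c.tn / (c.tn + c.fn),          # share of unflagged records that are unique
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
example = ConfusionCounts(tp=90, fp=10, fn=10, tn=890)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Because duplicates are usually a small fraction of all retrieved records, specificity and negative predictive value tend to sit near 1.00 even when a tool flags many unique records, so sensitivity and PPV are the more discriminating measures.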
Results: Specificity ranged from 0.96 to 1.00. Rayyan, Mendeley, and Systematic Review Accelerator (SRA) presented high sensitivity (0.98 [95% CI = 0.94-1.00], 0.93 [95% CI = 0.88-0.97], and 0.90 [95% CI = 0.84-0.95], respectively), whereas EndNote X9 and Zotero had only fair sensitivity (0.73 [95% CI = 0.65-0.80] and 0.74 [95% CI = 0.66-0.81], respectively). Negative predictive value ranged from 0.99 to 1.00. Mendeley and SRA had good PPV (0.93 [95% CI = 0.88-0.97] and 0.99 [95% CI = 0.96-1.00], respectively). PPV was fair for EndNote X9 (0.61 [95% CI = 0.54-0.69]) and Zotero (0.62 [95% CI = 0.54-0.69]) and poor for Rayyan (0.41 [95% CI = 0.36-0.47]).
Conclusion: Choosing the most suitable tool depends on the characteristics of its interface, the algorithm it uses to identify and exclude duplicates, and the transparency of the process. In this comparison, Rayyan, Mendeley, and SRA proved to be accurate enough for the deduplication step of systematic reviews.
Keywords: Accuracy; Deduplication; Epidemiological research; Libraries; Nutrition research methodologies; Systematic review.
Copyright © 2022 Elsevier Inc. All rights reserved.