Background and objectives: Systematic reviews form the basis of evidence-based medicine, but are expensive and time-consuming to produce. To address this burden, we have developed a literature identification system (Pythia) that combines the query formulation and citation screening steps.
Methods: Pythia combines a set of natural-language questions with machine-learning algorithms to rank all PubMed citations by relevance, returning the 100 top-ranked citations for human screening. Citations tagged during human screening are then used iteratively by Pythia to refine the search and re-rank the remaining citations.
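The abstract does not specify Pythia's ranking model, but the rank–screen–refine loop it describes can be sketched generically. The snippet below is a minimal illustration (not Pythia's actual algorithm): citations are scored by weighted term overlap with the question, the top batch is returned for screening, and the screener's relevant/irrelevant tags feed back into the term weights before re-ranking. The function names, the scoring scheme, and the learning-rate parameter are all hypothetical.

```python
from collections import Counter

def rank_citations(citations, term_weights, top_k=100):
    """Score each citation by the summed weight of question terms it
    contains, and return the top_k citations for human screening."""
    def score(text):
        return sum(term_weights.get(t, 0.0) for t in text.lower().split())
    return sorted(citations, key=score, reverse=True)[:top_k]

def update_weights(term_weights, labeled, lr=0.5):
    """Naive feedback step (illustrative only): boost terms from
    citations tagged relevant, penalize terms from those tagged not."""
    for text, relevant in labeled:
        for term, count in Counter(text.lower().split()).items():
            delta = (lr if relevant else -lr) * count
            term_weights[term] = term_weights.get(term, 0.0) + delta
    return term_weights
```

In use, the weights would be seeded from the natural-language questions, and the two functions alternated batch by batch, which is the iterative exploitation of tagged citations the abstract describes.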
Results: Across seven systematic reviews, the ability of Pythia to identify the relevant citations (sensitivity) ranged from 0.09 to 0.58. The number of abstracts reviewed per relevant abstract, the number needed to read (NNR), was lower than in the manually screened project in four reviews, higher in two, and mixed in one. The reviews with greater overall sensitivity retrieved more relevant citations in early batches, but retrieval was generally unaffected by other factors, such as study design, study size, and specific key question.
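The NNR metric used above is a simple ratio, computed per review; the following one-liner (an illustrative helper, not from the paper) makes the definition concrete. A lower NNR means less screening effort per relevant citation found.

```python
def number_needed_to_read(abstracts_screened, relevant_found):
    """NNR: average number of abstracts a reviewer must screen to find
    one relevant abstract (lower is better). Undefined when no relevant
    abstracts are found, so return infinity in that case."""
    if relevant_found == 0:
        return float("inf")
    return abstracts_screened / relevant_found
```

For example, screening 100 abstracts to find 4 relevant ones gives an NNR of 25.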
Conclusion: Due to its low sensitivity, Pythia is not ready for widespread use. Future research should explore ways to encode domain knowledge in query formulation to better enrich the questions used in the search.
Keywords: Abstract screening; Evidence synthesis; Literature identification; Machine learning; Systematic review methods; Text mining.
Copyright © 2022 Elsevier Inc. All rights reserved.