Comparing the coverage, recall, and precision of searches for 120 systematic reviews in Embase, MEDLINE, and Google Scholar: a prospective study

Syst Rev. 2016 Mar 1:5:39. doi: 10.1186/s13643-016-0215-7.

Abstract

Background: Previously, we reported on the low recall of Google Scholar (GS) for systematic review (SR) searching. Here, we test our conclusions further in a prospective study by comparing the coverage, recall, and precision of SR search strategies previously performed in Embase, MEDLINE, and GS.

Methods: The original search results from Embase and MEDLINE and the first 1000 results of GS for librarian-mediated SR searches were recorded. Once the inclusion-exclusion process for the resulting SR was complete, search results from all three databases were screened for the SR's included references. All three databases were then searched post hoc for included references not found in the original search results.

Results: We checked 4795 included references from 120 SRs against the original search results. Coverage of GS was high (97.2 %) but marginally lower than Embase and MEDLINE combined (97.5 %). MEDLINE on its own achieved 92.3 % coverage. Total recall of Embase/MEDLINE combined was 81.6 % for all included references, compared to GS at 72.8 % and MEDLINE alone at 72.6 %. However, only 46.4 % of the included references were among the downloadable first 1000 references in GS. When examining data for each SR, the traditional databases' recall was better than GS, even when taking into account included references listed beyond the first 1000 search results. Finally, precision of the first 1000 references of GS is comparable to searches in Embase and MEDLINE combined.

Conclusions: Although overall coverage and recall of GS are high for many searches, the database does not achieve full coverage as some researchers found in previous research. Further, being able to view only the first 1000 records in GS severely reduces its recall percentages. If GS would enable the browsing of records beyond the first 1000, its recall would increase but not sufficiently to be used alone in SR searching. Time needed to screen results would also increase considerably. These results support our assertion that neither GS nor one of the other databases investigated, is on its own, an acceptable database to support systematic review searching.

Publication types

  • Comparative Study

MeSH terms

  • Databases, Bibliographic / standards*
  • Humans
  • Information Storage and Retrieval / standards*
  • MEDLINE
  • Prospective Studies
  • Review Literature as Topic*