SciScribe: Automating and contextualizing literature reviews in cardiac surgery

J Thorac Cardiovasc Surg. 2024 Sep 14:S0022-5223(24)00809-2. doi: 10.1016/j.jtcvs.2024.09.014. Online ahead of print.

Abstract

Background: Writing structured content reviews and guidelines has grown more demanding and more complex. We propose to go beyond search tools and toward curation tools by automating the time-consuming and repetitive steps of extracting and organizing information.

Methods: SciScribe is built as an extension of IBM's Deep Search platform, which provides document processing and search capabilities. This platform was used to ingest and search full-content publications from PubMed Central (PMC) and official, structured records from the ClinicalTrials and OpenPayments databases. Author names and NCT numbers mentioned within the publications were used to link publications to these official records as context. Search strategies include traditional keyword-based search as well as natural language question answering via large language models (LLMs).
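
The NCT-based linking step described above can be illustrated with a minimal sketch. This is not SciScribe's implementation: it assumes only that NCT numbers follow the registry format of "NCT" plus eight digits, and the function names, the `registry` dictionary, and the sample identifiers are hypothetical stand-ins for a lookup against ClinicalTrials records.

```python
import re

# Assumption: NCT identifiers are "NCT" followed by exactly eight digits.
NCT_PATTERN = re.compile(r"\bNCT\d{8}\b")


def extract_nct_ids(text):
    """Return the unique NCT numbers in text, in order of first appearance."""
    found = []
    # Upper-case the text so that mentions such as "nct01234567" are also caught.
    for match in NCT_PATTERN.findall(text.upper()):
        if match not in found:
            found.append(match)
    return found


def link_trials(text, registry):
    """Map each NCT number in text to its record in registry (None if missing).

    registry is a hypothetical dict keyed by NCT number; a real system would
    query the trial database instead.
    """
    return {nct: registry.get(nct) for nct in extract_nct_ids(text)}


if __name__ == "__main__":
    # Hypothetical publication text and registry contents, for illustration only.
    registry = {"NCT01234567": {"title": "Hypothetical trial record"}}
    text = "Patients were enrolled under NCT01234567 (see also nct07654321)."
    print(link_trials(text, registry))
```

Identifiers that appear in the text but not in the registry map to `None`, which keeps unresolved mentions visible for manual review rather than silently dropping them.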

Results: SciScribe is a web-based tool that helps accelerate literature reviews through four key features: (1) accumulating a personal collection of publications from PMC or other sources; (2) incorporating contextual information from external databases into the presented papers, promoting a more informed assessment by readers; (3) semantic question answering over documents to quickly assess relevance and hierarchical organization; and (4) semantic question answering for each document within a collection, with the answers collated into tables.
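
Feature (4) can be pictured as a loop that asks every question of every document and stores the answers as one table row per document. The sketch below is an illustration under stated assumptions, not the tool's code: `collate_answers`, `to_csv`, and `placeholder_answer` are hypothetical names, and `placeholder_answer` simply returns the first sentence of the text where a real system would call an LLM.

```python
import csv
import io


def placeholder_answer(question, text):
    """Stand-in for an LLM call: returns the first sentence of the text."""
    return text.split(".")[0].strip()


def collate_answers(documents, questions, answer_fn):
    """Build one row per document, with one column per question."""
    rows = []
    for doc_id, text in documents.items():
        row = {"document": doc_id}
        for question in questions:
            row[question] = answer_fn(question, text)
        rows.append(row)
    return rows


def to_csv(rows, questions):
    """Serialize the collated rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["document", *questions], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical document identifier and text, for illustration only.
    documents = {"PMC1": "Mortality was 2%. Follow-up lasted one year."}
    questions = ["What was the mortality?"]
    print(to_csv(collate_answers(documents, questions, placeholder_answer), questions))
```

Passing the answering function as a parameter keeps the collation logic independent of whichever model or service produces the answers.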

Conclusions: Emergent language processing techniques are opening new avenues to accelerate and enhance the literature review process, for which we have demonstrated a use case implementation in cardiac surgery. SciScribe automates and accelerates this process, mitigates errors associated with repetition and fatigue, and contextualizes results by linking relevant external data sources instantaneously.

Keywords: ClinicalTrials; GenAI; OpenPayments; PubMed; contextualization; generative AI; large language model; literature review; literature search.