Textpresso for neuroscience: searching the full text of thousands of neuroscience research papers

Neuroinformatics. 2008 Sep;6(3):195-204. doi: 10.1007/s12021-008-9031-0. Epub 2008 Oct 24.

Abstract

Textpresso is a text-mining system for scientific literature. Its two major features are access to the full text of research papers and the development and use of categories of biological concepts, as well as categories that describe or relate objects. A search engine enables the user to search for one or a combination of these categories and/or keywords across an entire body of literature. Here we describe Textpresso for Neuroscience, part of the core Neuroscience Information Framework (NIF). The Textpresso site currently contains 67,500 full-text papers and 131,300 abstracts. We show that using categories in literature searches can make a pure keyword query more refined and meaningful. We also show how semantic queries can be formulated with categories only. We explain the build and content of the database and describe the main features of the web pages and the advanced search options. We also give detailed illustrations of the web service developed to provide programmatic access to Textpresso. This web service is used by the NIF interface to access Textpresso. The standalone website of Textpresso for Neuroscience can be accessed at http://www.textpresso.org/neuroscience/.
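
The idea of combining keywords with categories can be illustrated with a minimal Python sketch. It is not the Textpresso implementation or its web service API. The category names, the tiny lexicons, the sample sentences, and the functions `tokenize`, `matches`, and `search` are all hypothetical, and the sketch assumes that a category is simply a set of terms and that a query matches a sentence when every keyword and every requested category occurs in it.

```python
import re

# Hypothetical category lexicons for illustration only; these are assumptions,
# not the categories actually used by Textpresso.
CATEGORIES = {
    "brain_region": {"hippocampus", "amygdala", "cerebellum"},
    "regulation": {"regulates", "inhibits", "activates"},
}


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def matches(sentence, keywords=(), categories=()):
    """True if the sentence contains every keyword and at least one
    term from every requested category."""
    tokens = tokenize(sentence)
    has_keywords = all(k.lower() in tokens for k in keywords)
    has_categories = all(tokens & CATEGORIES[c] for c in categories)
    return has_keywords and has_categories


def search(sentences, keywords=(), categories=()):
    """Return the sentences that satisfy the combined query."""
    return [s for s in sentences if matches(s, keywords, categories)]


# Made-up sentences standing in for full-text content.
SENTENCES = [
    "BDNF activates signaling in the hippocampus.",
    "BDNF levels were measured in serum.",
    "Stress inhibits neurogenesis in the hippocampus.",
]

# A pure keyword query, the same query refined with categories,
# and a semantic query built from categories only.
keyword_hits = search(SENTENCES, keywords=["BDNF"])
refined_hits = search(
    SENTENCES, keywords=["BDNF"], categories=["brain_region", "regulation"]
)
category_only_hits = search(SENTENCES, categories=["brain_region", "regulation"])
```

In this toy setup the keyword query returns both BDNF sentences, adding the two categories narrows the result to the sentence that also names a brain region and a regulatory verb, and the category-only query finds sentences of the form "something regulates something in a brain region" regardless of the specific terms used.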

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Access to Information
  • Animals
  • Computational Biology / methods*
  • Computational Biology / organization & administration
  • Computational Biology / trends
  • Databases as Topic* / organization & administration
  • Databases as Topic* / trends
  • Humans
  • Information Storage and Retrieval / methods
  • Information Storage and Retrieval / trends
  • Internet / organization & administration
  • Internet / trends
  • Neurosciences / methods*
  • Neurosciences / organization & administration
  • Neurosciences / trends
  • Periodicals as Topic* / trends
  • Publishing / trends
  • Semantics
  • Software