EBIMed--text crunching to gather facts for proteins from Medline

Bioinformatics. 2007 Jan 15;23(2):e237-44. doi: 10.1093/bioinformatics/btl302.

Abstract

To allow efficient and systematic retrieval of statements from Medline we have developed EBIMed, a service that combines document retrieval with co-occurrence-based analysis of Medline abstracts. Upon keyword query, EBIMed retrieves the abstracts from EMBL-EBI's installation of Medline and filters for sentences that contain biomedical terminology maintained in public bioinformatics resources. The extracted sentences and terminology are used to generate an overview table on proteins, Gene Ontology (GO) annotations, drugs and species used in the same biological context. All terms in retrieved abstracts and extracted sentences are linked to their entries in biomedical databases. We assessed the quality of the identification of terms and relations in the retrieved sentences. More than 90% of the protein names found indeed represented a protein. According to the analysis of four protein-protein pairs from the Wnt pathway we estimated that 37% of the statements containing such a pair mentioned a meaningful interaction and clarified the interaction of Dkk with LRP. We conclude that EBIMed improves access to information where proteins and drugs are involved in the same biological process, e.g. statements with GO annotations of proteins, protein-protein interactions and effects of drugs on proteins.

Availability: Available at http://www.ebi.ac.uk/Rebholz-srv/ebimed

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Abstracting and Indexing / methods*
  • Algorithms
  • Artificial Intelligence
  • Database Management Systems
  • Information Storage and Retrieval / methods*
  • MEDLINE*
  • Natural Language Processing*
  • Proteins / chemistry
  • Proteins / classification*
  • Proteins / genetics
  • Proteins / metabolism
  • Software*
  • Terminology as Topic*

Substances

  • Proteins