MedlineR: an open source library in R for Medline literature data mining

Bioinformatics. 2004 Dec 12;20(18):3659-61. doi: 10.1093/bioinformatics/bth404. Epub 2004 Jul 29.

Abstract

Summary: We describe MedlineR, an open source library written in the R programming language for Medline literature data mining. The library includes programs to query Medline through the NCBI PubMed database, to construct the co-occurrence matrix, and to visualize the network topology of query terms. Because the library is open source, users can extend it freely in R, a statistical programming language. To demonstrate its utility, we built an application that analyzes term associations in only 10 lines of code. We provide MedlineR as a foundation library on which bioinformaticians and statisticians can build more sophisticated literature data mining applications.
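
The three steps named in the summary (query, co-occurrence matrix, network of terms) can be illustrated with a minimal sketch. It is written in Python rather than R and does not reproduce MedlineR's own functions, which this abstract does not list. All function names, the example terms, and the PMID sets below are hypothetical, and the sketch assumes co-occurrence is counted as the number of articles retrieved by both terms. The only external fact relied on is the NCBI E-utilities ESearch endpoint; no network request is sent.

```python
from itertools import combinations
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint for PubMed queries.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, retmax=100):
    """Build an ESearch URL for a PubMed query; the request itself is not sent."""
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


def cooccurrence_matrix(hits):
    """Count shared articles for every pair of terms.

    `hits` maps each query term to the set of PMIDs it retrieved.
    The diagonal holds the number of articles found for each term alone.
    """
    terms = sorted(hits)
    matrix = {a: {b: 0 for b in terms} for a in terms}
    for a in terms:
        matrix[a][a] = len(hits[a])
    for a, b in combinations(terms, 2):
        shared = len(hits[a] & hits[b])
        matrix[a][b] = matrix[b][a] = shared
    return terms, matrix


def network_edges(terms, matrix, min_shared=1):
    """List term pairs that share at least `min_shared` articles."""
    return [
        (a, b, matrix[a][b])
        for a, b in combinations(terms, 2)
        if matrix[a][b] >= min_shared
    ]


# Made-up PMID sets for illustration only; these are not real search results.
example_hits = {
    "TP53": {1, 2, 3, 4},
    "MDM2": {2, 3, 5},
    "BRCA1": {4, 6},
}

terms, matrix = cooccurrence_matrix(example_hits)
edges = network_edges(terms, matrix)
for a, b, weight in edges:
    print(a, b, weight)
```

The edge list is the input a graph-drawing tool would need to display the term network; pairs with no shared articles are dropped so that the plot shows only observed associations.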

Availability: The library is available from http://dbsr.duke.edu/pub/MedlineR.

MeSH terms

  • Abstracting and Indexing / methods
  • Database Management Systems*
  • Gene Expression Regulation / physiology*
  • Information Storage and Retrieval / methods*
  • Internet*
  • Libraries
  • MEDLINE*
  • Models, Biological
  • Natural Language Processing*
  • Oligonucleotide Array Sequence Analysis / methods
  • Pattern Recognition, Automated
  • Periodicals as Topic
  • Proteomics / methods
  • Signal Transduction / physiology*
  • Software
  • User-Computer Interface*