Biomedical Literature Mining and Its Components

Methods Mol Biol. 2022:2496:1-16. doi: 10.1007/978-1-0716-2305-3_1.

Abstract

Published biomedical articles are the best source of knowledge for understanding the importance of biomedical entities such as diseases and drugs and their roles in different patient population groups. The volume of biomedical literature, both available and newly published, is increasing at an exponential rate with the use of large-scale experimental techniques. Manual extraction of such information is becoming extremely difficult because of the sheer volume of this literature. As an alternative, text mining approaches have attracted much interest within biomedicine because they automatically extract such information from unstructured biomedical text into a more structured format. This chapter presents a text mining protocol that extracts patient population information and identifies disease and drug mentions in PubMed titles and abstracts, together with a simple information retrieval approach that retrieves a list of relevant documents for a user query. The protocol is useful for retrieving information on drugs for patients with a specific disease. It covers three major text mining tasks: information retrieval, information extraction, and knowledge discovery.
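
To make the retrieval and extraction tasks concrete, the following is a minimal sketch and not the chapter's actual protocol. It assumes a toy three-document corpus in place of PubMed titles and abstracts, a basic TF-IDF ranking for the "simple information retrieval approach", and dictionary lookup for drug and disease mentions. The names DOCS, DRUGS, DISEASES, rank, and find_mentions are all hypothetical and introduced only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus standing in for PubMed titles/abstracts (an assumption).
DOCS = {
    "doc1": "Metformin improves glycemic control in elderly patients with type 2 diabetes.",
    "doc2": "Aspirin use in children with influenza is associated with Reye syndrome.",
    "doc3": "Gene expression profiling of yeast under heat stress.",
}

# Hypothetical mini-dictionaries; a real pipeline would rely on curated vocabularies.
DRUGS = {"metformin", "aspirin"}
DISEASES = {"type 2 diabetes", "influenza", "reye syndrome"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, docs):
    """Information retrieval: rank documents by summed TF-IDF of the query terms."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    scores = {}
    for doc_id, doc_counts in counts.items():
        score = 0.0
        for term in set(tokenize(query)):
            # Document frequency: number of documents containing the term.
            df = sum(1 for c in counts.values() if term in c)
            if df:
                score += doc_counts[term] * math.log((1 + n_docs) / df)
        if score > 0:
            scores[doc_id] = score
    # Only documents matching at least one query term are returned.
    return sorted(scores, key=scores.get, reverse=True)


def find_mentions(text, vocabulary):
    """Information extraction: return vocabulary terms found as whole words."""
    lowered = text.lower()
    return sorted(
        term for term in vocabulary
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    )


if __name__ == "__main__":
    for doc_id in rank("diabetes patients", DOCS):
        text = DOCS[doc_id]
        print(doc_id, find_mentions(text, DRUGS), find_mentions(text, DISEASES))
```

Pairing the drug and disease mentions found in each retrieved document gives a crude starting point for the knowledge discovery task. Exact dictionary matching misses synonyms, abbreviations, and spelling variants, so it should be read only as an illustration of the idea.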

Keywords: Information extraction; Information retrieval; Knowledge discovery; Literature mining; Natural language processing; Text mining.

MeSH terms

  • Data Mining* / methods
  • Humans
  • PubMed
  • Publications*