Combination of visual and textual similarity retrieval from medical documents

Stud Health Technol Inform. 2009:150:841-5.

Abstract

Medical visual information retrieval has been an active research area over the past ten years, as an increasing number of images are produced digitally and have become available in patient records, scientific literature, and other medical documents. Most visual retrieval systems concentrate on images only, but it has become apparent that retrieving similar images alone is of limited interest and that retrieving similar documents is an important task in its own right. Most medical institutions, as well as the World Health Organization (WHO), produce many complex documents. Searching them, including by visual content, can help find important information and also facilitates the reuse of document content and images. The work described in this paper is based on a proposal by the WHO, which produces large numbers of documents from studies and for training. The majority of these documents are in complex formats such as PDF, Microsoft Word, Excel, or PowerPoint. The goal is to create an information retrieval system that allows easy addition of documents and search by keywords and visual content. Lucene is used for text retrieval and the GNU Image Finding Tool (GIFT) for image retrieval. A Web 2.0 interface allows easy upload as well as simple searching.
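The abstract does not say how the keyword results and the visual results are merged into one document ranking. As an illustration only, the following sketch assumes a simple late-fusion scheme: each image score is lifted to its parent document, both score lists are min-max normalized, and a weighted sum orders the documents. The function names, the `text_weight` parameter, and the sample scores are all hypothetical; nothing here calls the Lucene or GIFT APIs.

```python
# Hypothetical late-fusion sketch; not the method described in the paper.

def normalize(scores):
    """Min-max scale a {id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def images_to_documents(image_scores, image_to_doc):
    """Give each document the score of its best-matching image."""
    doc_scores = {}
    for image_id, score in image_scores.items():
        doc_id = image_to_doc[image_id]
        doc_scores[doc_id] = max(doc_scores.get(doc_id, score), score)
    return doc_scores


def fuse(text_scores, visual_scores, text_weight=0.5):
    """Rank documents by a weighted sum of normalized text and visual scores."""
    text_norm = normalize(text_scores)
    visual_norm = normalize(visual_scores)
    fused = {
        doc_id: text_weight * text_norm.get(doc_id, 0.0)
        + (1.0 - text_weight) * visual_norm.get(doc_id, 0.0)
        for doc_id in set(text_norm) | set(visual_norm)
    }
    # Sort by descending score, breaking ties by document name.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores standing in for the output of a text engine and an image engine.
text_scores = {"report.pdf": 4.0, "slides.ppt": 2.0, "notes.doc": 1.0}
image_scores = {"img1": 0.9, "img2": 0.3, "img3": 0.6}
image_to_doc = {"img1": "slides.ppt", "img2": "slides.ppt", "img3": "atlas.pdf"}

visual_scores = images_to_documents(image_scores, image_to_doc)
ranking = fuse(text_scores, visual_scores, text_weight=0.5)
for doc_id, score in ranking:
    print(f"{doc_id}\t{score:.3f}")
```

With these sample values, `slides.ppt` ranks first because it matches moderately on text and strongly on image content. Changing `text_weight` shifts the balance between the two modalities.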

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Audiovisual Aids*
  • Humans
  • Information Storage and Retrieval / methods*
  • Medical Informatics / organization & administration*
  • User-Computer Interface*