Automatic construction of knowledge base from biological papers

Proc Int Conf Intell Syst Mol Biol. 1997:5:218-25.

Abstract

We designed a system, IFBP (Information Finding from Biological Papers), that acquires domain-specific knowledge from human-written biological papers. IFBP is divided into three phases: Information Retrieval (IR), Information Extraction (IE), and Dictionary Construction (DC). We propose a query modification method for IR that uses an automatically constructed thesaurus, and a statistical keyword prediction method for IE. A dictionary of domain-specific terms, one of the central knowledge sources for knowledge acquisition, is also constructed automatically in the DC phase. IFBP is currently used to construct the Transcription Factor DataBase (TFDB) and shows good performance. Because the knowledge base construction model adopted in IFBP runs entirely automatically, the system can be easily ported across domains.
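The abstract does not say how the thesaurus is built or how it modifies a query. As a rough illustration of the general idea only, the sketch below assumes a thesaurus derived from term co-occurrence within documents and a query that is expanded with each term's most frequent co-occurring terms. The function names `build_thesaurus` and `expand_query`, the `top_k` parameter, and the toy corpus are hypothetical and are not taken from IFBP.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_thesaurus(docs, top_k=1):
    """Map each term to the top_k terms it co-occurs with most often.

    Assumption: co-occurrence within one document stands in for
    relatedness; the actual IFBP thesaurus construction may differ.
    """
    cooc = defaultdict(Counter)
    for doc in docs:
        # Count each unordered pair of distinct terms once per document.
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return {t: [w for w, _ in c.most_common(top_k)] for t, c in cooc.items()}


def expand_query(query, thesaurus):
    """Append related terms from the thesaurus to the original query terms."""
    terms = query.lower().split()
    expanded = list(terms)
    for t in terms:
        for related in thesaurus.get(t, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


# Toy corpus (hypothetical), one string per document.
docs = [
    "transcription factor binds promoter",
    "transcription factor activates gene",
    "kinase phosphorylates substrate",
]
thesaurus = build_thesaurus(docs, top_k=1)
print(expand_query("factor", thesaurus))
```

In this toy corpus "factor" co-occurs with "transcription" in two documents and with every other term in at most one, so the expanded query gains "transcription". A term absent from the thesaurus leaves the query unchanged.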

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Bayes Theorem
  • Biology*
  • Cluster Analysis
  • Computer Communication Networks
  • Dictionaries as Topic
  • Humans
  • Information Storage and Retrieval
  • Models, Theoretical
  • Publications*
  • Vocabulary, Controlled