Automatic construction of knowledge base from biological papers

Y Ohta; Y Yamamoto; T Okazaki; I Uchiyama; T Takagi

Proc Int Conf Intell Syst Mol Biol. 1997;5:218-25.

Authors

Y Ohta¹, Y Yamamoto, T Okazaki, I Uchiyama, T Takagi

Affiliation

¹ Human Genome Center, University of Tokyo, Japan.

PMID: 9322040

Abstract

We designed a system, IFBP (Information Finding from Biological Papers), that acquires domain-specific knowledge from human-written biological papers. IFBP is divided into three phases: Information Retrieval (IR), Information Extraction (IE), and Dictionary Construction (DC). We propose a query modification method using an automatically constructed thesaurus for IR and a statistical keyword prediction method for IE. A dictionary of domain-specific terms, which is one of the central knowledge sources for the task of knowledge acquisition, is also constructed automatically in the DC phase. IFBP is currently used for constructing the Transcription Factor DataBase (TFDB) and shows good performance. Because the model of knowledge base construction adopted in IFBP is carried out entirely automatically, the system can be easily ported across domains.
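
The abstract does not say how the thesaurus is built or how queries are modified, so the following is only an illustrative sketch of the general idea behind thesaurus-based query modification in the IR phase. It assumes, purely for illustration, that the thesaurus is derived from term co-occurrence within documents; the function names `build_thesaurus` and `expand_query`, the `top_k` parameter, and the toy documents are all hypothetical and are not taken from IFBP.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_thesaurus(docs, top_k=2):
    """Map each term to the top_k terms it most often co-occurs with.

    Assumption: co-occurrence within a document stands in for relatedness.
    The abstract does not describe the actual construction method.
    """
    cooc = defaultdict(Counter)
    for doc in docs:
        # Count each unordered pair of distinct terms once per document.
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    thesaurus = {}
    for term, counts in cooc.items():
        # Rank by descending count, breaking ties alphabetically.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        thesaurus[term] = [t for t, _ in ranked[:top_k]]
    return thesaurus


def expand_query(query, thesaurus):
    """Append related terms from the thesaurus to the original query terms."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for related in thesaurus.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


# Toy corpus (hypothetical, not data from the paper).
docs = [
    "transcription factor binds promoter",
    "transcription factor activates gene",
    "promoter region upstream gene",
]
thesaurus = build_thesaurus(docs, top_k=2)
expanded = expand_query("transcription", thesaurus)
print(expanded)
```

In this toy setup the original query term always stays first and related terms are appended after it, so a retrieval engine could weight the added terms lower than the original ones if desired.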

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Artificial Intelligence*
Bayes Theorem
Biology*
Cluster Analysis
Computer Communication Networks
Dictionaries as Topic
Humans
Information Storage and Retrieval
Models, Theoretical
Publications*
Vocabulary, Controlled