Unifying generative and discriminative learning principles

BMC Bioinformatics. 2010 Feb 22:11:98. doi: 10.1186/1471-2105-11-98.

Abstract

Background: The recognition of functional binding sites in genomic DNA remains one of the fundamental challenges of genome research. During the last decades, a plethora of different and well-adapted models has been developed, but only little attention has been payed to the development of different and similarly well-adapted learning principles. Only recently it was noticed that discriminative learning principles can be superior over generative ones in diverse bioinformatics applications, too.

Results: Here, we propose a generalization of generative and discriminative learning principles containing the maximum likelihood, maximum a posteriori, maximum conditional likelihood, maximum supervised posterior, generative-discriminative trade-off, and penalized generative-discriminative trade-off learning principles as special cases, and we illustrate its efficacy for the recognition of vertebrate transcription factor binding sites.

Conclusions: We find that the proposed learning principle helps to improve the recognition of transcription factor binding sites, enabling better computational approaches for extracting as much information as possible from valuable wet-lab data. We make all implementations available in the open-source library Jstacs so that this learning principle can be easily applied to other classification problems in the field of genome and epigenome analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • DNA / chemistry
  • DNA / metabolism
  • Discriminant Analysis
  • Genome
  • Genomics
  • Information Storage and Retrieval / methods*
  • Likelihood Functions
  • Pattern Recognition, Automated

Substances

  • DNA