Unifying generative and discriminative learning principles

Jens Keilwagen; Jan Grau; Stefan Posch; Marc Strickert; Ivo Grosse

doi:10.1186/1471-2105-11-98

BMC Bioinformatics. 2010 Feb 22;11:98. doi: 10.1186/1471-2105-11-98.

Authors

Jens Keilwagen¹, Jan Grau, Stefan Posch, Marc Strickert, Ivo Grosse

Affiliation

¹ Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany.

Abstract

Background: The recognition of functional binding sites in genomic DNA remains one of the fundamental challenges of genome research. Over the last decades, a plethora of different, well-adapted models has been developed, but little attention has been paid to the development of different and similarly well-adapted learning principles. Only recently has it been noticed that discriminative learning principles can also be superior to generative ones in diverse bioinformatics applications.

Results: Here, we propose a generalization of generative and discriminative learning principles containing the maximum likelihood, maximum a posteriori, maximum conditional likelihood, maximum supervised posterior, generative-discriminative trade-off, and penalized generative-discriminative trade-off learning principles as special cases, and we illustrate its efficacy for the recognition of vertebrate transcription factor binding sites.
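The abstract does not state the objective function itself. One common way to read such a generalization, offered here only as an illustrative sketch, is as a weighted sum of three log terms: the conditional likelihood (discriminative), the likelihood (generative), and a prior on the parameters. The function name, the weight ordering, and the specific weight settings below are assumptions made for illustration. They are not taken from the abstract and are not the Jstacs API.

```python
import math


def unified_objective(log_cond_lik, log_lik, log_prior, beta):
    """Weighted sum of three log terms (hypothetical helper, not from Jstacs).

    beta = (w_discriminative, w_generative, w_prior); the weights are assumed
    to be non-negative and to sum to 1.
    """
    w_disc, w_gen, w_prior = beta
    if min(beta) < 0 or not math.isclose(sum(beta), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return w_disc * log_cond_lik + w_gen * log_lik + w_prior * log_prior


# Assumed weight settings under which the weighted sum reduces to four of the
# named principles (up to a constant factor that does not change the maximizer).
SPECIAL_CASES = {
    "maximum likelihood": (0.0, 1.0, 0.0),
    "maximum conditional likelihood": (1.0, 0.0, 0.0),
    "maximum a posteriori": (0.0, 0.5, 0.5),
    "maximum supervised posterior": (0.5, 0.0, 0.5),
}

if __name__ == "__main__":
    # Toy log values, chosen only to show how the weights combine the terms.
    for name, beta in SPECIAL_CASES.items():
        print(name, unified_objective(-1.0, -2.0, -3.0, beta))
```

Under this reading, the two trade-off principles would correspond to intermediate weightings of the discriminative and generative terms, without and with a prior term respectively. That, too, is an interpretation rather than something the abstract spells out.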

Conclusions: We find that the proposed learning principle helps to improve the recognition of transcription factor binding sites, enabling better computational approaches for extracting as much information as possible from valuable wet-lab data. We make all implementations available in the open-source library Jstacs so that this learning principle can be easily applied to other classification problems in the field of genome and epigenome analysis.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
DNA / chemistry
DNA / metabolism
Discriminant Analysis
Genome
Genomics
Information Storage and Retrieval / methods*
Likelihood Functions
Pattern Recognition, Automated

Substances

DNA