Identification of co-occurring transcription factor binding sites from DNA sequence using clustered position weight matrices

Young Min Oh; Jong Kyoung Kim; Seungjin Choi; Joo-Yeon Yoo

doi:10.1093/nar/gkr1252

Nucleic Acids Res. 2012 Mar;40(5):e38. doi: 10.1093/nar/gkr1252. Epub 2011 Dec 19.

Authors

Young Min Oh¹, Jong Kyoung Kim, Seungjin Choi, Joo-Yeon Yoo

Affiliation

¹ Department of Life Sciences, Pohang University of Science and Technology, Pohang, Republic of Korea.

Abstract

Accurate prediction of transcription factor binding sites (TFBSs) is a prerequisite for identifying cis-regulatory modules that underlie transcriptional regulatory circuits encoded in the genome. Here, we present a computational framework for detecting TFBSs when multiple position weight matrices (PWMs) are available for a transcription factor (TF). Grouping the multiple PWMs of a TF by sequence similarity improves the specificity of TFBS prediction, as evaluated on multiple genome-wide ChIP-Seq data sets from 26 TFs. Z-scores of the area under the receiver operating characteristic curve (AUC) were calculated for 368 TFs and used to statistically identify regulatory motifs that co-occur in TF-bound ChIP loci. We evaluated motifs that co-occur with the empirical binding of E2F, JUN or MYC under basal or stimulated conditions. The results indicate that our method can be used to systematically identify the co-occurring motifs of a TF under given conditions.
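
The quantities named in the abstract (PWM scores, AUC, and Z-scores of AUC values across TFs) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the pseudocount, the uniform background, the forward-strand-only scan, and the choice to score a group of PWMs by its best-matching member are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev

BASES = "ACGT"

def pwm_log_odds(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into a log-odds PWM.
    The pseudocount and uniform background are illustrative assumptions."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def best_window_score(pwm, seq):
    """Highest log-odds score over all windows of seq (forward strand only)."""
    width = len(pwm)
    scores = [
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    ]
    # A sequence shorter than the PWM has no window and gets -inf.
    return max(scores, default=float("-inf"))

def cluster_score(pwms, seq):
    """Score seq with a group of PWMs by its best-matching member
    (a hypothetical way to combine a cluster, not taken from the paper)."""
    return max(best_window_score(p, seq) for p in pwms)

def auc(pos_scores, neg_scores):
    """AUC as the probability that a bound (positive) sequence outscores
    an unbound (negative) one; ties count one half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

def z_scores(values):
    """Standardize values (e.g. one AUC per TF) to zero mean and unit
    population standard deviation. Requires values that are not all equal."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]
```

Under these assumptions, one would compute an AUC per TF by scoring ChIP-bound and background sequences with `cluster_score`, then pass the list of AUC values to `z_scores`; TFs with unusually high Z-scores would be candidate co-occurring motifs.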

Publication types

Research Support, Non-U.S. Gov't
Validation Study

MeSH terms

Base Sequence
Binding Sites
Conserved Sequence
E2F Transcription Factors / metabolism
Nucleotide Motifs
Position-Specific Scoring Matrices*
Proto-Oncogene Proteins c-jun / metabolism
Regulatory Elements, Transcriptional*
Sequence Analysis, DNA*
Software
Transcription Factors / metabolism*

Substances

E2F Transcription Factors
Proto-Oncogene Proteins c-jun
Transcription Factors