Motivation: Our study aimed to identify biologically relevant transcription factors (TFs) that control the expression of a set of co-expressed or co-regulated genes.
Results: We developed a fully automated pipeline, Motif Over Representation Analysis (MORA), to detect enrichment of known TF binding motifs in any set of query sequences. MORA performed better than or comparably to five other TF-prediction tools when evaluated on hundreds of differentially expressed gene sets and ChIP-seq datasets derived from known TFs. Additionally, we developed EnsembleTFpredictor, which draws on multiple TF-prediction tools to provide a list of functional TFs ranked by prediction confidence. When applied to the test datasets, EnsembleTFpredictor not only identified the target TF but also revealed many TFs known to cooperate with the target TF in the corresponding biological systems. MORA and EnsembleTFpredictor have been used in two publications, demonstrating their utility in guiding experimental design and in revealing novel biological insights.
Copyright: © 2023 Boyer et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.