MCAST: scanning for cis-regulatory motif clusters

Charles E Grant; James Johnson; Timothy L Bailey; William Stafford Noble

doi:10.1093/bioinformatics/btv750

Bioinformatics. 2016 Apr 15;32(8):1217-9. doi: 10.1093/bioinformatics/btv750. Epub 2015 Dec 24.

Authors

Charles E Grant¹, James Johnson², Timothy L Bailey², William Stafford Noble³

Affiliations

¹ Department of Genome Sciences, University of Washington, Seattle, WA, USA.
² Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia.
³ Department of Genome Sciences, University of Washington, Seattle, WA, USA, and Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA.

Abstract

Precise regulatory control of genes, particularly in eukaryotes, frequently requires the joint action of multiple sequence-specific transcription factors. A cis-regulatory module (CRM) is a genomic locus that is responsible for gene regulation and that contains multiple transcription factor binding sites in close proximity. Over the past 15 years, many bioinformatics methods have been proposed that, given a collection of known transcription factor binding motifs, identify candidate CRMs within a genomic sequence as clusters of those motifs.

Results: The MCAST algorithm uses a hidden Markov model with a P-value-based scoring scheme to identify candidate CRMs. Here, we introduce a new version of MCAST that offers improved graphical output, a dynamic background model, statistical confidence estimates based on false discovery rate estimation and, most significantly, the ability to predict CRMs while taking into account epigenomic data such as DNase I sensitivity or histone modification data. We demonstrate the validity of MCAST's statistical confidence estimates and the utility of epigenomic priors in identifying CRMs.
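Illustrative note: the abstract does not say how MCAST derives its false discovery rate estimates, so the sketch below should not be read as MCAST's procedure. It shows the standard Benjamini-Hochberg conversion of P-values to q-values, which is one common way to attach FDR-based confidence to a ranked list of candidates. The function name bh_qvalues and the example P-values are hypothetical and chosen only for illustration.

```python
from typing import List


def bh_qvalues(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg q-values, returned in the input order.

    Generic textbook procedure, assumed here for illustration only;
    it is not taken from the MCAST source code.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that each q-value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Made-up p-values for five hypothetical candidate clusters.
example_pvalues = [0.001, 0.02, 0.03, 0.5, 0.04]
example_qvalues = bh_qvalues(example_pvalues)
```

Under this scheme, reporting every candidate whose q-value is at or below a chosen threshold keeps the expected fraction of false positives among the reported candidates at or below that threshold, provided the P-values themselves are well calibrated.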

Availability and implementation: MCAST is part of the MEME Suite software toolkit. A web server and source code are available at http://meme-suite.org and http://alternate.meme-suite.org

Contact: [email protected] or [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

Algorithms*
Binding Sites*
Genome
Humans
Regulatory Elements, Transcriptional
Sequence Analysis, DNA*
Software
Transcription Factors

Substances

Transcription Factors

Grants and funding

R01 GM103544/GM/NIGMS NIH HHS/United States