Predicting genome-wide tissue-specific enhancers via combinatorial transcription factor genomic occupancy analysis

FEBS Lett. 2025 Jan;599(1):100-119. doi: 10.1002/1873-3468.15030. Epub 2024 Oct 4.

Abstract

Enhancers are non-coding cis-regulatory elements crucial for transcriptional regulation. Mutations in enhancers can disrupt gene regulation, leading to disease phenotypes. Identifying enhancers and their tissue-specific activity is challenging because they lack stereotyped sequences. This study presents a sequence-based computational model that uses combinatorial transcription factor (TF) genomic occupancy to predict tissue-specific enhancers. Trained on diverse datasets, including ENCODE and VISTA Enhancer Browser data, the model predicted 25 000 forebrain-specific cis-regulatory modules (CRMs) in the human genome. Validation using biochemical features, disease-associated SNPs, and in vivo zebrafish analysis confirmed its effectiveness. This model aids in predicting enhancers lacking well-characterized chromatin features, complementing experimental approaches in tissue-specific enhancer discovery.
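The abstract does not describe the model's internals, so the following is only a toy illustration of what "combinatorial TF occupancy" features could look like, not the authors' method. It assumes each candidate region is reduced to the set of TFs predicted to occupy it, treats co-occupying TF pairs as features, and weights each pair by a smoothed log-odds ratio between known tissue-specific enhancers and background regions. The TF names (`TF_A` to `TF_D`), the function names, the pseudocount, and the training sets are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def pair_features(tfs):
    """Return all unordered TF pairs that co-occupy one region."""
    return set(combinations(sorted(set(tfs)), 2))


def train_log_odds(positives, negatives, pseudocount=1.0):
    """Weight each TF pair by its smoothed log-odds of appearing in
    positive (enhancer) regions versus negative (background) regions."""
    pos_counts = Counter(p for region in positives for p in pair_features(region))
    neg_counts = Counter(p for region in negatives for p in pair_features(region))
    weights = {}
    for pair in set(pos_counts) | set(neg_counts):
        pos_rate = (pos_counts[pair] + pseudocount) / (len(positives) + 2 * pseudocount)
        neg_rate = (neg_counts[pair] + pseudocount) / (len(negatives) + 2 * pseudocount)
        weights[pair] = log(pos_rate / neg_rate)
    return weights


def score_region(tfs, weights):
    """Sum the weights of all TF pairs present; unseen pairs contribute 0."""
    return sum(weights.get(pair, 0.0) for pair in pair_features(tfs))


# Hypothetical training data: each region is the set of TFs occupying it.
positives = [{"TF_A", "TF_B", "TF_C"}, {"TF_A", "TF_B"}, {"TF_A", "TF_C"}]
negatives = [{"TF_C", "TF_D"}, {"TF_B", "TF_D"}, {"TF_D"}]

weights = train_log_odds(positives, negatives)
enhancer_like = score_region({"TF_A", "TF_B"}, weights)
background_like = score_region({"TF_B", "TF_D"}, weights)
```

In this sketch a region carrying a TF pair enriched among the positive examples scores above zero, while a pair seen mostly in background scores below zero. A real pipeline would derive occupancy from motif scans or ChIP-seq data and would need held-out validation.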

Keywords: DNase I hypersensitive sites; cis‐regulatory modules; forebrain; histone modification; transcription factors; zebrafish.

MeSH terms

  • Amino Acid Motifs / genetics
  • Animals
  • Computer Simulation*
  • Gene Expression Regulation
  • Genetic Predisposition to Disease
  • Genome, Human
  • Genomics* / methods
  • Humans
  • Mice
  • Organ Specificity / genetics
  • Prosencephalon* / metabolism
  • Protein Binding
  • Sequence Analysis, DNA
  • Transcription Factors* / chemistry
  • Transcription Factors* / genetics
  • Zebrafish

Substances

  • Transcription Factors