Predicting genome-wide tissue-specific enhancers via combinatorial transcription factor genomic occupancy analysis

FEBS Lett. 2024 Oct 4. doi: 10.1002/1873-3468.15030. Online ahead of print.

Abstract

Enhancers are non-coding cis-regulatory elements crucial for transcriptional regulation. Mutations in enhancers can disrupt gene regulation, leading to disease phenotypes. Identifying enhancers and their tissue-specific activity is challenging because they lack stereotyped sequences. This study presents a sequence-based computational model that uses combinatorial transcription factor (TF) genomic occupancy to predict tissue-specific enhancers. Trained on diverse datasets, including ENCODE and VISTA Enhancer Browser data, the model predicted 25 000 forebrain-specific cis-regulatory modules (CRMs) in the human genome. Validation using biochemical features, disease-associated SNPs, and in vivo zebrafish analysis confirmed its effectiveness. The model aids in predicting enhancers that lack well-characterized chromatin features, complementing experimental approaches to tissue-specific enhancer discovery.
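The abstract does not describe the model's internals, so the following is only a toy sketch of what "combinatorial TF occupancy" features could look like, not the authors' method. It assumes each genomic region is summarized as the set of TFs occupying it, builds single-TF and TF-pair features, and weights them by a smoothed log-odds of appearing in known enhancers versus background regions. The TF names (`TF_A`, `TF_B`, `TF_C`), the function names, the pseudocount, and the example regions are all hypothetical.

```python
from itertools import combinations
from math import log


def combo_features(tfs, max_size=2):
    """Return every TF combination (as a sorted tuple) of size 1..max_size."""
    tfs = sorted(set(tfs))
    feats = set()
    for k in range(1, max_size + 1):
        feats.update(combinations(tfs, k))
    return feats


def train_log_odds(positives, negatives, max_size=2, pseudocount=1.0):
    """Smoothed log-odds of each feature occurring in positive vs. negative regions."""
    pos_feats = [combo_features(r, max_size) for r in positives]
    neg_feats = [combo_features(r, max_size) for r in negatives]
    vocab = set().union(*pos_feats, *neg_feats)
    weights = {}
    for f in vocab:
        # Fraction of regions containing the feature, with additive smoothing.
        p = (sum(f in s for s in pos_feats) + pseudocount) / (len(pos_feats) + 2 * pseudocount)
        q = (sum(f in s for s in neg_feats) + pseudocount) / (len(neg_feats) + 2 * pseudocount)
        weights[f] = log(p / q)
    return weights


def score(region_tfs, weights, max_size=2):
    """Sum the weights of all features present in a region; unseen features add 0."""
    return sum(weights.get(f, 0.0) for f in combo_features(region_tfs, max_size))


# Hypothetical training data: TF sets for known enhancers and for background regions.
positives = [{"TF_A", "TF_B"}, {"TF_A", "TF_B", "TF_C"}, {"TF_A", "TF_B"}]
negatives = [{"TF_A"}, {"TF_B"}, {"TF_C"}, {"TF_A", "TF_C"}]

weights = train_log_odds(positives, negatives)
co_bound = score({"TF_A", "TF_B"}, weights)
single = score({"TF_A"}, weights)
```

In this toy data the pair (`TF_A`, `TF_B`) co-occurs only in the positive set, so a region bound by both scores higher than a region bound by either TF alone. That is the intuition behind treating occupancy combinatorially rather than one factor at a time.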

Keywords: DNase I hypersensitive sites; cis‐regulatory modules; forebrain; histone modification; transcription factors; zebrafish.