Chrom-Lasso: a lasso regression-based model to detect functional interactions using Hi-C data

Brief Bioinform. 2021 Nov 5;22(6):bbab181. doi: 10.1093/bib/bbab181.

Abstract

Hi-C is a genome-wide assay that combines Chromosome Conformation Capture with high-throughput sequencing to decipher 3D chromatin organization in the nucleus. However, computational methods that use Hi-C data to detect functional interactions face two main challenges: correcting for various sources of bias, and identifying functional interactions supported by low counts of interacting fragments. We present Chrom-Lasso, a lasso linear regression model that removes complex biases in an assumption-free manner and identifies functionally interacting loci with increased power by incorporating the local distribution of reads surrounding the region of interest. We showed that interacting regions identified by Chrom-Lasso are more enriched for 5C-validated interactions and functional GWAS hits than those identified by GOTHiC and Fit-Hi-C. To further demonstrate the ability of Chrom-Lasso to detect interactions of functional importance, we performed time-series Hi-C and RNA-seq during T cell activation and exhaustion. We showed that the dynamic changes in gene expression and in the chromatin interactions identified by Chrom-Lasso were largely concordant with each other. Finally, we experimentally confirmed Chrom-Lasso's finding that Erbb3 was co-regulated with distinct neighboring genes at different states during T cell activation. Our results highlight Chrom-Lasso's utility in detecting weak functional interactions between cis-regulatory elements, such as promoters and enhancers.
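For readers unfamiliar with the lasso, the following is a minimal, generic sketch of L1-penalized linear regression fitted by coordinate descent. It is not the Chrom-Lasso implementation, and the abstract does not specify the model's design matrix or response. The function names, the toy matrix X (two hypothetical candidate predictors), the response y and the penalty lam are all assumptions chosen for illustration. The sketch shows only the property the lasso is typically used for: coefficients of weakly supported predictors are shrunk exactly to zero.

```python
# Generic lasso sketch (NOT the Chrom-Lasso code): minimizes
#   (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1
# by cyclic coordinate descent. All names and data are hypothetical.

def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Fit lasso coefficients for a list-of-rows matrix X and response y."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes
            # column j's own current contribution.
            rho = sum(col[i] * (residual[i] + col[i] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / z
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                beta[j] = new_bj
    return beta


# Toy data (assumed): the response is driven by the first predictor only;
# the second predictor sees only small noise.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, -0.05]
beta = lasso_coordinate_descent(X, y, lam=0.1)
print(beta)  # the second coefficient is shrunk exactly to zero
```

In this toy setting, the penalty removes the predictor supported only by noise while retaining a slightly shrunken coefficient for the informative one. How Chrom-Lasso defines its predictors and selects its penalty is described in the full article, not in this sketch.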

Keywords: 3D genomics; Hi-C data analysis; functional chromatin interactions; lasso regression.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • CD8-Positive T-Lymphocytes / immunology
  • CD8-Positive T-Lymphocytes / metabolism
  • Chromatin / chemistry*
  • Chromatin / genetics*
  • Databases, Genetic
  • Epistasis, Genetic
  • Gene Expression Regulation
  • Gene Library
  • Genome-Wide Association Study / methods
  • Genomics / methods*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Lymphocyte Activation / genetics
  • Lymphocyte Activation / immunology
  • Mice
  • Models, Molecular*
  • Models, Statistical*
  • Quantitative Trait Loci
  • Regression Analysis*
  • Software*

Substances

  • Chromatin