Machine learning uncovers cell identity regulator by histone code

Nat Commun. 2020 Jun 1;11(1):2696. doi: 10.1038/s41467-020-16539-4.

Abstract

Conversion between cell types, e.g., by induced expression of master transcription factors, holds great promise for cellular therapy. However, our ability to manipulate cell identity is constrained by incomplete information on cell identity genes (CIGs) and the regulation of their expression. Here, we develop CEFCIG, an artificial intelligence framework to uncover CIGs and define their master regulators. Using machine learning, CEFCIG reveals unique histone codes for the transcriptional regulation of reported CIGs and uses these codes to predict CIGs and their master regulators with high accuracy. Applying CEFCIG to 1,005 epigenetic profiles, we uncover the landscape of regulatory networks for identity genes in individual cell and tissue types. Together, this work provides insights into cell identity regulation and delivers a powerful technique to facilitate regenerative medicine.
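
The general idea of learning from histone-mark features of reported CIGs and then scoring other genes can be illustrated with a minimal sketch. This is not the CEFCIG implementation: the abstract does not specify the model or the features, so plain logistic regression is used as a stand-in, and the two feature names and all numeric values below are invented purely for illustration.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    X: list of feature vectors, y: list of 0/1 labels.
    Returns (weights, bias).
    """
    n = len(X)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log-loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Probability that a gene with features x is CIG-like."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical, hand-made toy data (NOT from the paper). Each row is
# [activating_mark_breadth, repressive_mark_signal], scaled to 0..1.
X_train = [
    [0.9, 0.1], [0.8, 0.2], [0.7, 0.1],  # labeled as reported CIGs
    [0.2, 0.7], [0.1, 0.9], [0.3, 0.8],  # labeled as non-CIGs
]
y_train = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(X_train, y_train)

# Score two unseen, equally hypothetical genes.
score_candidate = predict_proba(weights, bias, [0.85, 0.15])
score_background = predict_proba(weights, bias, [0.15, 0.85])
```

In this toy setting the learned weights act as a crude "code": a positive weight on the first feature and a negative weight on the second. A real analysis would need many marks, many genes, and held-out evaluation before any accuracy claim could be made.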

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Cells / classification*
  • Cells / cytology
  • Cells / metabolism*
  • Chromatin Immunoprecipitation Sequencing / statistics & numerical data
  • Databases, Genetic / statistics & numerical data
  • Epigenesis, Genetic
  • Gene Expression Regulation
  • Gene Regulatory Networks
  • Histone Code*
  • Human Umbilical Vein Endothelial Cells / cytology
  • Human Umbilical Vein Endothelial Cells / metabolism
  • Humans
  • Machine Learning*
  • Phenotype
  • Pluripotent Stem Cells / cytology
  • Pluripotent Stem Cells / metabolism
  • RNA-Seq / statistics & numerical data
  • Regenerative Medicine
  • Transcription Factors / metabolism

Substances

  • Transcription Factors