In silico design of DNA sequences for in vivo nucleosome positioning

Nucleic Acids Res. 2024 Jul 8;52(12):6802-6810. doi: 10.1093/nar/gkae468.

Abstract

The computational design of synthetic DNA sequences with designer in vivo properties is gaining traction in the field of synthetic genomics. We propose here a computational method that combines a kinetic Monte Carlo framework with a deep mutational screening based on deep learning predictions. We apply our method to build regular nucleosome arrays with tailored nucleosomal repeat lengths (NRLs) in yeast. Our design was validated in vivo by successfully engineering and integrating thousands of kilobases long tandem arrays of computationally optimized sequences that could accommodate NRLs much larger than the natural yeast NRL (namely 197 and 237 bp, compared to the natural NRL of ∼165 bp). RNA-seq results show that transcription of the arrays can occur but is not driven by the NRL. The computational method proposed here delineates the key sequence rules for nucleosome positioning in yeast and should be easily applicable to other sequence properties and other genomes.
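
The general idea of coupling a kinetic Monte Carlo loop to an exhaustive single-base mutational screen can be illustrated with a minimal sketch. It is not the authors' implementation: the predictor `toy_predict` is a hypothetical stand-in (a windowed GC fraction) for the deep learning model, the square-wave `target_profile`, the squared-error `loss`, the rate rule exp(-Δloss/T) and every parameter value are assumptions chosen only to keep the example small and runnable.

```python
import math
import random

BASES = "ACGT"

def toy_predict(seq, window=11):
    """Hypothetical stand-in for a deep learning predictor (assumption):
    returns the GC fraction in a window centred on each position."""
    half = window // 2
    gc = [1.0 if b in "GC" else 0.0 for b in seq]
    n = len(seq)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(gc[lo:hi]) / (hi - lo))
    return out

def target_profile(length, nrl, footprint):
    """Idealised target (assumption): 1 over each nucleosome footprint,
    0 over each linker, repeating every `nrl` positions."""
    return [1.0 if (i % nrl) < footprint else 0.0 for i in range(length)]

def loss(seq, target):
    """Mean squared error between the predicted and target profiles."""
    pred = toy_predict(seq)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(seq)

def kmc_step(seq, target, temperature, rng):
    """Screen every single-base mutant, then pick one with probability
    proportional to exp(-delta_loss / temperature)."""
    current = loss(seq, target)
    moves, deltas = [], []
    for i, old in enumerate(seq):
        for b in BASES:
            if b == old:
                continue
            mutant = seq[:i] + b + seq[i + 1:]
            moves.append(mutant)
            deltas.append(loss(mutant, target) - current)
    # Shift by the smallest delta so every exponent is <= 0 (no overflow);
    # the shift cancels out in the normalised selection probabilities.
    d_min = min(deltas)
    weights = [math.exp(-(d - d_min) / temperature) for d in deltas]
    return rng.choices(moves, weights=weights, k=1)[0]

def design(start, target, steps=30, temperature=1e-3, seed=0):
    """Run the loop and return the best sequence seen and its loss."""
    rng = random.Random(seed)
    seq = start
    best_seq, best_loss = seq, loss(seq, target)
    for _ in range(steps):
        seq = kmc_step(seq, target, temperature, rng)
        current = loss(seq, target)
        if current < best_loss:
            best_seq, best_loss = seq, current
    return best_seq, best_loss
```

Calling `design("ACGT" * 10, target_profile(40, nrl=20, footprint=12))` runs the loop on a deliberately tiny 40-base sequence. Targeting the 197 or 237 bp NRLs reported above would presumably require a trained predictor in place of `toy_predict` and an incremental re-evaluation of mutants rather than the full recomputation used here.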

MeSH terms

  • Base Sequence
  • Chromatin Assembly and Disassembly
  • Computer Simulation
  • DNA / chemistry
  • DNA / genetics
  • DNA / metabolism
  • Deep Learning
  • Monte Carlo Method
  • Nucleosomes* / chemistry
  • Nucleosomes* / genetics
  • Nucleosomes* / metabolism
  • Saccharomyces cerevisiae* / genetics

Substances

  • Nucleosomes
  • DNA