Escherichia coli CRISPR arrays from early life fecal samples preferentially target prophages

ISME J. 2024 Jan 8;18(1):wrae005. doi: 10.1093/ismejo/wrae005.

Abstract

CRISPR-Cas systems are defense mechanisms against phages and other nucleic acids that invade bacteria and archaea. In Escherichia coli, it is generally accepted that CRISPR-Cas systems are inactive in laboratory conditions due to a transcriptional repressor. In natural isolates, it has been shown that CRISPR arrays remain stable over the years and that most spacer targets (protospacers) remain unknown. Here, we re-examine CRISPR arrays in natural E. coli isolates and investigate viral and bacterial genomes for spacer targets using a bioinformatics approach coupled to a unique biological dataset. We first sequenced the CRISPR1 array of 1769 E. coli isolates from the fecal samples of 639 children obtained during their first year of life. We built a network with edges between isolates that reflect the number of shared spacers. The isolates grouped into 34 modules. A search for matching spacers in bacterial genomes showed that E. coli spacers almost exclusively target prophages. While we found instances of self-targeting spacers, those involving a prophage and a spacer within the same bacterial genome were rare. The extensive search for matching spacers also expanded the library of known E. coli protospacers to 60%. Altogether, these results favor the concept that E. coli's CRISPR-Cas is an antiprophage system and highlight the importance of reconsidering the criteria used to deem CRISPR-Cas systems active.
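
The shared-spacer network described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (shared_spacer_edges, connected_groups), the isolate names, and the spacer labels are all hypothetical, and spacers are treated as exact-match strings. The abstract does not state how the 34 modules were detected, so connected components are used here only as a simple stand-in for grouping.

```python
from itertools import combinations


def shared_spacer_edges(arrays, min_shared=1):
    """Map each isolate pair to its shared-spacer count (the edge weight).

    `arrays` maps an isolate name to the list of spacers in its CRISPR array.
    Pairs sharing fewer than `min_shared` spacers get no edge.
    """
    spacer_sets = {name: set(spacers) for name, spacers in arrays.items()}
    edges = {}
    for a, b in combinations(sorted(spacer_sets), 2):
        n_shared = len(spacer_sets[a] & spacer_sets[b])
        if n_shared >= min_shared:
            edges[(a, b)] = n_shared
    return edges


def connected_groups(nodes, edges):
    """Group isolates by connected components using union-find.

    Assumption: this is a stand-in for module detection, not the
    method used in the paper.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(sorted(g) for g in groups.values())


# Toy data (hypothetical): spacers are placeholder labels, not real sequences.
arrays = {
    "iso1": ["s1", "s2", "s3"],
    "iso2": ["s2", "s3", "s4"],
    "iso3": ["s9"],
    "iso4": ["s9", "s10"],
}
edges = shared_spacer_edges(arrays)
groups = connected_groups(arrays, edges)
```

On this toy input, iso1 and iso2 are linked by an edge of weight 2 and iso3 and iso4 by an edge of weight 1, giving two groups. With real data, a weighted community-detection method would likely be a better fit than connected components, since a single shared spacer would otherwise merge large groups.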

Keywords: CRISPR; E. coli; bacteriophage; gut; microbiome; phage; phage resistance; virome.

MeSH terms

  • Bacteriophages* / genetics
  • CRISPR-Cas Systems
  • Child
  • Clustered Regularly Interspaced Short Palindromic Repeats
  • Escherichia coli / genetics
  • Genome, Bacterial
  • Humans
  • Prophages* / genetics

Grants and funding