Clostridioides difficile is a pathogen often associated with hospital-acquired infection or antimicrobial-induced disease; however, increasing evidence indicates that infections can result from community or environmental sources. Most genomic sequencing of C. difficile has focused on clinical strains, although evidence is growing that C. difficile spores are widespread in soil and water. In this study, we sequenced 38 genomes from soil and water isolates collected in Flagstaff (AZ, USA) and Slovenia as part of an effort towards environmental surveillance of C. difficile. Based on average nucleotide identity (ANI), the genomes were divergent from C. difficile at a threshold consistent with different species. A phylogenetic analysis of these divergent genomes together with Clostridioides genomes available in public repositories confirmed the presence of three previously described cryptic Clostridioides species and identified two additional clades. One of the cryptic species (C-III) was composed almost entirely of Arizona and Slovenia genomes and contained distinct sub-groups from each region (as evidenced by SNP and gene-content differences). A comparative genomics analysis identified multiple unique coding sequences per clade, which can serve as markers for subsequent environmental surveys of these cryptic species. Homologues of the C. difficile toxin genes, tcdA and tcdB, were found in cryptic species genomes, although they were not part of the typical pathogenicity locus observed in C. difficile, and in silico PCR suggested that some would not amplify with widely used PCR diagnostic tests. We also identified gene homologues in the binary toxin cluster, including some present on phage and, for what is believed to be the first time, on a plasmid. All isolates were obtained from environmental samples, so the function and disease potential of these toxin homologues are currently unknown.
Enzymatic profiles differed among a subset of cryptic isolates (n=5), suggesting that these isolates contain substantial metabolic diversity. Antimicrobial resistance (AMR) was observed across a subset of isolates (n=4), suggesting that AMR mechanisms are intrinsic to the genus, perhaps reflecting a shared environmental origin. This study greatly expands our understanding of the genomic diversity of Clostridioides. These results have implications for C. difficile One Health research, for more sensitive C. difficile diagnostics, and for understanding the evolutionary history of C. difficile and the development of pathogenesis.
Keywords: Clostridioides difficile; cryptic species; genomics; toxin.