SACCHARIS v2: Streamlining Prediction of Carbohydrate-Active Enzyme Specificities Within Large Datasets

Methods Mol Biol. 2024:2836:299-330. doi: 10.1007/978-1-0716-4007-4_16.

Abstract

Carbohydrates are chemically and structurally diverse, composed of a wide array of monosaccharides, stereochemical linkages, substituent groups, and intermolecular associations with other biological molecules. A large repertoire of carbohydrate-active enzymes (CAZymes) and enzymatic activities is required to form, dismantle, and metabolize these complex molecules. The software SACCHARIS (Sequence Analysis and Clustering of CarboHydrate Active enzymes for Rapid Informed prediction of Specificity) provides a rapid, easy-to-use pipeline for the prediction of potential CAZyme function in new datasets. We have updated SACCHARIS to (i) simplify its installation by rewriting it in Python and packaging it for Conda; (ii) enhance its usability through a new (optional) interactive GUI; and (iii) enable semi-automated annotation of phylogenetic tree output via a new R package or the commonly used webserver iTOL. Significantly, SACCHARIS v2 has been developed with high-throughput omics in mind, with pipeline automation geared toward complex (meta)genome and (meta)transcriptome datasets to reveal the total CAZyme content ("CAZome") of an organism or community. Here, we outline the development and use of SACCHARIS v2 to discover and annotate CAZymes and provide insight into complex carbohydrate metabolism in individual organisms and communities.

Keywords: Bioinformatics; Carbohydrate-active enzymes; Meta-omics; Molecular phylogeny.

MeSH terms

  • Carbohydrate Metabolism
  • Carbohydrates / chemistry
  • Computational Biology / methods
  • Enzymes / chemistry
  • Enzymes / genetics
  • Enzymes / metabolism
  • Phylogeny
  • Software*
  • Substrate Specificity

Substances

  • Carbohydrates
  • Enzymes