GAMMA: a tool for the rapid identification, classification and annotation of translated gene matches from sequencing data

Bioinformatics. 2022 Jan 3;38(2):546-548. doi: 10.1093/bioinformatics/btab607.

Abstract

Motivation: Tools that identify genes in microbial sequences using a reference database generally report matches as a percent identity. This value can be difficult to interpret when sequence identity is below 100%, because changes to specific amino acids, such as those in substrate binding regions or enzyme active sites, can have dramatic effects on protein function and, in turn, on phenotypes like antimicrobial resistance or virulence.

Results: Here, we present GAMMA, an open-source tool for Gene Allele Mutation Microbial Assessment, which uses protein coding-level identity to make gene calls from any gene database and generates a classification (e.g. mutant, truncation) and translated annotation (e.g. Y190S mutation, truncation at residue 110) for these calls. GAMMA accurately called antimicrobial resistance genes from a large set of genomes faster than three other tools. It can also be used with any gene database, as we demonstrated by identifying virulence genes in the same genome set. Because of its speed and flexibility, GAMMA can be used to rapidly find and annotate any gene matches of interest in microbial sequencing data.
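To illustrate the kind of classification and translated annotation described above, the following is a minimal sketch and not GAMMA's actual implementation. It assumes the reference and matched sequences are in-frame, ungapped and of equal length, and it uses the standard genetic code. The function names `translate` and `annotate` and the label "no coding change" are hypothetical. Only the labels "mutant" and "truncation" and the annotation style (e.g. Y190S, truncation at residue 110) come from the abstract.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# sketch ignores alternative start codons and any non-standard code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def annotate(ref_dna, match_dna):
    """Return a (classification, annotation) pair for a match (hypothetical helper)."""
    ref = translate(ref_dna).rstrip("*")  # reference protein without its stop
    hit = translate(match_dna)

    # A stop codon before the end of the reference protein is a truncation.
    stop = hit.find("*")
    if stop != -1 and stop < len(ref):
        return "truncation", f"truncation at residue {stop + 1}"

    # Otherwise list amino acid substitutions using 1-based residue positions.
    changes = [
        f"{r}{pos}{h}"
        for pos, (r, h) in enumerate(zip(ref, hit), start=1)
        if r != h
    ]
    if changes:
        return "mutant", ", ".join(changes)
    return "no coding change", ""


if __name__ == "__main__":
    reference = "ATGGCTTATAAATAA"  # toy sequence encoding M A Y K stop
    print(annotate(reference, "ATGGCTTCTAAATAA"))  # substitution at codon 3
    print(annotate(reference, "ATGGCTTATTAATAA"))  # premature stop at codon 4
```

Comparing translated sequences rather than nucleotides means a synonymous nucleotide change yields no reported mutation, while a single-base change that alters a residue or introduces a stop codon is named explicitly.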

Availability and implementation: GAMMA is freely available as a Bioconda package (https://bioconda.github.io/recipes/gamma/README.html) and as a command line script (https://github.com/rastanton/GAMMA).

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Alleles
  • Databases, Factual
  • Proteins*
  • Software*

Substances

  • Proteins