GlyGly-CTERM and rhombosortase: a C-terminal protein processing signal in a many-to-one pairing with a rhomboid family intramembrane serine protease

PLoS One. 2011;6(12):e28886. doi: 10.1371/journal.pone.0028886. Epub 2011 Dec 14.

Abstract

The rhomboid family of serine proteases occurs in all domains of life. Its members contain at least six hydrophobic membrane-spanning helices, with an active site serine located deep within the hydrophobic interior of the plasma membrane. The model member GlpG from Escherichia coli is heavily studied through engineered mutant forms, varied model substrates, and multiple X-ray crystal studies, yet its relationship to endogenous substrates is not well understood. Here we describe an apparent membrane-anchoring C-terminal homology domain that appears in numerous genera including Shewanella, Vibrio, Acinetobacter, and Ralstonia, but excluding Escherichia and Haemophilus. Individual genomes encode up to thirteen members, usually homologous to each other only in this C-terminal region. The domain's tripartite architecture consists of a signature motif, a transmembrane helix, and a cluster of basic residues at the protein C-terminus, as also seen with the LPXTG recognition sequence for sortase A and the PEP-CTERM recognition sequence for exosortase. Partial Phylogenetic Profiling identifies a distinctive rhomboid-like protease subfamily almost perfectly co-distributed with this recognition sequence. This protease subfamily and its putative target domain are hereby renamed rhombosortase and GlyGly-CTERM, respectively. The protease and target are encoded by consecutive genes in most genomes with just a single target, but far apart otherwise. The signature motif of the GlyGly-CTERM domain, often SGGS, only partially resembles known cleavage sites of rhomboid protease family model substrates. Some protein families that have several members with C-terminal GlyGly-CTERM domains also have additional members with LPXTG or PEP-CTERM domains instead, suggesting there may be common themes to the post-translational processing of these proteins by three different membrane protein superfamilies.
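
The tripartite architecture described above (signature motif, transmembrane helix, C-terminal cluster of basic residues) can be illustrated with a minimal Python sketch. It is a toy heuristic, not the detection method behind the reported results, which the abstract does not detail. The function name, the hydrophobic residue set, the "GG" motif pattern, and all thresholds (tail_len, min_tm, min_basic) are assumptions chosen only for illustration, and the example sequence is synthetic.

```python
import re

# Assumption: a crude set of residues tolerated in a membrane-spanning helix.
HYDROPHOBIC = "AVLIFMWCG"
# Assumption: lysine and arginine count as the basic residues.
BASIC = "KR"


def find_tripartite_tail(seq, tail_len=45, min_tm=12, min_basic=2):
    """Look for motif -> hydrophobic stretch -> basic cluster near the C-terminus.

    Returns a dict describing the hit, or None if any of the three parts is missing.
    All thresholds are illustrative guesses, not values taken from the paper.
    """
    seq = seq.upper()
    tail = seq[-tail_len:]
    offset = len(seq) - len(tail)  # position of the tail within the full sequence

    # Part 1: a Gly-Gly pair standing in for the signature motif (e.g. SGGS).
    motif = re.search("GG", tail)
    if motif is None:
        return None

    # Part 2: an uninterrupted hydrophobic run downstream of the motif.
    tm_pattern = "[" + HYDROPHOBIC + "]{" + str(min_tm) + ",}"
    helix = re.search(tm_pattern, tail[motif.end():])
    if helix is None:
        return None

    # Part 3: basic residues in whatever remains after the hydrophobic run.
    helix_end = motif.end() + helix.end()
    c_term = tail[helix_end:]
    basic_count = sum(1 for residue in c_term if residue in BASIC)
    if basic_count < min_basic:
        return None

    return {
        "motif_start": offset + motif.start(),  # 0-based index in the full sequence
        "helix": helix.group(0),
        "basic_count": basic_count,
    }


# Synthetic example: a polar filler region followed by SGGS, a hydrophobic run, and RRKR.
example = "MKTDESQNPD" * 5 + "DSGGS" + "LLALLGLVALLAGL" + "RRKR"
hit = find_tripartite_tail(example)
```

A heuristic like this would produce many false positives on real proteomes; profile-based models are the usual choice for domains of this kind, and the sketch is meant only to make the three-part layout concrete.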

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Biocatalysis
  • Catalytic Domain
  • Cell Membrane / enzymology*
  • Cysteine Endopeptidases / chemistry*
  • Cysteine Endopeptidases / metabolism*
  • Evolution, Molecular
  • Genome, Bacterial / genetics
  • Glycylglycine / metabolism*
  • Molecular Sequence Data
  • Myxococcus / enzymology
  • Myxococcus / genetics
  • Phylogeny
  • Protein Sorting Signals*
  • Proteomics
  • Sequence Alignment
  • Sequence Homology, Amino Acid
  • Serine / metabolism
  • Serine Endopeptidases / metabolism*
  • Shewanella / enzymology
  • Shewanella / genetics

Substances

  • Protein Sorting Signals
  • Glycylglycine
  • Serine
  • Serine Endopeptidases
  • Cysteine Endopeptidases