Comparative analysis of microbial genomes to study unique and expanded gene families in Mycobacterium tuberculosis

Infect Genet Evol. 2009 May;9(3):314-21. doi: 10.1016/j.meegid.2007.12.006. Epub 2007 Dec 28.

Abstract

Mycobacterium tuberculosis, the causative agent of tuberculosis, is the leading infectious disease agent, causing millions of deaths annually. The incidence of disease is increasing with the AIDS pandemic, and current vaccines and therapies are not fully effective, resulting in the emergence of drug resistance. The ability of the organism to evolve enhanced pathogenicity appears to derive, at least in part, from gene duplication. This evolutionary mechanism expands gene families, providing the organism with extra copies of a gene and thus the opportunity to evolve new functions. This project aims to identify the expanded gene families in M. tuberculosis and investigate the potential contribution of gene duplication events to pathogenicity. Comparative genomics tools were used to compare the proteomes of over 80 pathogenic and non-pathogenic microorganisms, including several mycobacteria, to identify unique proteins and determine the extent of family expansion in M. tuberculosis. For further analysis, we selected proteins from this organism that were either unique to M. tuberculosis and other pathogens or restricted to pathogenic mycobacteria, as well as families expanded in the mycobacteria. Up to half of all M. tuberculosis proteins belong to expanded families, some of which are unique to this organism or to the mycobacteria, suggesting that they play a role in the evolution of these genomes. Although the evolution of M. tuberculosis is thought to be relatively recent, the maintenance of these duplicated families in the genome suggests that they play a role in the pathogenic lifestyle of the organism.
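The abstract does not name the comparative genomics tools or the thresholds used, so the following is only a minimal sketch of how unique and expanded families might be tallied once every protein has been assigned to a family. The genome labels, protein and family identifiers, the function names, and the working definition of "expanded" (at least two copies in the target genome and more copies than in any other genome compared) are all illustrative assumptions, not the authors' method.

```python
from collections import Counter, defaultdict


def family_sizes(assignments):
    """Count proteins per family in each genome.

    `assignments` is an iterable of (genome, protein_id, family_id) tuples.
    """
    sizes = defaultdict(Counter)
    for genome, _protein, family in assignments:
        sizes[genome][family] += 1
    return sizes


def classify_families(sizes, target, min_copies=2):
    """Return (unique, expanded) family sets for the target genome.

    Assumed definitions (not taken from the paper):
      unique   - the family has no member in any other genome
      expanded - the target has >= min_copies members and more members
                 than every other genome compared
    """
    unique, expanded = set(), set()
    for family, n in sizes[target].items():
        others = [c[family] for g, c in sizes.items() if g != target]
        if not any(others):
            unique.add(family)
        if n >= min_copies and n > max(others, default=0):
            expanded.add(family)
    return unique, expanded


def fraction_in_expanded(sizes, target, expanded):
    """Fraction of the target genome's proteins that sit in expanded families."""
    total = sum(sizes[target].values())
    in_expanded = sum(sizes[target][f] for f in expanded)
    return in_expanded / total if total else 0.0


# Toy, made-up assignments; these are not data from the study.
records = [
    ("Mtb", "p1", "FAM_A"), ("Mtb", "p2", "FAM_A"), ("Mtb", "p3", "FAM_A"),
    ("Mtb", "p4", "FAM_B"),
    ("Mtb", "p5", "FAM_C"), ("Mtb", "p6", "FAM_C"),
    ("Msmeg", "q1", "FAM_A"),
    ("Msmeg", "q2", "FAM_C"), ("Msmeg", "q3", "FAM_C"),
    ("Ecoli", "r1", "FAM_C"),
]

sizes = family_sizes(records)
unique, expanded = classify_families(sizes, target="Mtb")
fraction = fraction_in_expanded(sizes, "Mtb", expanded)
```

In this toy input, FAM_B is found only in the target genome, FAM_A has more copies in the target than elsewhere, and FAM_C is not flagged because another genome carries as many copies. A real analysis would need a principled family assignment step and a statistically justified expansion criterion.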

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Bacterial Proteins / chemistry
  • Evolution, Molecular
  • Gene Duplication
  • Genome, Bacterial*
  • Humans
  • Molecular Sequence Data
  • Multigene Family*
  • Mycobacterium tuberculosis / genetics*
  • Sequence Analysis, DNA
  • Sequence Analysis, Protein
  • Tuberculosis, Pulmonary / microbiology*
  • Virulence Factors / genetics

Substances

  • Bacterial Proteins
  • Virulence Factors