A framework for assessing the concordance of molecular typing methods and the true strain phylogeny of Campylobacter jejuni and C. coli using draft genome sequence data

Front Cell Infect Microbiol. 2012 May 1:2:57. doi: 10.3389/fcimb.2012.00057. eCollection 2012.

Abstract

Tracking of sources of sporadic cases of campylobacteriosis remains challenging, as commonly used molecular typing methods have limited ability to unambiguously link genetically related strains. Genomics has become increasingly prominent in the public health response to enteric pathogens as methods enable characterization of pathogens at an unprecedented level of resolution. However, the cost of sequencing and the expertise required for bioinformatic analyses remain prohibitive, and these comprehensive analyses are limited to a few priority strains. Although several molecular typing methods are currently widely used for epidemiological analysis of campylobacters, it is not clear how accurately these methods reflect true strain relationships. To address this, we have developed a framework and associated computational tools to rapidly analyze draft genome sequence data for the assessment of molecular typing methods against a "gold standard" based on the phylogenetic analysis of highly conserved core (HCC) genes with high sequence quality. We analyzed 104 publicly available whole genome sequences (WGS) of C. jejuni and C. coli. In addition to in silico determination of multi-locus sequence typing (MLST), flaA, and porA type, as well as comparative genomic fingerprinting (CGF) type, we inferred a "reference" phylogeny based on 389 HCC genes. Molecular typing data were compared to the reference phylogeny for concordance using the adjusted Wallace coefficient (AWC) with confidence intervals. Although MLST targets the sequence variability in core genes and CGF targets insertions/deletions of accessory genes, both methods are based on multi-locus analysis and provided better estimates of true phylogeny than methods based on single loci (porA, flaA).
A more comprehensive WGS dataset including additional genetically related strains, both epidemiologically linked and unlinked, will be necessary to fully assess the performance of subtyping methods for outbreak investigations and surveillance activities. Analyses of the strengths and weaknesses of widely used typing methodologies in inferring true strain relationships will provide guidance in the interpretation of these data for epidemiological purposes.
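
To make the concordance measure concrete, the following is a minimal sketch of the adjusted Wallace coefficient as it is commonly defined for comparing two partitions of the same isolates: the Wallace coefficient W(A→B) is the probability that two isolates sharing a type under method A also share a type under method B, and the adjustment subtracts the agreement expected by chance, W_i = 1 − SID(B), where SID is Simpson's index of diversity, giving AW = (W − W_i) / (1 − W_i). This is not the authors' tool; the function names and the toy labels are assumptions made for illustration, and the confidence intervals mentioned in the abstract are omitted.

```python
from collections import Counter


def pairs_within(labels):
    """Count unordered pairs of isolates that share the same label."""
    return sum(n * (n - 1) // 2 for n in Counter(labels).values())


def adjusted_wallace(a_labels, b_labels):
    """Adjusted Wallace coefficient AW(A -> B) for two typing partitions.

    a_labels[i] and b_labels[i] are the types assigned to isolate i by
    method A and by method B (or by clusters cut from a reference tree).
    """
    if len(a_labels) != len(b_labels):
        raise ValueError("both partitions must cover the same isolates")
    n = len(a_labels)
    if n < 2:
        raise ValueError("at least two isolates are required")

    total_pairs = n * (n - 1) // 2
    pairs_a = pairs_within(a_labels)
    if pairs_a == 0:
        raise ValueError("undefined: no two isolates share a type under A")

    # Pairs grouped together by both methods at once.
    pairs_ab = pairs_within(list(zip(a_labels, b_labels)))

    # Wallace coefficient: P(same type in B | same type in A).
    wallace = pairs_ab / pairs_a

    # Expected Wallace under independence: 1 - Simpson's diversity of B.
    expected = pairs_within(b_labels) / total_pairs
    if expected == 1:
        raise ValueError("undefined: B places every isolate in one type")

    return (wallace - expected) / (1 - expected)


# Toy labels (hypothetical, not data from the study).
typing_method = ["t1", "t1", "t2", "t2", "t3", "t3"]
reference_clusters = ["c1", "c1", "c2", "c2", "c2", "c2"]

aw_method_to_reference = adjusted_wallace(typing_method, reference_clusters)
aw_reference_to_method = adjusted_wallace(reference_clusters, typing_method)
```

The coefficient is directional: in this toy case every pair that the typing method groups together is also grouped in the reference clusters, so the method-to-reference value is 1, while the reverse value is lower because the reference clusters are coarser than the typing method.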

Keywords: CGF; Campylobacter spp.; MLST; flaA; genome; molecular epidemiology; porA.

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Campylobacter Infections / epidemiology
  • Campylobacter Infections / microbiology
  • Campylobacter coli / classification*
  • Campylobacter coli / genetics*
  • Campylobacter coli / isolation & purification
  • Campylobacter jejuni / classification*
  • Campylobacter jejuni / genetics*
  • Campylobacter jejuni / isolation & purification
  • DNA Fingerprinting / methods
  • Genome, Bacterial*
  • Humans
  • Molecular Epidemiology / methods
  • Molecular Typing / methods*
  • Phylogeny*