Biome representational in silico karyotyping

Genome Res. 2011 Apr;21(4):626-33. doi: 10.1101/gr.115758.110. Epub 2011 Feb 10.

Abstract

Metagenomic characterization of complex biomes remains challenging. Here we describe a modification of digital karyotyping, biome representational in silico karyotyping (BRISK), as a general technique for analyzing a defined representation of all DNA present in a sample. BRISK uses a Type IIB DNA restriction enzyme to create a defined representation of 27-mer DNAs in a sample. Massively parallel sequencing of this representation allows construction of high-resolution karyotypes and identification of multiple species within a biome. Application to normal human tissue demonstrated linear recovery of tags by chromosome. We applied this technique to the biome of the oral mucosa and found that more than 25% of recovered DNA is nonhuman. DNA from 41 microbial species could be identified in the oral mucosa of two subjects. Of the recovered nonhuman sequences, fewer than 30% are currently annotated. We characterized seven prevalent unknown sequences by chromosome walking and found that they represent novel microbial sequences, including two likely derived from novel phage genomes. Application of BRISK to archival tissue from a nasopharyngeal carcinoma identified Epstein-Barr virus infection. These results suggest that BRISK is a powerful technique for the analysis of complex microbiomes and potentially for pathogen discovery.
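The idea of a defined 27-mer representation can be illustrated with a minimal in silico sketch. It is not the authors' pipeline: the recognition motif, the flank length, and the names `extract_tags`, `build_index`, and `assign_reads` are all hypothetical, chosen only so that each tag comes out 27 bases long. The sketch collects every 27-mer centered on a motif occurrence in a set of reference sequences, then assigns sequenced tags to a reference by exact match.

```python
import re
from collections import Counter

# Hypothetical values for illustration only; the abstract does not name
# the enzyme, its recognition site, or its cut offsets.
MOTIF = "AC.....CTCC"   # 11-base pattern with 5 unspecified positions
MOTIF_LEN = 11
FLANK = 8               # bases kept on each side: 8 + 11 + 8 = 27

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_tags(seq):
    """Return every 27-mer centered on a motif occurrence, on either strand."""
    tags = []
    for strand in (seq, revcomp(seq)):
        # A lookahead reports overlapping motif occurrences as well.
        for match in re.finditer(f"(?=({MOTIF}))", strand):
            start = match.start() - FLANK
            end = match.start() + MOTIF_LEN + FLANK
            if start >= 0 and end <= len(strand):
                tags.append(strand[start:end])
    return tags


def build_index(references):
    """Map each tag to the set of reference names that contain it."""
    index = {}
    for name, seq in references.items():
        for tag in extract_tags(seq):
            index.setdefault(tag, set()).add(name)
    return index


def assign_reads(reads, index):
    """Count reads per reference; unmatched and shared tags are kept apart."""
    counts = Counter()
    for read in reads:
        sources = index.get(read)
        if sources is None:
            counts["unannotated"] += 1
        elif len(sources) == 1:
            counts[next(iter(sources))] += 1
        else:
            counts["ambiguous"] += 1
    return counts


# Toy references, each holding exactly one motif occurrence.
human = "G" * 8 + "ACAAAAACTCC" + "T" * 8
microbe = "C" * 8 + "ACGGGGGCTCC" + "A" * 8
index = build_index({"human": human, "microbe": microbe})
counts = assign_reads([human, human, microbe, "A" * 27], index)
```

In this toy run, two reads map to the `human` reference, one to `microbe`, and one matches nothing and is counted as unannotated, which mirrors in miniature the split between annotated and unannotated nonhuman tags described above. Exact matching is a simplification; a real analysis would also need to handle sequencing errors.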

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Carcinoma / diagnosis
  • Carcinoma / genetics
  • Computational Biology
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Karyotyping*
  • Metagenome / genetics*
  • Metagenomics
  • Molecular Sequence Data
  • Mouth Mucosa / metabolism
  • Mouth Mucosa / microbiology

Associated data

  • GENBANK/FI185049
  • GENBANK/FI185051
  • GENBANK/FI185052
  • GENBANK/FI185053
  • GENBANK/FI185054
  • GENBANK/FI185056