Using comparative genomics to reorder the human genome sequence into a virtual sheep genome

Genome Biol. 2007;8(7):R152. doi: 10.1186/gb-2007-8-7-r152.

Abstract

Background: Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes?

Results: A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the human, dog, and cow genomes. To maximize genome coverage, the coordinates of all BAC end sequence hits to the cow and dog genomes were also converted to the equivalent human genome coordinates. The 84,624 sheep BACs (about 5.4-fold genome coverage) with paired ends in the correct orientation (tail-to-tail) and spacing, combined with information from sheep BAC comparative genome contigs (CGCs) built separately on the dog and cow genomes, were used to construct 1,172 sheep BAC-CGCs, covering 91.2% of the human genome. Clustered non-tail-to-tail and outsize BACs located close to the ends of many BAC-CGCs linked BAC-CGCs covering about 70% of the genome to at least one other BAC-CGC on the same chromosome. Using the BAC-CGCs, the intrachromosomal and interchromosomal BAC-CGC linkage information, human/cow and vertebrate synteny, and the sheep marker map, a virtual sheep genome was constructed. To identify BACs potentially located in gaps between BAC-CGCs, an additional set of 55,668 sheep BACs was positioned on the sheep genome with lower confidence. A coordinate conversion process allowed us to transfer human genes and other genome features to the virtual sheep genome to display on a sheep genome browser.
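The paired-end filtering and contig-building steps described in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that "tail-to-tail" means the two end sequences of a BAC hit the same reference chromosome on opposite strands and point toward each other, and that acceptable spacing is a simple size window. The names `EndHit`, `classify_pair`, and `build_contigs`, the category labels, and the 50 kb to 350 kb window are all hypothetical choices made for illustration. Contigs are formed here by plain interval merging, which is a simplification of whatever criteria the study actually applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndHit:
    """Position of one BAC end sequence on a reference genome (hypothetical type)."""
    chrom: str
    pos: int
    strand: str  # "+" or "-"


# Hypothetical spacing window in base pairs; the abstract gives no thresholds.
MIN_SPAN = 50_000
MAX_SPAN = 350_000


def classify_pair(a, b, min_span=MIN_SPAN, max_span=MAX_SPAN):
    """Label a BAC by how its two end hits sit relative to each other."""
    if a.chrom != b.chrom:
        return "interchromosomal"
    left, right = sorted((a, b), key=lambda h: h.pos)
    # Assumed meaning of the expected orientation: the upstream end is on
    # the "+" strand and the downstream end is on the "-" strand.
    if not (left.strand == "+" and right.strand == "-"):
        return "wrong_orientation"
    span = right.pos - left.pos
    if span > max_span:
        return "outsize"
    if span < min_span:
        return "undersize"
    return "concordant"


def build_contigs(spans):
    """Merge overlapping (chrom, start, end) spans into contigs."""
    merged = []
    for chrom, start, end in sorted(spans):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


if __name__ == "__main__":
    pair = (EndHit("chr1", 100_000, "+"), EndHit("chr1", 280_000, "-"))
    print(classify_pair(*pair))
    print(build_contigs([("chr1", 100, 500), ("chr1", 400, 900), ("chr1", 2000, 2500)]))
```

In this sketch only BACs labeled `concordant` would contribute spans to `build_contigs`. Those labeled `wrong_orientation`, `outsize`, or `interchromosomal` correspond loosely to the non-tail-to-tail and outsize BACs that the abstract uses as linkage evidence between contigs.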

Conclusion: We demonstrate that limited sequencing of BACs, combined with positioning on a well-assembled genome and integration of locations from other, less well-assembled genomes, can yield extensive, detailed subgene-level maps of mammalian genomes for which genomic resources are currently limited.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Cattle
  • Chromosomes, Artificial, Bacterial
  • Dogs
  • Gene Library
  • Genome*
  • Genome, Human
  • Genomics*
  • Humans
  • Molecular Sequence Data
  • Physical Chromosome Mapping*
  • Sequence Analysis, DNA
  • Sheep, Domestic / genetics*