Identity-by-descent filtering of exome sequence data for disease-gene identification in autosomal recessive disorders

Bioinformatics. 2011 Mar 15;27(6):829-36. doi: 10.1093/bioinformatics/btr022. Epub 2011 Jan 28.

Abstract

Motivation: Next-generation sequencing and exome-capture technologies are currently revolutionizing the way geneticists screen for disease-causing mutations in rare Mendelian disorders. However, identifying causal mutations is challenging because of the sheer number of variants found in individual exomes. Although databases such as dbSNP or HapMap can be used to narrow the set of candidate genes by filtering out common variants, the remaining set is still on the order of dozens.

Results: Our algorithm uses a non-homogeneous hidden Markov model that employs local recombination rates to identify chromosomal regions that are identical by descent (IBD = 2) in children of consanguineous or non-consanguineous parents, based solely on sibling genotype data derived from high-throughput sequencing platforms. Using simulated and real exome sequence data, we show that our algorithm is able to reduce the search space for the causative disease gene to a fifth or a tenth of the entire exome.
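
The idea of a non-homogeneous hidden Markov model over sibling genotypes can be illustrated with a minimal Python sketch. This is not the authors' R script: the two-state layout (IBD = 2 versus not), the symmetric switching model, the function name `viterbi_ibd2`, and all parameter values (`switch_rate`, `err`, `p_match_non_ibd`, `prior_ibd`) are assumptions chosen for illustration. Each marker is reduced to whether the siblings' genotypes agree, and the transition probability between adjacent markers depends on their genetic distance in centimorgans, which is how local recombination rates enter the model.

```python
import math


def viterbi_ibd2(concordant, cm_positions, switch_rate=0.05, err=0.01,
                 p_match_non_ibd=0.6, prior_ibd=0.25):
    """Viterbi decoding of a two-state HMM (state 1 = IBD=2, state 0 = not).

    concordant   : list of bool, True where sibling genotypes agree at a marker
    cm_positions : genetic map position (cM) of each marker, non-decreasing
    All default parameter values are illustrative assumptions.
    """
    n = len(concordant)
    if n == 0:
        return []

    def emit(state, obs):
        # In IBD=2 regions genotypes agree unless there is a calling error;
        # elsewhere they agree only with some background probability.
        p_match = (1.0 - err) if state == 1 else p_match_non_ibd
        return math.log(p_match if obs else 1.0 - p_match)

    log_prior = [math.log(1.0 - prior_ibd), math.log(prior_ibd)]
    score = [log_prior[s] + emit(s, concordant[0]) for s in (0, 1)]
    back = []

    for i in range(1, n):
        # Non-homogeneous step: the switch probability grows with the
        # genetic distance between adjacent markers.
        d = max(cm_positions[i] - cm_positions[i - 1], 0.0)
        p_switch = 0.5 * (1.0 - math.exp(-2.0 * switch_rate * d))
        p_switch = min(max(p_switch, 1e-12), 0.5)
        log_stay, log_switch = math.log(1.0 - p_switch), math.log(p_switch)

        new_score, ptr = [], []
        for s in (0, 1):
            cands = [score[r] + (log_stay if r == s else log_switch)
                     for r in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            ptr.append(best)
            new_score.append(cands[best] + emit(s, concordant[i]))
        score = new_score
        back.append(ptr)

    # Trace the most likely state path backwards.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [s == 1 for s in path]


# Toy input: 40 concordant markers, then 40 with frequent disagreements,
# spaced 0.1 cM apart (made-up data, not from the paper).
obs = [True] * 40 + [False, True] * 20
pos = [0.1 * i for i in range(len(obs))]
calls = viterbi_ibd2(obs, pos)
```

Candidate variants would then be kept only if they fall inside markers called IBD = 2, which is the filtering step the title refers to. A fuller model would use the actual genotypes and allele frequencies for the emission probabilities rather than a simple agree/disagree indicator.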

Availability: An R script and an accompanying tutorial are available at http://compbio.charite.de/index.php/ibd2.html.

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Consanguinity
  • Exons
  • Genes, Recessive*
  • Genetic Diseases, Inborn / genetics*
  • Genome, Human*
  • Genome-Wide Association Study / methods*
  • Genotype
  • Haplotypes
  • Humans
  • Inheritance Patterns
  • Markov Chains
  • Models, Genetic
  • Mutation