Detection of runs of homozygosity from whole exome sequencing data: state of the art and perspectives for clinical, population and epidemiological studies

Hum Hered. 2014;77(1-4):63-72. doi: 10.1159/000362412. Epub 2014 Jul 29.

Abstract

Runs of homozygosity (ROH) are sizeable stretches of homozygous genotypes at consecutive polymorphic DNA marker positions, traditionally captured by means of genome-wide single nucleotide polymorphism (SNP) genotyping. With the advent of next-generation sequencing (NGS) technologies, ROH detection from whole exome sequencing (WES) data has relied on two groups of methods: those initially devised for the analysis of SNP array data (sliding-window algorithms such as PLINK or GERMLINE, and graphical tools like HomozygosityMapper) and those specifically conceived for NGS data. In the latter group, algorithms for both graphical representation (AgileVariantMapper, HomSI) and computational detection (H³M²) of WES-derived ROH have been proposed. Here we examine these different approaches and discuss available strategies to implement ROH detection in WES analysis. Among sliding-window algorithms, PLINK appears to be well suited for the detection of ROH, especially long ones. As a method specifically tailored for WES data, H³M² outperforms existing algorithms, especially on short and medium ROH. We conclude that, notwithstanding the irregular distribution of exons, WES data can be used with some approximation for unbiased genome-wide analysis of ROH features, with promising applications to homozygosity mapping of disease genes, comparative analysis of populations and epidemiological studies based on consanguinity.
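
To make the definition of an ROH concrete, the following is a minimal Python sketch that reports maximal stretches of consecutive homozygous genotypes on one chromosome. It is a naive consecutive-run scan for illustration only: it is not the sliding-window procedure used by PLINK or GERMLINE, nor the method implemented in H³M². The function name `find_roh`, the thresholds `min_markers` and `min_length_bp`, the VCF-like genotype strings, and the choice to let a missing genotype end a run are all assumptions made for this example.

```python
from typing import List, Tuple


def find_roh(
    markers: List[Tuple[int, str]],
    min_markers: int = 3,        # hypothetical threshold, not from the paper
    min_length_bp: int = 1000,   # hypothetical threshold, not from the paper
) -> List[Tuple[int, int, int]]:
    """Return (start, end, n_markers) for each run of homozygous genotypes.

    `markers` holds (position, genotype) pairs for one chromosome, sorted by
    position, with genotypes written like "0/0", "0/1" or "1|1".
    """
    runs = []
    current = []  # positions in the homozygous stretch being extended

    # A trailing heterozygous sentinel closes a run that reaches the last marker.
    for pos, genotype in list(markers) + [(None, "0/1")]:
        allele_a, allele_b = genotype.replace("|", "/").split("/")
        # Assumption: a missing call ("./.") breaks the run like a heterozygote.
        if allele_a == allele_b and "." not in genotype:
            current.append(pos)
            continue
        long_enough = bool(current) and current[-1] - current[0] >= min_length_bp
        if len(current) >= min_markers and long_enough:
            runs.append((current[0], current[-1], len(current)))
        current = []
    return runs


example = [
    (100, "0/0"), (600, "1/1"), (1500, "1/1"),
    (2000, "0/1"),  # heterozygous call ends the first run
    (2500, "0/0"), (2600, "1/1"), (9000, "1/1"), (9500, "0/0"),
]
print(find_roh(example))  # [(100, 1500, 3), (2500, 9500, 4)]
```

In WES data the markers cluster within exons, so the span between the first and last marker of a run can cross large unobserved intergenic regions. This sketch makes no attempt to correct for that.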

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Exome / genetics*
  • Genome, Human / genetics*
  • High-Throughput Nucleotide Sequencing / methods*
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Homozygote*
  • Humans