Prioritizing genes for X-linked diseases using population exome data

Hum Mol Genet. 2015 Feb 1;24(3):599-608. doi: 10.1093/hmg/ddu473. Epub 2014 Sep 12.

Abstract

Many new disease genes can be identified through high-throughput sequencing. Yet, variant interpretation for the large amounts of genomic data remains a challenge given variation of uncertain significance and genes that lack disease annotation. As clinically significant disease genes may be subject to negative selection, we developed a prediction method that measures paucity of non-synonymous variation in the human population to infer gene-based pathogenicity. Integrating human exome data of over 6000 individuals from the NHLBI Exome Sequencing Project, we tested the utility of the prediction method based on the ratio of non-synonymous to synonymous substitution rates (dN/dS) on X-chromosome genes. A low dN/dS ratio characterized genes associated with childhood disease and outcome. Furthermore, we identify new candidates for diseases with early mortality and demonstrate intragenic localized patterns of variants that suggest pathogenic hotspots. Our results suggest that intrahuman substitution analysis is a valuable tool to help prioritize novel disease genes in sequence interpretation.
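The gene-level ratio described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not say how sites were counted or how rates were corrected, so the sketch assumes a simple per-site estimate in which dN is the number of non-synonymous variants divided by the number of non-synonymous sites, and dS is the synonymous equivalent. The names `GeneCounts`, `dn_ds`, and `rank_genes` are hypothetical, and the gene labels and counts are placeholder values rather than data from the study.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GeneCounts:
    """Per-gene variant and site counts (hypothetical input layout)."""
    gene: str
    nonsyn_variants: int   # observed non-synonymous variants in the population
    syn_variants: int      # observed synonymous variants in the population
    nonsyn_sites: float    # number of non-synonymous sites in the coding sequence
    syn_sites: float       # number of synonymous sites in the coding sequence


def dn_ds(g: GeneCounts) -> Optional[float]:
    """Return a simple per-site dN/dS estimate, or None if it is undefined.

    Assumption: no multiple-hit or mutation-rate correction is applied.
    """
    if g.nonsyn_sites <= 0 or g.syn_sites <= 0 or g.syn_variants == 0:
        return None  # dS would be zero or sites are missing; ratio undefined
    dn = g.nonsyn_variants / g.nonsyn_sites
    ds = g.syn_variants / g.syn_sites
    return dn / ds


def rank_genes(genes: List[GeneCounts]) -> List[Tuple[str, float]]:
    """Sort genes by ascending dN/dS, so the most constrained genes come first."""
    scored = []
    for g in genes:
        ratio = dn_ds(g)
        if ratio is not None:
            scored.append((g.gene, ratio))
    return sorted(scored, key=lambda item: item[1])


# Placeholder counts for illustration only; these are not real measurements.
toy_genes = [
    GeneCounts("GENE_B", nonsyn_variants=30, syn_variants=10,
               nonsyn_sites=750.0, syn_sites=250.0),
    GeneCounts("GENE_A", nonsyn_variants=3, syn_variants=12,
               nonsyn_sites=900.0, syn_sites=300.0),
    GeneCounts("GENE_C", nonsyn_variants=5, syn_variants=0,
               nonsyn_sites=600.0, syn_sites=200.0),
]

ranking = rank_genes(toy_genes)
for name, ratio in ranking:
    print(f"{name}\t{ratio:.3f}")
```

Genes with no synonymous variants are skipped here instead of being assigned an arbitrary ratio. A real analysis would need an explicit rule for that case, for example a pseudocount.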

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology / methods
  • Databases, Genetic
  • Exome
  • Genes, X-Linked*
  • Genetic Association Studies
  • Genetic Diseases, X-Linked / genetics*
  • Genetic Predisposition to Disease
  • Genetic Variation
  • Genome, Human*
  • Humans
  • Software
  • Transcriptome*