Aggregation of population-based genetic variation over protein domain homologues and its potential use in genetic diagnostics

Hum Mutat. 2017 Nov;38(11):1454-1463. doi: 10.1002/humu.23313. Epub 2017 Aug 31.

Abstract

Whole exomes of patients with a genetic disorder are now routinely sequenced, but interpretation of the identified genetic variants remains a major challenge. The increased availability of population-based human genetic variation has given rise to measures of genetic tolerance that have been used, for example, to predict disease-causing genes in neurodevelopmental disorders. Here, we investigated whether combining variant information from homologous protein domains can improve variant interpretation. For this purpose, we developed a framework that maps population variation and known pathogenic mutations onto 2,750 "meta-domains." These meta-domains consist of 30,853 homologous Pfam protein domain instances that cover 36% of all human protein-coding sequences. We find that genetic tolerance is consistent across protein domain homologues, and that patterns of genetic tolerance faithfully mimic patterns of evolutionary conservation. Furthermore, for a significant fraction (68%) of the meta-domains, high-frequency population variation recurs at the same positions across domain homologues more often than expected. In addition, we observe that the presence of pathogenic missense variants at an aligned homologous domain position is often paired with the absence of population variation, and vice versa. The use of these meta-domains can improve the interpretation of genetic variation.
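
The aggregation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' framework: the function name `aggregate_meta_domain`, the input layout, and the gene names and positions are all hypothetical toy values chosen for illustration. The sketch assumes each domain instance already comes with a mapping from its protein positions to shared aligned (consensus) positions, and it simply counts, per aligned position, how many homologous instances carry population variation or a known pathogenic missense variant.

```python
from collections import OrderedDict


def aggregate_meta_domain(instances):
    """Count variants per aligned position across homologous domain instances.

    Each instance is a dict with (hypothetical) keys:
      "alignment":  {protein_position: aligned_position}
      "population": set of protein positions with population variation
      "pathogenic": set of protein positions with pathogenic missense variants
    """
    counts = OrderedDict()
    for inst in instances:
        for prot_pos, aligned_pos in sorted(inst["alignment"].items()):
            # Make sure every aligned position appears, even with zero variants.
            entry = counts.setdefault(
                aligned_pos, {"population": 0, "pathogenic": 0}
            )
            if prot_pos in inst["population"]:
                entry["population"] += 1
            if prot_pos in inst["pathogenic"]:
                entry["pathogenic"] += 1
    return counts


# Toy data: two made-up domain instances aligned over three positions.
instances = [
    {"gene": "GENE_A", "alignment": {10: 0, 11: 1, 12: 2},
     "population": {10}, "pathogenic": {12}},
    {"gene": "GENE_B", "alignment": {55: 0, 56: 1, 57: 2},
     "population": {55, 56}, "pathogenic": {57}},
]

result = aggregate_meta_domain(instances)
for aligned_pos, entry in result.items():
    print(aligned_pos, entry)
```

In this toy input, aligned position 2 carries pathogenic variants in both instances and no population variation, while position 0 shows the opposite pattern, which is the kind of contrast the abstract reports. A real analysis would additionally need allele frequencies, variant consequence filtering, and handling of alignment gaps, none of which are modeled here.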

Keywords: ExAC; HGMD; Pfam; evolutionary conservation; functional variation; genetic tolerance; meta-domains; pathogenicity; protein domain homology; variant interpretation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adaptation, Biological / genetics
  • Chromosome Mapping
  • Computational Biology / methods
  • Conserved Sequence
  • Evolution, Molecular
  • Exome
  • Exome Sequencing
  • Gene Ontology
  • Genetic Testing*
  • Genetic Variation*
  • Genetics, Population* / methods
  • Genomics / methods
  • Genotype
  • Humans
  • Protein Domains / genetics*