Discriminating between rare benign and pathogenic variation is a key challenge in clinical genetics, particularly as increasing numbers of nonsynonymous single-nucleotide polymorphisms (SNPs) are identified in resequencing studies. Here, we describe an approach for the functional annotation of nonsynonymous variants that uses multiple sequence alignment to identify functionally important, disease-causing residues across protein families. We applied the methodology to long QT syndrome (LQT) genes, in which variants can cause sudden death, and to their paralogues, in which variants largely cause neurological disease. This approach accurately classified known LQT disease-causing variants (positive predictive value = 98.4%) and outperformed established bioinformatic methods. The analysis also identified 1078 new putative disease loci, which we incorporated, along with known variants, into a comprehensive and freely accessible long QT resource (http://cardiodb.org/Paralogue_Annotation/) based on newly created Locus Reference Genomic sequences (http://www.lrg-sequence.org/). We propose that paralogous annotation is widely applicable to Mendelian human disease genes.
© 2012 Wiley Periodicals, Inc.
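
The core idea of paralogue annotation, transferring a known disease-causing residue in one gene to the aligned residue in its paralogues, can be illustrated with a minimal sketch. This is not the authors' pipeline: the gene names (`GENE_A`, `GENE_B`, `GENE_C`), the toy aligned sequences, the variant positions, and all function names are hypothetical, and any further filters a real analysis might apply (for example, requiring a conserved reference amino acid) are omitted. The positive predictive value is computed in the standard way, as true positives divided by all positive calls.

```python
GAP = "-"

def column_to_residue_index(aligned_seq):
    """Map each alignment column to a 1-based residue position (None at gaps)."""
    mapping = []
    pos = 0
    for ch in aligned_seq:
        if ch == GAP:
            mapping.append(None)
        else:
            pos += 1
            mapping.append(pos)
    return mapping

def transfer_annotations(alignment, known_variants):
    """Propagate known disease-causing positions to aligned paralogue residues.

    alignment:      dict of gene -> aligned sequence (all the same length)
    known_variants: dict of gene -> set of 1-based residue positions
    Returns a dict of gene -> {position: [(source_gene, source_position), ...]}.
    """
    col_maps = {g: column_to_residue_index(s) for g, s in alignment.items()}

    # Collect the alignment columns that carry a known variant in any gene.
    flagged = {}
    for gene, positions in known_variants.items():
        for col, pos in enumerate(col_maps[gene]):
            if pos in positions:
                flagged.setdefault(col, []).append((gene, pos))

    # Annotate the equivalent residue in every other gene of the family.
    predictions = {g: {} for g in alignment}
    for col, sources in flagged.items():
        for gene, cmap in col_maps.items():
            pos = cmap[col]
            if pos is None:  # gap: no equivalent residue in this gene
                continue
            hits = [(g, p) for g, p in sources if g != gene]
            if hits:
                predictions[gene][pos] = hits
    return predictions

def positive_predictive_value(true_pos, false_pos):
    """PPV = TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)

# Hypothetical toy protein family (illustrative sequences, not real data).
alignment = {
    "GENE_A": "MKT-LRG",
    "GENE_B": "MRTALKG",
    "GENE_C": "M-TALRG",
}
known_variants = {"GENE_B": {6}, "GENE_C": {3}}

predictions = transfer_annotations(alignment, known_variants)
```

In this toy family, the variant at residue 6 of `GENE_B` is transferred to the aligned residues of `GENE_A` and `GENE_C`, whereas the variant at residue 3 of `GENE_C` reaches only `GENE_B`, because `GENE_A` has a gap in that column.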