Status quo of annotation of human disease variants

Hanka Venselaar; Franscesca Camilli; Shima Gholizadeh; Marlou Snelleman; Han G Brunner; Gert Vriend

doi:10.1186/1471-2105-14-352

BMC Bioinformatics. 2013 Dec 4:14:352. doi: 10.1186/1471-2105-14-352.

Authors

Hanka Venselaar¹, Franscesca Camilli, Shima Gholizadeh, Marlou Snelleman, Han G Brunner, Gert Vriend

Affiliation

¹ CMBI, NCMLS, Radboud University Nijmegen Medical Centre, Nijmegen, PO Box 9101, Nijmegen, HB 6500, The Netherlands.

Abstract

Background: Ongoing technical developments in Next Generation Sequencing have led to an increase in detected disease-related mutations. Many bioinformatics approaches exist to analyse these variants; of those, the methods that use 3D structure information generally outperform those that do not. 3D structure information is currently available for about twenty percent of the human exome, and homology modelling can double that fraction. This percentage is rapidly increasing, so we can expect that the majority of all human exome variants can be analysed using protein structure information in the near future.

Results: We collected a test dataset of well-described mutations in proteins for which 3D structure information is available. This test dataset was used to analyse the possibilities and limitations of methods based on sequence information alone, hybrid methods, machine-learning-based methods, and structure-based methods.

Conclusions: Our analysis shows that the use of structural features improves the classification of mutations. This study suggests strategies for future analyses of disease-causing mutations and indicates which bioinformatics approaches should be developed to make progress in this field.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Artificial Intelligence
Cluster Analysis
Computational Biology / methods*
Conserved Sequence / genetics
Databases, Genetic
Exome / genetics
Genetic Variation*
Genome, Human / genetics
High-Throughput Nucleotide Sequencing / methods
High-Throughput Nucleotide Sequencing / trends
Humans
Molecular Sequence Annotation / methods*
Mutation / genetics
Polymorphism, Single Nucleotide / genetics
Proteins / chemistry
Proteins / genetics*
Sequence Alignment / trends
Sequence Homology, Amino Acid

Substances

Proteins