Credibility analysis of putative disease-causing genes using bioinformatics

PLoS One. 2013 Jun 5;8(6):e64899. doi: 10.1371/journal.pone.0064899. Print 2013.

Abstract

Background: Genetic studies are challenging in many complex diseases, particularly those with limited diagnostic certainty, low prevalence, or onset in old age. As a result, genes may be reported as disease-causing with varying levels of evidence, and in some cases the data may be so limited as to be indistinguishable from chance findings. When there are large numbers of such genes, an objective method for ranking the evidence is useful. Using the complex neurodegenerative disease amyotrophic lateral sclerosis (ALS) as a model, and the disease-specific database ALSoD, the objective is to develop a method that uses publicly available data to generate a credibility score for putative disease-causing genes.

Methods: Genes with at least one publication suggesting involvement in adult-onset familial ALS were collated following an exhaustive literature search. SQL was used to generate a score by extracting information from the publications, and this score was combined with a pathogenicity analysis using bioinformatics tools. The resulting score allowed us to rank genes in order of credibility. To validate the method, we compared the objective ranking with a ranking generated by ALS genetics experts. Spearman's Rho was used to compare rankings generated by the different methods.

Results: The automated method ranked ALS genes in the following order: TARDBP, FUS, ANG, SPG11, NEFH, OPTN, ALS2, SETX, FIG4, VAPB, DCTN1, TAF15, VCP, DAO. This agreed well with the ranking produced by ALS genetics experts, with Spearman's Rho of 0.69 (P = 0.009).
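
The rank comparison described above can be illustrated with a minimal sketch of Spearman's Rho for two tie-free orderings of the same items, using rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference in rank positions. This is not the authors' code. The expert ranking is not given in this abstract, so the second ordering below is invented purely for illustration, and the function name spearman_rho is an assumption. Only the first five genes of the automated ranking are used.

```python
def spearman_rho(ranking_a, ranking_b):
    """Spearman's Rho for two orderings of the same items (no ties)."""
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two items to compare rankings")
    if len(ranking_b) != n or len(set(ranking_a)) != n or set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same unique items")
    # Position of each item in the second ranking (0 = top rank).
    position_b = {item: index for index, item in enumerate(ranking_b)}
    # Sum of squared rank differences over all items.
    sum_d_squared = sum(
        (index_a - position_b[item]) ** 2
        for index_a, item in enumerate(ranking_a)
    )
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# First five genes of the automated ranking reported in the Results.
automated = ["TARDBP", "FUS", "ANG", "SPG11", "NEFH"]
# HYPOTHETICAL comparison ordering, invented for illustration only;
# it is NOT the expert ranking from the study.
hypothetical_other = ["FUS", "TARDBP", "ANG", "NEFH", "SPG11"]

rho = spearman_rho(automated, hypothetical_other)
print(f"Spearman's Rho = {rho:.2f}")
```

Identical orderings give a Rho of 1 and fully reversed orderings give -1. A P value, as reported in the Results, would additionally require a significance test, which this sketch does not include.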

Conclusion: We have presented an automated method for scoring the level of evidence for a gene being disease-causing. In developing the method we have used the model disease ALS, but it could equally be applied to any disease in which there is genotypic uncertainty.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Amyotrophic Lateral Sclerosis / genetics
  • Computational Biology
  • Genetic Association Studies*
  • Genetic Predisposition to Disease*
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Reproducibility of Results
  • Statistics, Nonparametric

Grants and funding

The authors are especially grateful for the long-standing and continued funding of this project from the ALS Association and the MND Association of Great Britain and Northern Ireland. They also thank ALS Canada, MNDA Iceland and the ALS Therapy Alliance for support. The research leading to these results has received funding from the European Community's Health Seventh Framework Programme FP7/2007–2013 under grant agreement number 259867. AA-C receives salary support from the National Institute for Health Research (NIHR) Dementia Biomedical Research Unit at South London and Maudsley NHS Foundation Trust and King's College London. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health. Aleks Radunovic, Nigel Leigh, and Ian Gowrie originally conceived ALSoD. ALSoD is a joint project of the World Federation of Neurology (WFN) and the European Network for the Cure of ALS (ENCALS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.