Prioritizing Clinically Relevant Copy Number Variation from Genetic Interactions and Gene Function Data

PLoS One. 2015 Oct 5;10(10):e0139656. doi: 10.1371/journal.pone.0139656. eCollection 2015.

Abstract

Computational methods are increasingly needed to identify the few disease-causing variants among the hundreds discovered in each patient. This problem is especially relevant for Copy Number Variants (CNVs), which can be interrogated with the low-cost hybridization arrays commonly used in clinical practice. We present a method to predict the disease relevance of CNVs that combines functional context and clinical phenotype to discover clinically harmful CNVs (and likely causative genes) in patients with a variety of phenotypes. We compare several feature and gene weighting systems for classifying both genes and CNVs. We combined the best-performing methodologies and parameters and applied them to over 2,500 Agilent CGH 180k Microarray CNVs derived from 140 patients. Our method achieved an F-score of 91.59%, with 87.08% precision and 97.00% recall. Our methods are freely available at https://github.com/compbio-UofT/cnv-prioritization. Our dataset is included with the supplementary information.
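
For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how an F-score is conventionally derived from precision and recall. It is not taken from the authors' repository; the function name `f_score` and the `beta` parameter are illustrative assumptions. The abstract does not state how its figures were aggregated, so the single harmonic mean computed here need not reproduce the reported 91.59% exactly (for example, a score averaged over several evaluation runs can differ from the harmonic mean of the averaged precision and recall).

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 is the harmonic mean of precision and recall.

    Hypothetical helper for illustration only, not part of the published code.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    denominator = beta ** 2 * precision + recall
    if denominator == 0.0:
        # Both inputs are zero: define the score as 0 rather than dividing by zero.
        return 0.0
    return (1.0 + beta ** 2) * precision * recall / denominator


# Precision and recall as quoted in the abstract, expressed as fractions.
reported_precision = 0.8708
reported_recall = 0.9700

f1 = f_score(reported_precision, reported_recall)
print(f"F1 from the quoted precision and recall: {f1:.4f}")
```

The high recall relative to precision suggests a classifier tuned to miss few harmful CNVs at the cost of some false positives, which is a common preference when the output is a shortlist for clinical review.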

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Causality
  • Congenital Abnormalities / genetics
  • Developmental Disabilities / genetics
  • Gene Dosage*
  • Gene Ontology*
  • Gene Regulatory Networks
  • Genetic Association Studies*
  • Genetic Diseases, Inborn / genetics*
  • Genetic Predisposition to Disease
  • Genetic Variation
  • Humans
  • Models, Genetic
  • Mutation
  • Oligonucleotide Array Sequence Analysis

Grants and funding

This research was funded by the Ontario Research Fund (ORF) Genomes To Life (GL2) program.