Prediction of rare single-nucleotide causative mutations for muscular diseases in pooled next-generation sequencing experiments

Maria Brigida Ferraro; Marco Savarese; Giuseppina Di Fruscio; Vincenzo Nigro; Mario Rosario Guarracino

doi:10.1089/cmb.2014.0037

J Comput Biol. 2014 Sep;21(9):665-75. doi: 10.1089/cmb.2014.0037. Epub 2014 Jul 16.

Authors

Maria Brigida Ferraro¹, Marco Savarese, Giuseppina Di Fruscio, Vincenzo Nigro, Mario Rosario Guarracino

Affiliation

¹ Department of Statistical Sciences, Sapienza University of Rome, Rome, Italy.

PMID: 25029289
DOI: 10.1089/cmb.2014.0037

Abstract

Next-generation sequencing (NGS) is a new approach for biomedical research, useful for the diagnosis of genetic diseases in extremely heterogeneous conditions. In this work, we describe how data generated by high-throughput NGS experiments can be analyzed to find single-nucleotide polymorphisms (SNPs) in DNA samples of patients affected by neuromuscular disorders. In particular, we consider untagged pooled NGS data, where DNA samples of different individuals are combined in a single experiment, still providing information with an uncertainty limited to only two patients. At the moment, only a few publications address the problem of SNP detection in pooled experiments, and existing tools are often inaccurate. We propose a computational procedure consisting of two parts. In the first, data are filtered by means of decision rules. The second phase is based on a supervised classification technique. In the present work, we compare different de facto standard supervised and unsupervised procedures to identify and classify variants potentially related to muscular diseases, and we discuss results in terms of statistical and biological validation.
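The two-part procedure outlined in the abstract (rule-based filtering followed by supervised classification) can be illustrated with a minimal sketch. The abstract does not state the actual decision rules, features, or classifier, so everything below is an assumption made for illustration: the `Candidate` fields, the thresholds in `passes_rules`, and the nearest-centroid classifier are hypothetical stand-ins, not the authors' method. The only fact relied on is arithmetic: in an untagged pool of n diploid samples, a heterozygous variant carried by one individual accounts for roughly 1/(2n) of the reads at that position.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Candidate:
    """A candidate variant position in a pooled experiment (hypothetical fields)."""
    depth: int        # total reads covering the position
    alt_reads: int    # reads supporting the alternative allele
    mean_qual: float  # mean base quality of the alternative-allele reads


def passes_rules(c, pool_size, min_depth=100, min_qual=20.0, tolerance=0.5):
    """Phase 1: decision-rule filter (thresholds are illustrative assumptions)."""
    if c.depth < min_depth or c.mean_qual < min_qual:
        return False
    # One heterozygous carrier among pool_size diploid samples contributes
    # 1 of 2 * pool_size alleles at the position.
    expected_fraction = 1.0 / (2 * pool_size)
    return c.alt_reads / c.depth >= tolerance * expected_fraction


def features(c):
    """Map a candidate to a small numeric feature vector (assumed features)."""
    return (c.alt_reads / c.depth, c.mean_qual / 40.0)


def fit_centroids(labelled):
    """Phase 2, training: average the feature vectors of each labelled class."""
    sums = {}
    for cand, label in labelled:
        f = features(cand)
        total, n = sums.get(label, ((0.0,) * len(f), 0))
        sums[label] = (tuple(a + b for a, b in zip(total, f)), n + 1)
    return {label: tuple(x / n for x in total) for label, (total, n) in sums.items()}


def classify(c, centroids):
    """Phase 2, prediction: assign the label of the nearest class centroid."""
    f = features(c)
    return min(centroids, key=lambda label: dist(f, centroids[label]))


# Toy, made-up training data: variants labelled as confirmed or artefact.
training = [
    (Candidate(1000, 60, 35.0), "true"),
    (Candidate(1000, 55, 34.0), "true"),
    (Candidate(1000, 8, 22.0), "false"),
    (Candidate(1000, 12, 21.0), "false"),
]
centroids = fit_centroids(training)

query = Candidate(1000, 58, 33.0)
if passes_rules(query, pool_size=8):
    print(classify(query, centroids))
```

Running the filter first keeps the classifier from being trained or queried on positions with too little coverage or too few supporting reads to be informative. In practice any supervised method could replace the nearest-centroid step; it is used here only because it fits in a few lines without external libraries.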

Keywords: damaging mutations; muscular diseases; next generation sequencing; prediction; single-nucleotide polymorphism.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Amino Acid Substitution
Genetic Association Studies / methods*
High-Throughput Nucleotide Sequencing*
Humans
Muscular Diseases / diagnosis*
Mutation
Polymorphism, Single Nucleotide*
Sequence Analysis, DNA

Grants and funding

TGM11Z06/TI_/Telethon/Italy