Identifying disease-associated copy number variations by a doubly penalized regression model

Biometrics. 2018 Dec;74(4):1341-1350. doi: 10.1111/biom.12920. Epub 2018 Jun 12.

Abstract

Copy number variation (CNV) of DNA plays an important role in the development of many diseases. However, because CNVs are irregular and sparse, studying the association between CNVs and a disease outcome or trait can be challenging. Few methods have been proposed in the literature for this problem. Most current researchers rely on an ad hoc two-stage procedure that first identifies CNVs in each individual genome and then performs an association test using the identified CNVs. This potentially leads to information loss and, as a result, lower power to identify disease-associated CNVs. In this article, we describe a new method that combines the two steps into a single coherent model to identify common CNVs across patients that are associated with certain diseases. We use a double penalty model to capture the CNVs' association with both the intensities and the disease trait. We validate its performance on simulated datasets and on a data example concerning platinum resistance and CNV in the ovarian cancer genome.
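
To give a rough sense of what a regression objective with two penalties can look like, the sketch below evaluates a generic fused-lasso-style loss: a squared-error term, an L1 penalty that encourages sparse coefficients, and a penalty on differences between adjacent coefficients that encourages contiguous segments. This is an illustration only and is not the model from the article, whose exact formulation is not given in the abstract. The function name, the parameter names (`lam_sparse`, `lam_smooth`), and the choice of squared-error loss are all assumptions.

```python
import numpy as np


def fused_lasso_objective(X, y, beta, lam_sparse, lam_smooth):
    """Evaluate an illustrative doubly penalized objective (hypothetical).

    X    : (n_samples, n_probes) array of probe intensities
    y    : (n_samples,) array holding a quantitative trait
    beta : (n_probes,) array of coefficients along the genome
    """
    # Data-fit term: half the mean squared residual.
    residual = y - X @ beta
    loss = 0.5 * np.mean(residual ** 2)

    # Penalty 1: L1 norm of beta, pushing most coefficients to zero.
    sparsity = lam_sparse * np.sum(np.abs(beta))

    # Penalty 2: L1 norm of adjacent differences, favoring
    # piecewise-constant runs of coefficients (segment-like signals).
    smoothness = lam_smooth * np.sum(np.abs(np.diff(beta)))

    return loss + sparsity + smoothness


# Toy usage with made-up numbers (not data from the article).
X_toy = np.eye(3)
y_toy = np.array([1.0, 1.0, 0.0])
beta_toy = np.array([1.0, 1.0, 0.0])
value = fused_lasso_objective(X_toy, y_toy, beta_toy, 0.1, 0.2)
```

In a sketch like this, raising `lam_sparse` selects fewer probes, while raising `lam_smooth` merges neighboring probes into shared segments; in practice both would typically be tuned, for example by cross-validation.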

Keywords: Association study; Copy number variation; Ovarian cancer; Penalized regression model.

MeSH terms

  • Biometry / methods*
  • Computer Simulation / statistics & numerical data*
  • DNA Copy Number Variations*
  • Drug Resistance, Neoplasm
  • Female
  • Genome, Human / genetics
  • Humans
  • Outcome Assessment, Health Care
  • Ovarian Neoplasms* / drug therapy
  • Ovarian Neoplasms* / genetics
  • Platinum / pharmacology
  • Platinum / therapeutic use
  • Regression Analysis*

Substances

  • Platinum