Evaluation of microarray-based DNA methylation measurement using technical replicates: the Atherosclerosis Risk In Communities (ARIC) Study

BMC Bioinformatics. 2014 Sep 19;15(1):312. doi: 10.1186/1471-2105-15-312.

Abstract

Background: DNA methylation is a widely studied epigenetic phenomenon; alterations in methylation patterns influence human phenotypes and risk of disease. As part of the Atherosclerosis Risk in Communities (ARIC) study, the Illumina Infinium HumanMethylation450 (HM450) BeadChip was used to measure DNA methylation in peripheral blood obtained from ~3000 African American study participants. Over 480,000 cytosine-guanine (CpG) dinucleotide sites were surveyed on the HM450 BeadChip. To evaluate the impact of technical variation, 265 technical replicates from 130 participants were included in the study.

Results: For each CpG site, we calculated the intraclass correlation coefficient (ICC), which ranges between 0 and 1, to compare the variation of methylation levels within and between replicate pairs. We modeled the distribution of ICC as a mixture of censored or truncated normal and normal distributions using an expectation-maximization (EM) algorithm. The CpG sites were clustered into low- and high-reliability groups according to the calculated posterior probabilities. We also demonstrated the performance of this clustering when applied to a study of association between methylation levels and smoking status of individuals. Of the CpG sites showing genome-wide significant association with smoking status, most (~96%) belonged to the high-reliability cluster.
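The abstract does not state which ICC estimator was used, so the following is only a minimal sketch, assuming a one-way random-effects ICC computed at a single CpG site from equal-sized replicate groups. The function name `icc_oneway`, the input layout, and the clipping of negative estimates to 0 (to match the stated 0 to 1 range) are assumptions made for illustration, not the authors' implementation.

```python
def icc_oneway(groups):
    """One-way random-effects ICC for one CpG site (illustrative sketch).

    groups: list of lists; each inner list holds the methylation values
    measured on the technical replicates of one participant.
    Assumes every participant has the same number (>= 2) of replicates.
    """
    n = len(groups)
    k = len(groups[0]) if groups else 0
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need >= 2 participants, each with the same number (>= 2) of replicates")

    means = [sum(g) / k for g in groups]
    grand = sum(means) / n

    # Between-participant and within-participant mean squares (one-way ANOVA).
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (n * (k - 1))

    denom = msb + (k - 1) * msw
    if denom == 0:
        return 0.0  # no variation at all: reliability is undefined, report 0 (assumption)

    icc = (msb - msw) / denom
    # The raw estimator can be negative; clip to [0, 1] (assumption).
    return min(1.0, max(0.0, icc))


# Hypothetical values: replicates agree closely, participants differ.
reliable_site = [[0.10, 0.11], [0.50, 0.49], [0.90, 0.91]]
# Hypothetical values: replicates disagree, participant means are identical.
noisy_site = [[0.1, 0.9], [0.9, 0.1], [0.5, 0.5]]
print(icc_oneway(reliable_site), icc_oneway(noisy_site))
```

A site whose replicates agree while participants differ yields an ICC near 1, whereas a site dominated by replicate-to-replicate noise yields an ICC near 0.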

Conclusions: We suggest that CpG sites with low ICC be excluded from subsequent association analyses, or that associations at such sites be interpreted with extra caution.
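One way to act on this suggestion is to cluster per-site ICC values and keep only sites assigned to the high-reliability group. The sketch below is a simplification: it fits a mixture of two plain normal distributions by EM, not the censored or truncated normal plus normal mixture described in the Results. The function names, the starting values, and the 0.5 posterior cutoff are all assumptions for illustration.

```python
import math


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_high(values, weight, mu, sd):
    """Posterior probability that each value belongs to component 1."""
    post = []
    for x in values:
        p0 = weight[0] * normal_pdf(x, mu[0], sd[0])
        p1 = weight[1] * normal_pdf(x, mu[1], sd[1])
        total = p0 + p1
        post.append(p1 / total if total > 0 else 0.5)  # guard against underflow
    return post


def high_reliability_posteriors(iccs, n_iter=200, min_sd=0.01):
    """Fit a two-component normal mixture by EM and return, for each ICC,
    the posterior probability of the component with the larger mean."""
    lo, hi = min(iccs), max(iccs)
    mu = [lo, hi]                           # start the components at the extremes
    sd = [max((hi - lo) / 4.0, min_sd)] * 2
    weight = [0.5, 0.5]

    for _ in range(n_iter):
        post = posterior_high(iccs, weight, mu, sd)        # E-step
        for c in (0, 1):                                   # M-step
            resp = [p if c == 1 else 1.0 - p for p in post]
            mass = sum(resp)
            if mass == 0:
                continue
            mu[c] = sum(r * x for r, x in zip(resp, iccs)) / mass
            var = sum(r * (x - mu[c]) ** 2 for r, x in zip(resp, iccs)) / mass
            sd[c] = max(math.sqrt(var), min_sd)            # floor keeps sd positive
            weight[c] = mass / len(iccs)

    post = posterior_high(iccs, weight, mu, sd)
    if mu[0] > mu[1]:                       # make sure "high" means larger mean
        post = [1.0 - p for p in post]
    return post


# Hypothetical per-site ICC values, for illustration only.
iccs = [0.02, 0.05, 0.08, 0.10, 0.04, 0.80, 0.85, 0.90, 0.78, 0.95]
post = high_reliability_posteriors(iccs)
kept_sites = [i for i, p in enumerate(post) if p >= 0.5]   # assumed cutoff
print(kept_sites)
```

Sites that fall below the cutoff need not be discarded outright; they could instead be flagged so that any association found there is treated with extra caution.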

Publication types

  • Evaluation Study
  • Research Support, American Recovery and Reinvestment Act
  • Research Support, N.I.H., Extramural

MeSH terms

  • Atherosclerosis / genetics*
  • Cluster Analysis
  • CpG Islands / genetics
  • DNA Methylation*
  • Epigenomics / methods*
  • Female
  • Genetic Predisposition to Disease / genetics*
  • Humans
  • Male
  • Middle Aged
  • Oligonucleotide Array Sequence Analysis*
  • Reproducibility of Results
  • Residence Characteristics*