Background: Because cancer is heterogeneous, identifying differentially methylated (DM) CpG sites between a set of cancer samples and a set of normal samples does not reveal which individual patients carry a methylation aberration at a particular DM CpG site.
Methods: We first showed that the relative methylation-level orderings (RMOs) of CpG sites within individual normal lung tissues are highly stable but widely disrupted in lung adenocarcinoma tissues. This finding provides the basis for using the RankComp algorithm, previously developed for differential gene expression analysis at the individual level, to identify DM CpG sites in each cancer tissue compared with its own normal state. Briefly, by comparing a disease sample with the highly stable normal RMOs predetermined in a large collection of normal lung tissue samples, the algorithm uses Fisher's exact test to find those CpG sites whose hyper- or hypomethylation may account for the disrupted RMOs of CpG site pairs within that sample.
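The rank-based idea described above can be illustrated with a minimal sketch. It is a simplified illustration, not the published RankComp implementation: the function names, the 0.99 stability threshold, and the toy beta values are all assumptions made for this example, and the sketch omits any filtering or multiple-testing correction the full algorithm may apply. For each CpG site, it counts how often the site ranks above its stable partners in normal tissue versus in one disease sample, and applies a one-sided Fisher's exact test (computed from the hypergeometric distribution) to the resulting 2x2 table.

```python
from itertools import combinations
from math import comb


def stable_pairs(normals, min_fraction=0.99):
    """Map (i, j) -> True if site i > site j in at least min_fraction of the
    normal samples, False if site i < site j in at least that fraction.
    Pairs without a stable ordering are left out. (Threshold is an assumption.)"""
    n = len(normals)
    pairs = {}
    for i, j in combinations(range(len(normals[0])), 2):
        i_higher = sum(s[i] > s[j] for s in normals)
        j_higher = sum(s[i] < s[j] for s in normals)
        if i_higher >= min_fraction * n:
            pairs[(i, j)] = True
        elif j_higher >= min_fraction * n:
            pairs[(i, j)] = False
    return pairs


def upper_tail(k, row_total, col_total, grand_total):
    """One-sided Fisher's exact test: P(X >= k) for a hypergeometric X."""
    top = min(row_total, col_total)
    ways = sum(
        comb(col_total, x) * comb(grand_total - col_total, row_total - x)
        for x in range(k, top + 1)
    )
    return ways / comb(grand_total, row_total)


def site_p_values(sample, pairs, site):
    """Return (p_hyper, p_hypo) for one CpG site in one disease sample."""
    a = b = c = d = 0  # a/b: site higher/lower in normal; c/d: same in disease
    for (i, j), i_is_higher in pairs.items():
        if site not in (i, j):
            continue
        partner = j if site == i else i
        if sample[site] == sample[partner]:
            continue  # ties carry no ordering information
        normal_higher = i_is_higher if site == i else not i_is_higher
        disease_higher = sample[site] > sample[partner]
        a += int(normal_higher)
        b += int(not normal_higher)
        c += int(disease_higher)
        d += int(not disease_higher)
    n = a + b  # equals c + d: number of informative stable pairs
    p_hyper = upper_tail(c, n, a + c, 2 * n)
    p_hypo = upper_tail(d, n, b + d, 2 * n)
    return p_hyper, p_hypo


# Toy data (hypothetical beta values): six CpG sites, three normal samples.
normals = [
    [0.10, 0.20, 0.30, 0.40, 0.50, 0.60],
    [0.12, 0.22, 0.31, 0.41, 0.52, 0.61],
    [0.09, 0.19, 0.29, 0.42, 0.49, 0.63],
]
disease = [0.90, 0.20, 0.30, 0.40, 0.50, 0.60]  # site 0 is hypermethylated

pairs = stable_pairs(normals)
for site in range(len(disease)):
    print(site, site_p_values(disease, pairs, site))
```

In this toy sample only site 0 reverses its ordering against every stable partner, so it is the only site with a small hypermethylation p-value; the other sites each lose a single ordering (against site 0) and are not flagged.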
Results: Evaluated in 59 lung adenocarcinoma tissues with paired adjacent normal tissues, RankComp reached an average precision of 94.26% for individual-level DM CpG sites. After identifying DM CpG sites in each of the 539 lung adenocarcinoma samples from TCGA, we found five CpG sites hypermethylated and 44 CpG sites hypomethylated in more than 90% of the disease samples. These findings were validated in 140 publicly available and eight additionally measured paired cancer-normal samples. Gene expression analysis revealed that four of the five genes (HOXA9, TAL1, ATP8A2, ENG and SPARCL1), each harboring one of the five frequently hypermethylated CpG sites within its promoter, were also frequently down-regulated in lung adenocarcinoma.
Conclusions: The common DNA methylation aberrations in lung adenocarcinoma tissues may be important for lung adenocarcinoma diagnosis and therapy.
Keywords: DNA methylation; Differentially methylated CpG sites; Lung adenocarcinoma; Relative methylation level orderings.