The analysis of DNA methylation (DNAm) levels at specific CpG sites is one of the most promising molecular techniques for estimating an individual's age. To date, many studies have reported age prediction models based on DNAm in body fluids, but only a few have used buccal swabs. The objective of this study was to identify age-dependent CpG methylation sites in three genes (HOXC4, TRIM59, and ELOVL2) in buccal swab samples from the Chinese Han population. A total of 461 buccal swabs, from donors aged 0.4-80.8 years, were divided into a training set (n = 325) and a validation set (n = 136). Samples were analyzed by pyrosequencing to identify age-related genes on the basis of correlation coefficients. A random forest regression model including eight CpGs in the three genes was ultimately proposed, with a mean absolute error (MAE) of 2.119 years. In the independent validation set, the model yielded an MAE of 4.391 years. Our findings show that buccal swabs are a suitable alternative to other biological traces for age prediction based on DNAm patterns using pyrosequencing and random forest regression, with the additional advantage of noninvasive collection.
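The two statistics named above, the correlation coefficient used to screen candidates and the MAE used to score predictions, can be illustrated with a minimal sketch. It assumes a Pearson correlation (the abstract does not state which coefficient was used), it does not implement the random forest itself, and every number in it is a made-up toy value, not data from this study. The function names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between predicted and true ages, in years."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - t) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Toy values for illustration only (NOT from the study).
ages = [5.0, 20.0, 35.0, 50.0, 65.0]            # chronological age, years
methylation = [10.0, 25.0, 40.0, 55.0, 70.0]    # hypothetical DNAm %, one CpG
predicted = [7.0, 18.0, 38.0, 46.0, 66.0]       # hypothetical model output

r = pearson_r(methylation, ages)                # screening statistic for one CpG
mae = mean_absolute_error(ages, predicted)      # prediction error in years
print(f"r = {r:.3f}, MAE = {mae:.3f} years")
```

In a workflow of this kind, a site with a large absolute correlation would be a candidate predictor, and the MAE would be computed separately on the training and validation sets.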
Keywords: DNA methylation; age estimation; buccal swab samples; pyrosequencing; random forest regression.
© 2024 Wiley‐VCH GmbH.