Background: Pigs (Sus scrofa) provide relevant biomedical models to dissect complex diseases due to their anatomical, genetic, and physiological similarities with humans. Aberrant DNA methylation has been linked to many of these diseases and is associated with gene expression; however, the functional similarities and differences between porcine and human DNA methylation patterns are largely unknown.
Methods: DNA and RNA were isolated from eight tissue samples (fat, heart, kidney, liver, lung, lymph node, muscle, and spleen) from the adult female Duroc utilized for the pig genome sequencing project. Reduced representation bisulfite sequencing (RRBS) and RNA-seq were performed on an Illumina HiSeq 2000. RRBS reads were aligned using BS-Seeker2, and only sites with a minimum depth of 10 reads were used for methylation analysis. RNA-seq reads were aligned using TopHat, and expression analysis was performed using Cufflinks. In addition, SNP calling was performed using GATK on targeted control reads for CpG site validation and on whole-genome sequencing (WGS) reads for allelic expression analysis.
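The coverage filter described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `CytosineCall` record and the `methylation_levels` function are hypothetical names, and the input is assumed to be per-site counts of methylated and total reads already extracted from the aligner output. Only the threshold of 10 reads comes from the text.

```python
from dataclasses import dataclass

MIN_DEPTH = 10  # minimum read depth per site, as stated in the Methods


@dataclass
class CytosineCall:
    """Hypothetical per-site record: read counts at one cytosine."""
    chrom: str
    pos: int
    methylated_reads: int
    total_reads: int


def methylation_levels(calls, min_depth=MIN_DEPTH):
    """Return {(chrom, pos): methylated fraction} for sites with enough coverage.

    Sites covered by fewer than `min_depth` reads are dropped, because a
    fraction estimated from a handful of reads is too noisy to interpret.
    """
    return {
        (c.chrom, c.pos): c.methylated_reads / c.total_reads
        for c in calls
        if c.total_reads >= min_depth
    }


if __name__ == "__main__":
    example = [
        CytosineCall("chr1", 100, 8, 10),   # kept: depth 10
        CytosineCall("chr1", 200, 3, 9),    # dropped: depth 9
        CytosineCall("chr2", 50, 0, 25),    # kept: unmethylated site
    ]
    print(methylation_levels(example))
```

The methylation level here is the simple ratio of methylated to total reads; whether the study applied any further correction is not stated in this abstract.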
Results: Analysis of the influence of DNA variation on methylation calling revealed that WGS datasets were less effective at covering CpG-rich regions, and demonstrated the usefulness of a targeted control library for SNP detection. Analysis of over 500,000 CpG sites demonstrated genome-wide methylation patterns similar to those observed in humans, including reduced methylation within CpG islands and at transcription start sites (TSS), X chromosome inactivation, and anticorrelation of TSS CpG methylation with gene expression. In addition, TSS CpG density was positively correlated with expression, whereas TSS TpG density was negatively correlated with expression. Low but non-random non-CpG methylation (<1%) was also detected in all non-neuronal somatic tissues, and tissues clustered differently depending on whether CpG or non-CpG methylation patterns were used. Finally, allele-specific expression analysis revealed enrichment of genes involved in metabolic and regulatory processes.
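As a rough illustration of the density measures mentioned above, the following Python sketch counts CpG and TpG dinucleotides in a sequence window. The definition used here (occurrences on the given strand divided by window length) is an assumption, since the abstract does not state how density was computed, and `dinucleotide_density` is a hypothetical helper name.

```python
def dinucleotide_density(seq, dinuc):
    """Occurrences of `dinuc` per base of `seq`, counted on the given strand.

    Assumed definition: number of start positions where the dinucleotide
    occurs, divided by the window length. The reverse strand is ignored.
    """
    seq = seq.upper()
    dinuc = dinuc.upper()
    if len(dinuc) != 2:
        raise ValueError("dinuc must be exactly two bases")
    if len(seq) < 2:
        return 0.0
    count = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)
    return count / len(seq)


if __name__ == "__main__":
    window = "ACGTCGTGCG"  # toy window, not real pig sequence
    print("CpG density:", dinucleotide_density(window, "CG"))
    print("TpG density:", dinucleotide_density(window, "TG"))
```

Computing both values for a window around each TSS and comparing them with expression estimates would give the kind of correlation reported above; note that a CpG-to-TpG change on one strand appears as CpA on the opposite strand, which this single-strand sketch does not count.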
Discussion: These results provide transcriptional and DNA methylation datasets for the biomedical community that are directly relatable to current genomic resources. In addition, the correlation between TSS CpG density and expression suggests that increased mutation rates at CpG sites play a significant role in adaptive evolution: CpG density at TSS is reduced over time, which leads to higher methylation levels in these regions and more permanent reductions in gene expression. This is proposed to occur predominantly through deamination of 5-methylcytosine to thymine, which replaces CpG with TpG sites in these regions. The increased TSS TpG density observed in non-expressed genes, and the resulting negative correlation between expression and TSS TpG density, are consistent with this mechanism.
Conclusions: This study provides baseline methylation and gene transcription profiles for a healthy adult pig, reports patterns similar to those observed in humans, and supports future porcine studies related to human disease and development. Additionally, the reduced CpG density and increased TpG density observed at the TSS of lowly expressed genes suggest that DNA methylation plays a significant role in adaptive evolution through more permanent reductions in gene expression.