The objective of this study was to use comparative and functional genomic analyses to help understand the biological mechanisms mediating the effects of single nucleotide polymorphisms (SNPs) on blood pressure. We mapped 26 585 SNPs that are in linkage disequilibrium with 1071 human blood pressure-associated sentinel SNPs to 9447 syntenic regions in the mouse genome. Approximately 21.8% of the 1071 linkage disequilibrium regions are located at least 10 kb from any protein-coding gene. Approximately 300 blood pressure-associated SNPs are expression quantitative trait loci for a few dozen known blood pressure physiology genes in tissues including specific kidney regions. Blood pressure-associated sentinel SNPs are significantly enriched for expression quantitative trait loci for blood pressure physiology genes compared with randomly selected SNPs (P<0.00023, Fisher exact test). Using a newly developed deep learning method and other methods, we identified SNPs that were predicted to influence the conservation of CTCF (CCCTC-binding factor) binding across cell types, transcription factor binding, mRNA splicing, or secondary structures of RNA, including long noncoding RNA. The SNPs were more likely to be located in CTCF-binding regions than would be expected from the whole genome (P=4.90×10⁻⁷, Pearson χ² test). One example, the synonymous SNP rs9337951, was predicted to influence the secondary structure of its host mRNA, JCAD (junctional cadherin 5 associated), and was experimentally validated to influence JCAD protein expression. These findings provide an extensive comparative and functional genomic resource for developing experiments to test the functional significance of human blood pressure-associated SNPs in human cells and animal models.
Keywords: blood pressure; deep learning; genomics; linkage disequilibrium; single nucleotide polymorphism.