To evaluate whether rare regulatory variants in the vicinity of promoters are likely to impact gene expression, we conducted a novel burden test for enrichment of rare variants at the extremes of expression. After sequencing 2-kb promoter regions of 472 genes in 410 healthy adults, we performed a quadratic regression of rare variant count on bins of peripheral blood transcript abundance from microarrays, summing over ranks of all genes. After adjusting for common eQTLs and the major axes of gene expression covariance, we observed a highly significant excess of variants with minor allele frequency less than 0.05 at both the high and low extremes of expression across individuals. Further enrichment was seen in sites annotated as potentially regulatory by RegulomeDB, but a deficit of effects was associated with known metabolic disease genes. The main result replicates in an independent sample of 75 individuals with RNA-seq and whole-genome sequence information. Three of four predicted large-effect sites were validated by CRISPR/Cas9 knockdown in K562 cells, but simulations indicate that effect sizes need not be unusually large to produce the observed burden. Unusually divergent low-frequency promoter haplotypes were observed at 31 loci, at least 9 of which appear to be derived from Neandertal admixture, but these were not associated with divergent gene expression in blood. The overall burden test results are consistent with rare and private regulatory variants driving high or low transcription at specific loci, potentially contributing to disease.
Copyright © 2016 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
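
The quadratic regression of rare variant count on expression-rank bins described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of 10 bins, and the use of an unweighted least-squares fit are assumptions made for illustration, and the sketch omits the adjustment for common eQTLs and expression covariance as well as any significance testing. Under these assumptions, a positive quadratic coefficient corresponds to an excess of rare variants at both the high and low extremes of expression.

```python
import numpy as np


def bin_by_rank(expression, n_bins):
    """Assign each individual to an expression-rank bin, separately per gene.

    expression: array of shape (n_genes, n_individuals).
    Returns an integer array of the same shape with values in [0, n_bins).
    """
    # A double argsort converts values to 0-based ranks within each gene (row).
    ranks = np.argsort(np.argsort(expression, axis=1), axis=1)
    return (ranks * n_bins) // expression.shape[1]


def burden_quadratic_fit(expression, rare_counts, n_bins=10):
    """Sum rare variant counts per rank bin over all genes, then fit a quadratic.

    rare_counts: array of shape (n_genes, n_individuals) holding the number of
    rare variants each individual carries near each gene (hypothetical input).
    Returns (totals, quad, lin, const).
    """
    bins = bin_by_rank(expression, n_bins)
    # Pool counts across genes: one total per expression-rank bin.
    totals = np.bincount(bins.ravel(), weights=rare_counts.ravel(),
                         minlength=n_bins)
    # Centre the bin index so the quadratic term measures curvature
    # about the middle of the expression distribution.
    x = np.arange(n_bins) - (n_bins - 1) / 2.0
    # np.polyfit returns coefficients from highest degree to lowest.
    quad, lin, const = np.polyfit(x, totals, deg=2)
    return totals, quad, lin, const
```

Called with an expression matrix and a matching matrix of per-individual rare variant counts, `burden_quadratic_fit` returns the pooled counts per bin together with the fitted coefficients. Centring the bin index is a design choice that keeps the linear term (a one-sided shift) separate from the quadratic term (enrichment at both extremes).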