Background: Short structural variants (SSVs), including insertions/deletions (indels), are common in the human genome and impact disease risk. The role of SSVs in late-onset Alzheimer's disease (LOAD) has been understudied. In this study, we developed a bioinformatics pipeline to catalogue SSVs within LOAD genome-wide association study (GWAS) regions and to prioritize regulatory SSVs based on the strength of their predicted effect on transcription factor (TF) binding sites.
Methods: The pipeline used publicly available functional genomics data sources, including candidate cis-regulatory elements (cCREs) from ENCODE and single-nucleus RNA-seq (snRNA-seq) data from LOAD patient samples.
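As an illustration only, the prioritization idea can be sketched as follows: keep variants that fall inside a cCRE interval, then rank them by how much the alternate allele changes the best motif score relative to the reference allele. This is a minimal sketch and not the pipeline described in this study; the match/mismatch scoring matrix, the toy motif, and all function names (`make_pwm`, `best_score`, `overlaps`, `score_variant`, `prioritize`) are hypothetical, and a real analysis would use curated TF motifs and scan both strands.

```python
# Toy log-odds-like matrix built from a consensus string (assumption: a real
# analysis would load curated TF motifs rather than a match/mismatch matrix).
def make_pwm(consensus, match=1.0, mismatch=-1.0):
    return [{b: (match if b == c else mismatch) for b in "ACGT"} for c in consensus]

def best_score(seq, pwm):
    """Highest motif score over all windows of seq (forward strand only)."""
    w = len(pwm)
    if len(seq) < w:
        return float("-inf")
    return max(
        sum(pwm[i][seq[s + i]] for i in range(w))
        for s in range(len(seq) - w + 1)
    )

def overlaps(start, end, regions):
    """True if half-open interval [start, end) overlaps any (r_start, r_end)."""
    return any(start < r_end and end > r_start for r_start, r_end in regions)

def score_variant(ref_seq, offset, ref, alt, pwm):
    """Alt-minus-ref change in best motif score; negative means weaker match."""
    alt_seq = ref_seq[:offset] + alt + ref_seq[offset + len(ref):]
    return best_score(alt_seq, pwm) - best_score(ref_seq, pwm)

def prioritize(variants, ccres, pwm):
    """Keep variants inside a cCRE and sort by absolute score change."""
    scored = []
    for v in variants:
        start = v["window_start"] + v["offset"]
        if not overlaps(start, start + max(len(v["ref"]), 1), ccres):
            continue
        delta = score_variant(v["seq"], v["offset"], v["ref"], v["alt"], pwm)
        scored.append((v["id"], delta))
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical inputs: one 2-bp deletion inside a cCRE, one insertion outside it.
pwm = make_pwm("TGTGGT")
ccres = [(1000, 1020)]
variants = [
    {"id": "del1", "window_start": 1000, "seq": "AACCTGTGGTAACC",
     "offset": 7, "ref": "GG", "alt": ""},
    {"id": "ins1", "window_start": 5000, "seq": "AACCTGTGGTAACC",
     "offset": 2, "ref": "", "alt": "A"},
]
ranked = prioritize(variants, ccres, pwm)
```

Here the deletion removes two bases from the toy motif match, so it receives a negative score change, while the second variant is dropped because it lies outside the hypothetical cCRE interval.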
Results: We catalogued 1581 SSVs in cCREs in LOAD GWAS regions that disrupted 737 TF binding sites. These included SSVs that disrupted the binding of RUNX3, SPI1, and SMAD3 within the APOE-TOMM40, SPI1, and MS4A6A LOAD regions.
Conclusions: The pipeline developed here prioritized non-coding SSVs in cCREs and characterized their putative effects on TF binding. The approach integrates multiomics datasets to inform validation experiments using disease models.
© 2023 the Alzheimer's Association.