Alzheimer disease (AD) is a complex and progressive neurodegenerative disorder that accounts for the majority of dementia cases. Here, we aim to identify causal plasma proteins for AD and thereby shed light on its etiology. We utilized the latest large-scale plasma proteomic data from the UK Biobank Pharma Proteomics Project (UKB-PPP) and AD genome-wide association study (GWAS) summary data from the International Genomics of Alzheimer's Project (IGAP). Via a robust univariate instrumental variable (IV) regression method, we identified causal proteins using either cis-protein quantitative trait loci (cis-pQTLs) alone or both cis- and trans-pQTLs as instruments. To further reduce potential false positives due to high linkage disequilibrium (LD) among some pQTLs and high correlations among some proteins, we developed a robust multivariate IV regression method, called two-stage constrained maximum likelihood (MV-2ScML), to distinguish direct effects of proteins from confounding or mediating effects; key features of the method include its robustness to invalid IVs and its applicability to GWAS summary data. Our work highlights some differences between using cis-pQTLs and trans-pQTLs, as well as the critical value of multivariate analysis for fine-mapping causal proteins, providing insights into plasma protein pathways to AD.
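To illustrate why a multivariate IV analysis can separate direct from indirect protein effects, the following is a minimal sketch using plain two-stage least squares (2SLS) on simulated individual-level data. It is not MV-2ScML: it assumes all instruments are valid, does not use GWAS summary data, and every variable name, effect size, and sample size below is a hypothetical choice made for illustration only. In the simulation, only the first exposure affects the outcome directly, while the second exposure is correlated with the first.

```python
import numpy as np

def two_stage_least_squares(G, X, y):
    """Plain 2SLS (assumes valid instruments; not the robust MV-2ScML method)."""
    G1 = np.column_stack([np.ones(len(G)), G])
    # Stage 1: regress each exposure on the instruments and keep fitted values.
    stage1_coef, *_ = np.linalg.lstsq(G1, X, rcond=None)
    X_hat = G1 @ stage1_coef
    # Stage 2: regress the outcome on the fitted exposures (plus intercept).
    D = np.column_stack([np.ones(len(G)), X_hat])
    stage2_coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    return stage2_coef[1:]  # drop the intercept

# Hypothetical simulation: 10 independent SNPs, two correlated "proteins".
rng = np.random.default_rng(0)
n, p = 50_000, 10
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
U = rng.normal(size=n)                        # unmeasured confounder
a1 = np.r_[np.full(5, 0.4), np.zeros(5)]      # SNP effects on protein 1
a2 = np.r_[np.zeros(5), np.full(5, 0.4)]      # SNP effects on protein 2
X1 = G @ a1 + U + rng.normal(size=n)
X2 = 0.8 * X1 + G @ a2 + U + rng.normal(size=n)   # protein 2 depends on protein 1
y = 0.5 * X1 + U + rng.normal(size=n)             # only protein 1 acts directly

# Univariate analysis of protein 2 alone picks up the effect routed via protein 1.
uni_x2 = two_stage_least_squares(G, X2, y)[0]
# Multivariate analysis estimates both direct effects jointly.
mv = two_stage_least_squares(G, np.column_stack([X1, X2]), y)

print("univariate estimate for protein 2:", uni_x2)
print("multivariate estimates (protein 1, protein 2):", mv)
```

Under these assumptions, the univariate estimate for the second protein is clearly nonzero even though it has no direct effect, whereas the multivariate estimates recover a direct effect only for the first protein.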
Keywords: 2SLS; 2ScML; IV; constrained maximum likelihood; instrumental variable; pleiotropy.
Copyright © 2024 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.