RápidoPGS: a rapid polygenic score calculator for summary GWAS data without a test dataset

Guillermo Reales; Elena Vigorito; Martin Kelemen; Chris Wallace

doi:10.1093/bioinformatics/btab456

Bioinformatics. 2021 Dec 7;37(23):4444-4450. doi: 10.1093/bioinformatics/btab456.

Authors

Guillermo Reales¹ ², Elena Vigorito³, Martin Kelemen¹ ⁴, Chris Wallace¹ ² ³

Affiliations

¹ Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Cambridge CB2 0AW, UK.
² Department of Medicine, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge CB2 2QQ, UK.
³ MRC Biostatistics Unit, University of Cambridge, School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge CB2 0SR, UK.
⁴ Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1RQ, UK.

Abstract

Motivation: Polygenic scores (PGS) aim to genetically predict complex traits at an individual level. PGS are typically trained on genome-wide association study (GWAS) summary statistics and require an independent test dataset to tune parameters. More recent methods allow parameters to be tuned on the training data, removing the need for independent test data, but these approaches are computationally intensive. Based on fine-mapping principles, we present RápidoPGS, a flexible and fast method to compute PGS that requires only summary-level GWAS datasets, has low computational requirements and needs no test data for parameter tuning.
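
As an illustration of what "fine-mapping principles" can mean for PGS weights, the following is a minimal sketch and not the package's implementation. It assumes Wakefield-style approximate Bayes factors, at most one causal variant per LD block, and a weight equal to the GWAS effect size multiplied by the posterior probability of causality. The function names, the input layout, and the values of `prior_sd` and `pi` are hypothetical placeholders chosen for the example.

```python
import math
from collections import defaultdict


def wakefield_log_abf(beta, se, prior_sd):
    """Log approximate Bayes factor (alternative vs. null) for one SNP."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)  # shrinkage factor
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def posterior_weights(snps, prior_sd=0.2, pi=1e-4):
    """Return {snp_id: beta * posterior probability of causality}.

    snps: list of dicts with keys "id", "block", "beta", "se".
    Assumption: each LD block holds at most one causal SNP, each SNP has
    prior probability `pi` of being causal, and pi * block_size < 1.
    """
    blocks = defaultdict(list)
    for snp in snps:
        blocks[snp["block"]].append(snp)

    weights = {}
    for members in blocks.values():
        # Unnormalised log posterior for "SNP i is causal" and for the null.
        log_terms = [
            math.log(pi) + wakefield_log_abf(s["beta"], s["se"], prior_sd)
            for s in members
        ]
        log_null = math.log(1.0 - pi * len(members))
        shift = max(log_terms + [log_null])  # for numerical stability
        denom = math.exp(log_null - shift) + sum(
            math.exp(t - shift) for t in log_terms
        )
        for s, t in zip(members, log_terms):
            pp = math.exp(t - shift) / denom
            weights[s["id"]] = s["beta"] * pp
    return weights


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2) for one individual."""
    return sum(weights[snp] * d for snp, d in dosages.items() if snp in weights)


# Toy summary statistics (made up for illustration only).
toy_snps = [
    {"id": "rs1", "block": 1, "beta": 0.30, "se": 0.03},
    {"id": "rs2", "block": 1, "beta": 0.01, "se": 0.03},
    {"id": "rs3", "block": 2, "beta": -0.02, "se": 0.03},
]
toy_weights = posterior_weights(toy_snps)
toy_score = polygenic_score({"rs1": 2, "rs2": 1, "rs3": 0}, toy_weights)
```

In this toy example the strongly associated SNP keeps almost all of its effect size as its weight, while weakly associated SNPs are shrunk towards zero. Because every quantity comes from the summary statistics and fixed priors, no test dataset is needed to tune anything.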

Results: We show that RápidoPGS performs slightly less well than two out of three other widely used PGS methods (LDpred2, PRScs and SBayesR) for case-control datasets, with median r² differences of -0.0092, -0.0042 and 0.0064, respectively, but is up to 17 000-fold faster with reduced computational requirements. RápidoPGS is implemented in R and can work with user-supplied summary statistics or download them from the GWAS Catalog.

Availability and implementation: Our method is available with a GPL license as an R package from CRAN and GitHub.

Supplementary information: Supplementary data are available at Bioinformatics online.
