Population-aware permutation-based significance thresholds for genome-wide association studies

Bioinform Adv. 2024 Oct 28;4(1):vbae168. doi: 10.1093/bioadv/vbae168. eCollection 2024.

Abstract

Motivation: Permutation-based significance thresholds have been shown to be a robust alternative to classical Bonferroni significance thresholds in genome-wide association studies (GWAS) for skewed phenotype distributions. The recently published method permGWAS introduced a batch-wise approach to efficiently compute permutation-based GWAS. However, running multiple univariate tests in parallel leads to many repetitive computations and increased computational resources. More importantly, traditional permutation methods that permute only the phenotype break the underlying population structure.

Results: We propose permGWAS2, an improved method that does not break the population structure during permutations and uses an elegant block matrix decomposition to optimize computations, thereby reducing redundancies. We show on synthetic data that this improved approach yields a lower false discovery rate for skewed phenotype distributions compared to the previous version and the commonly used Bonferroni correction. In addition, we re-analyze a dataset covering phenotypic variation in 86 traits in a population of 615 wild sunflowers (Helianthus annuus L.). This led to the identification of dozens of novel associations with putatively adaptive traits, and removed several likely false-positive associations with limited biological support.

Availability and implementation: permGWAS2 is open-source and publicly available on GitHub for download: https://github.com/grimmlab/permGWAS.