Using Family History Data to Improve the Power of Association Studies: Application to Cancer in UK Biobank

Genet Epidemiol. 2025 Jan;49(1):e22609. doi: 10.1002/gepi.22609.

Abstract

In large cohort studies, unaffected individuals outnumber affected individuals, and the power to detect associations can be low for outcomes with low prevalence. We consider how including recorded family history in regression models increases the power to detect associations between genetic variants and disease risk. We show, both theoretically and using Monte Carlo simulations, that including a family history of the disease, with a weighting of 0.5 compared with true cases, increases the power to detect associations. This is a powerful approach for detecting variants with moderate effects, but for larger effect sizes a weighting greater than 0.5 can be more powerful. We illustrate this both for common variants and for exome sequencing data from over 400,000 individuals in UK Biobank, where we evaluate the association between the burden of protein-truncating variants in genes and the risk of four cancer types.
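
The weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the regression model or how cases who also report a family history are coded, so the following are assumptions made purely for illustration: a simple linear regression of the weighted phenotype on genotype dosage, cases coded as 1 regardless of family history, and the hypothetical helper names `weighted_phenotype` and `slope_z`.

```python
import numpy as np


def weighted_phenotype(case, family_history, weight=0.5):
    """Code cases as 1, unaffected individuals with a family history as
    `weight`, and all other individuals as 0.

    Assumption: a case keeps the value 1 whether or not a family history
    is also recorded; the abstract does not state this.
    """
    case = np.asarray(case, dtype=bool)
    family_history = np.asarray(family_history, dtype=bool)
    return np.where(case, 1.0, np.where(family_history, weight, 0.0))


def slope_z(genotype, phenotype):
    """Z statistic for the slope in a simple linear regression of
    phenotype on genotype (an illustrative stand-in for the paper's
    regression models, which the abstract does not detail)."""
    g = np.asarray(genotype, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    g_centered = g - g.mean()
    y_centered = y - y.mean()
    sum_sq = g_centered @ g_centered
    beta = (g_centered @ y_centered) / sum_sq
    residuals = y_centered - beta * g_centered
    # Residual variance with n - 2 degrees of freedom (intercept + slope).
    sigma2 = (residuals @ residuals) / (len(y) - 2)
    return beta / np.sqrt(sigma2 / sum_sq)


# Toy data, invented for illustration only.
case = [1, 0, 0, 1, 0, 0]
family_history = [0, 1, 0, 1, 0, 1]
genotype = [2, 1, 0, 2, 0, 1]

y = weighted_phenotype(case, family_history, weight=0.5)
z = slope_z(genotype, y)
```

Raising `weight` above 0.5 in this sketch corresponds to the heavier weighting that the abstract reports can be more powerful for larger effect sizes.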

Keywords: UK Biobank; cancer; exome sequencing; family history; power calculations; protein‐truncating variants.

MeSH terms

  • Biological Specimen Banks*
  • Exome Sequencing
  • Genetic Predisposition to Disease*
  • Genetic Variation
  • Humans
  • Monte Carlo Method
  • Neoplasms* / epidemiology
  • Neoplasms* / genetics
  • UK Biobank
  • United Kingdom / epidemiology