Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes

Nat Genet. 2023 Jul;55(7):1088-1090. doi: 10.1038/s41588-023-01438-3. Epub 2023 Jun 29.

Abstract

The release of 150,119 UK Biobank sequences represents an unprecedented opportunity as a reference panel to impute low-coverage whole-genome sequencing data with high accuracy, but current methods cannot cope with the size of the data. Here we introduce GLIMPSE2, a low-coverage whole-genome sequencing imputation method that scales sublinearly in both the number of samples and markers, achieving efficient whole-genome imputation from the UK Biobank reference panel while retaining high accuracy for ancient and modern genomes, particularly at rare variants and for very low-coverage samples.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Specimen Banks*
  • Gene Frequency
  • Genome
  • Genotype
  • Polymorphism, Single Nucleotide* / genetics
  • United Kingdom