Highly accurate Korean draft genomes reveal structural variation highlighting human telomere evolution

Nucleic Acids Res. 2025 Jan 7;53(1):gkae1294. doi: 10.1093/nar/gkae1294.

Abstract

Understanding human genomic evolution remains challenging because of highly repetitive genomic regions such as subtelomeric regions. Recently, long-read sequencing technology has facilitated the identification of complex genetic variants, including structural variants (SVs), at single-nucleotide resolution. Here, we resolved SVs and their underlying DNA damage-repair mechanisms in subtelomeric regions, which are among the most uncharted genomic regions. We generated ~20× high-fidelity long-read sequencing data from three Korean individuals, together with their partially phased high-quality de novo genome assemblies (contig N50: 6.3–58.2 Mb). We identified 131 138 deletion and 121 461 insertion SVs, 41.6% of which were prevalent in the East Asian population. The commonality of the identified SVs in the Korean population was examined using short-read sequencing data from 103 Korean individuals, providing the first comprehensive SV set representing the population based on long-read assemblies. Manual investigation of 19 large subtelomeric SVs (≥5 kb) and their associated repair signatures revealed the potential repair mechanisms leading to the formation of these SVs. Our study provides mechanistic insight into human telomere evolution and can facilitate our understanding of human SV formation.
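
For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The function name `contig_n50` and the toy lengths are illustrative assumptions; they are not taken from the study or from its assembly pipeline.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Illustrative helper (hypothetical name, not from the paper):
    contigs are sorted from longest to shortest and accumulated until
    the running sum reaches at least half of the total length; the
    length of the contig that crosses this threshold is the N50.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("at least one contig length is required")
    if any(length <= 0 for length in lengths):
        raise ValueError("contig lengths must be positive")

    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Toy example with made-up contig lengths (not data from the study).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(contig_n50(toy_lengths))
```

In this toy case the total length is 54, and the three longest contigs (10, 9, 8) sum to 27, which is exactly half, so the N50 is 8. A higher N50 indicates that more of the assembly sits in long contigs, which is why it is commonly reported as a contiguity measure.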

MeSH terms

  • Asian People / genetics
  • DNA Repair / genetics
  • Evolution, Molecular*
  • Genome, Human* / genetics
  • Genomic Structural Variation
  • Genomics / methods
  • Humans
  • Republic of Korea
  • Telomere* / genetics