An overview of autosomal STRs and identity SNPs in a Norwegian population using massively parallel sequencing

Forensic Sci Int Genet. 2024 Jul:71:103057. doi: 10.1016/j.fsigen.2024.103057. Epub 2024 May 3.

Abstract

In recent years, probabilistic genotyping software has been adapted for the analysis of massively parallel sequencing (MPS) forensic data. Likelihood ratios (LRs) are based on allele frequencies selected from populations of interest. This study provides an outline of sequence-based (SB) allele frequencies for autosomal short tandem repeats (aSTRs) and identity single nucleotide polymorphisms (iSNPs) in 371 individuals from Southern Norway. In total, 27 aSTRs and 94 iSNPs were previously analysed with the ForenSeq™ DNA Signature Prep Kit (Verogen). The number of alleles with frequencies below 0.05 was 4.6 times higher for SB alleles than for length-based (LB) alleles. Consistent with previous studies, SB data (both with and without flanks) exhibited higher allele diversity than LB data; random match probabilities were lower for SB alleles, confirming their greater power to discriminate between individuals. Two alleles in markers D22S1045 and Penta D were observed with SNPs in the 3′ flanking region, which have not been reported before. In addition, a novel SNP with a minor allele frequency (MAF) of 0.001 was found in marker TH01. The impact of sample size on MAF values was studied in 88 iSNPs from Southern Norway (n = 371), and the findings were compared to a larger Norwegian population dataset (n = 15,769). The smaller Southern Norway dataset gave similar results, indicating that it was a representative sample. Population structure was analysed for regions within Southern Norway; FST estimates for aSTRs and iSNPs did not indicate any genetic structure. Finally, we investigated the genetic differences between Southern Norway and two other populations: Northern Norway and Denmark. Allele frequencies were compared between these populations, and no significant frequency differences were found (p-values > 0.0001).
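The effect of sequence-based typing on the random match probability can be illustrated with a minimal sketch. It assumes Hardy-Weinberg proportions and a single locus, and all allele names and frequencies below are hypothetical values chosen for illustration, not data from this study. One length-based allele ("10") is split into two sequence variants ("10a", "10b"), and the single-locus random match probability is computed as the sum of squared genotype frequencies.

```python
from itertools import combinations_with_replacement


def random_match_probability(freqs):
    """Single-locus random match probability under Hardy-Weinberg
    proportions: the sum of squared expected genotype frequencies."""
    rmp = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        # Homozygote: p^2, heterozygote: 2pq
        genotype = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
        rmp += genotype ** 2
    return rmp


# Hypothetical length-based (LB) frequencies for one STR locus
lb_freqs = {"10": 0.5, "11": 0.3, "12": 0.2}

# Hypothetical sequence-based (SB) frequencies: LB allele "10" resolves
# into two sequence variants with the same total frequency
sb_freqs = {"10a": 0.3, "10b": 0.2, "11": 0.3, "12": 0.2}

rmp_lb = random_match_probability(lb_freqs)
rmp_sb = random_match_probability(sb_freqs)

print(f"LB random match probability: {rmp_lb:.4f}")
print(f"SB random match probability: {rmp_sb:.4f}")
```

In this toy case the sequence-based value is lower than the length-based one, because splitting an allele spreads the same total frequency over more genotypes. A multi-locus value would be the product of the per-locus values, assuming independent loci.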
We also calculated pairwise FST values per marker. Comparisons between Southern and Northern Norway showed small differences, whereas comparisons between Southern Norway and Denmark showed higher FST values for some markers, possibly driven by distinct alleles that were present in only one of the populations. In summary, we propose that allele frequencies from each population considered in this study could be used interchangeably to calculate genotype probabilities.
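How an allele present in only one population can raise a per-marker pairwise FST value can be sketched as follows. The abstract does not state which FST estimator was used, so this example assumes a simple heterozygosity-based form, FST = (HT - HS) / HT, with equal weighting of the two populations and no sample-size correction. The function names and all frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def pairwise_fst(pop1, pop2):
    """Heterozygosity-based FST for one marker and two populations.

    Assumption: equal population weights and no sample-size correction;
    this is an illustrative estimator, not necessarily the one used in
    the study.
    """
    alleles = set(pop1) | set(pop2)
    # Mean allele frequencies across the two populations
    pooled = {a: (pop1.get(a, 0.0) + pop2.get(a, 0.0)) / 2 for a in alleles}
    h_total = expected_heterozygosity(pooled)
    h_within = (expected_heterozygosity(pop1) + expected_heterozygosity(pop2)) / 2
    return (h_total - h_within) / h_total if h_total > 0 else 0.0


# Hypothetical allele frequencies at one marker
pop_a = {"A": 0.6, "B": 0.4}
pop_b = {"A": 0.6, "B": 0.4}            # same frequencies as pop_a
pop_c = {"A": 0.5, "B": 0.4, "C": 0.1}  # allele "C" seen only here

fst_same = pairwise_fst(pop_a, pop_b)
fst_private = pairwise_fst(pop_a, pop_c)

print(f"FST, identical frequencies: {fst_same:.4f}")
print(f"FST, with a private allele: {fst_private:.4f}")
```

With identical frequencies the estimate is zero, while the population carrying a private allele gives a small positive value for that marker even though the shared alleles differ only slightly.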

Keywords: Allele frequencies; Autosomal STRs; Identity SNPs; Massively parallel sequencing; Norwegian population; Population genetics.

MeSH terms

  • DNA Fingerprinting*
  • Gene Frequency*
  • Genetics, Population*
  • Genotype
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Likelihood Functions
  • Microsatellite Repeats*
  • Norway
  • Polymorphism, Single Nucleotide*
  • Sequence Analysis, DNA