DNA mixtures containing semen and vaginal fluid are common biological samples in forensic analysis, yet their analysis remains challenging. To address this challenge, this study proposes combining semen-specific CpG sites with closely related microhaplotype sites to form a new composite genetic marker, the semen-specific methylation-microhaplotype. Six methylation-microhaplotype loci were selected and, to further improve discrimination power, five methylation-SNP loci were also included. The methylation levels and genotypes of the selected loci were obtained using massively parallel sequencing. All loci except MMH04ZHA019 and MMH17ZHA059 (nine of the eleven) were successfully sequenced, and the successfully sequenced loci performed well in identifying both individuals and body fluids. An allele categorization model was developed using the K-nearest neighbour algorithm and then used to predict allele types in semen-vaginal fluid mixtures. The loci confirmed the presence of semen and linked the semen to its true donor in semen-vaginal fluid mixtures at mixing ratios of 10:1, 9:1, 5:1, 4:1, 1:1, 1:3, 1:4, 1:8 and 1:9 (semen:vaginal fluid). This preliminary study suggests that the new composite genetic marker has great potential as a supplementary tool to commonly used genetic markers (STR, etc.) for analysing semen-vaginal fluid mixtures.
Keywords: DNA methylation; K-nearest neighbour algorithm; forensic genetics; massively parallel sequencing; microhaplotype; semen–vaginal fluid mixtures.
© 2025 The Author(s).