Sudan, a northeastern African country, is characterized by high levels of cultural, linguistic, and genetic diversity, which is believed to be affected by continuous migration from neighboring countries. Consistent with such a demographic effect, genome-wide SNP data revealed a shared ancestral component among Sudanese Afro-Asiatic-speaking groups and non-African populations, mainly from West Asia. Although this component is shared among all Afro-Asiatic-speaking groups, the extent of this sharing in Semitic groups, such as Sudanese Arabs, is still unknown. Using genotypes of six polymorphic human leukocyte antigen (HLA) genes (i.e., HLA-A, -C, -B, -DRB1, -DQB1, and -DPB1), we examined the genetic structure of eight East African ethnic groups with origins in Sudan, South Sudan, and Ethiopia. We identified informative HLA alleles using principal component analysis, which revealed that the two Semitic groups (Gaalien and Shokrya) formed a cluster distinct from the other Afro-Asiatic-speaking groups in this study. The HLA alleles that distinguished Semitic Arabs co-occur on the same extended HLA haplotype and are in strong linkage disequilibrium. Interestingly, we found the four-locus haplotype "C*12:02-B*52:01-DRB1*15:02-DQB1*06:01" exclusively in non-African populations, and it is widely distributed across Asia. The identification of this haplotype suggests gene flow from Asia, and these haplotypes were likely brought to Africa through back migration from the Near East. These findings will be of interest to biomedical and anthropological studies that examine the demographic history of northeast Africa.
© 2021. The Author(s), under exclusive licence to European Society of Human Genetics.