Four Shigella species (S. flexneri, S. sonnei, S. dysenteriae, and S. boydii) have been reported, but Shigella sp. PAMC28760, an Antarctic isolate, is the only one whose complete genome is deposited in the NCBI database as an uncharacterized isolate. Because Antarctica is the world's driest, windiest, and coldest continent, it provides an unfavourable environment for microorganisms. The main objective of this study was to predict the prevalence of two different trehalase genes (treF and treA) in the complete genome of Shigella sp. PAMC28760 and in other complete genomes of Shigella species. We analysed the genome sequences of the four Shigella species and of our uncharacterized Antarctic isolate, Shigella sp. PAMC28760, with the offline version of the MP3 program to predict whether the trehalase-encoding genes are pathogenic or non-pathogenic. Additionally, we used the RAST and Prokka (offline version) annotation programs to locate the periplasmic (treA) and cytoplasmic (treF) trehalase genes in the studied genomes. Our results showed that only 56 of 134 Shigella strains carried both trehalase genes (treF and treA), and that the treF gene tends to be prevalent in Shigella species. Both treA and treF were present in our strain, Shigella sp. PAMC28760. To date, this is the first study to show that two types of trehalase genes are found in Shigella species, which could offer insight into how the bacteria use accessible carbohydrates such as glucose produced by the trehalose degradation pathway, and into the importance of periplasmic trehalase in bacterial virulence.
Keywords: HMM; MP3; SVM; Shigella sp.; Prokka; trehalase.