Ribotyping Staphylococcus epidermidis Using Probabilistic Sequence Analysis and Levenshtein Distance Algorithm

Curr Microbiol. 2025 Jan 10;82(2):78. doi: 10.1007/s00284-024-04057-1.

Abstract

Staphylococcus epidermidis (S. epidermidis) lives at different human body sites and in natural environments. To ribotype S. epidermidis sub-species, 2507 PCR-amplified reads of S. epidermidis 16S rRNA genes from a public dataset were subjected to probabilistic sequence analysis. A sequence probability logo (sequence pLogo) was constructed as a reference sequence for the 16S rRNA genes of S. epidermidis. Using the Levenshtein distance algorithm, two 20-base-pair (bp) motifs commonly present in the 2507 PCR-amplified reads were identified. The top 38 S. epidermidis isolates, which carried 16S rRNA nucleotide domains with differing sequences but high similarity scores to the two 20-bp motifs, came from 11 human, 8 animal, 9 plant and 10 environmental samples, indicating that the two 20-bp motifs are broadly present in diverse S. epidermidis isolates. Thirty-one PCR-amplified reads of 16S rRNA genes that were not in the dataset were used to verify the feasibility of using the two 20-bp motifs for ribotyping S. epidermidis sub-species. S. epidermidis isolates S1 and S3, but not S2, from the human scalp carried a 20-bp sequence domain highly similar to a 20-bp motif in the sequence pLogo. The phylogenetic tree showed that S. epidermidis S1, S2 and S3 did not derive from a single common ancestor. The two 20-bp motifs newly identified here thus provide reference nucleotide residues for ribotyping S. epidermidis.
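
As an illustration of the kind of comparison described above, the following minimal Python sketch scores every 20-bp window of a read against a motif using the Levenshtein distance. It is not the authors' implementation: the abstract does not give the motif sequences, the scanning strategy, or the scoring formula, so the motif and read below are hypothetical placeholders, the sliding-window scan is an assumption, and the similarity score (1 minus distance divided by motif length) is one common normalization chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def best_window(read: str, motif: str):
    """Scan all windows of len(motif) in read and return
    (start, distance, similarity) for the first closest window.
    similarity = 1 - distance / len(motif) is an assumed normalization."""
    k = len(motif)
    if len(read) < k:
        raise ValueError("read is shorter than the motif")
    best = None
    for start in range(len(read) - k + 1):
        dist = levenshtein(read[start:start + k], motif)
        if best is None or dist < best[1]:
            best = (start, dist, 1 - dist / k)
    return best


if __name__ == "__main__":
    # Hypothetical 20-bp motif and read; NOT sequences from the study.
    motif = "GATTACAGGCTTAACCGTGA"
    read = "CCCC" + "GATTACAGGCATAACCGTGA" + "TTTT"  # one substitution vs. motif
    start, dist, sim = best_window(read, motif)
    print(f"best window starts at {start}, distance {dist}, similarity {sim:.2f}")
```

Under these assumptions, a read would be considered to carry a motif-like domain when its best window exceeds a chosen similarity cutoff; the cutoff itself is not stated in the abstract and would have to be set by the user.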