A 2648-bp fragment from the P4 plasmid of Shigella sonnei strain 47 coding for the SsoII restriction endonuclease (ENase) and methyltransferase (MTase) (recognition sequence 5'-CCNGG) was sequenced. Two divergently arranged open reading frames were identified: 905 bp for the SsoII ENase (R.SsoII) and 1137 bp for the MTase (M.SsoII). The coding regions are separated by 110 bp. The calculated M(r) values of R.SsoII (35937) and M.SsoII (42887) are in good agreement with the values previously obtained by in vitro transcription-translation experiments, i.e., 35 and 43 kDa for the ENase and MTase, respectively. The M.SsoII amino acid (aa) sequence shows considerable similarity to those of m5C-MTases that recognize related sequences: M.EcoRII, M.dcm, M.MspI, M.BsuFI, M.HpaII, and M.HhaI. Surprisingly, the greatest degree of homology was observed between the aa sequences of M.SsoII and M.NlaX, whose recognition sequence is unidentified. Multiple alignment of the aa sequences helps to identify blocks of conserved aa in the variable regions of the MTases. These conserved aa may play a key role in target recognition. Some aspects of the evolution of m5C-MTases are discussed.
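
As an illustration of the recognition sequence 5'-CCNGG named above, the following minimal Python sketch locates candidate sites in a DNA string. It assumes that N denotes any of the four bases (the standard IUPAC convention); the function name `find_ccngg_sites` and the example sequence are hypothetical and are not taken from the sequenced fragment. Because CCNGG is its own reverse complement, scanning one strand finds the sites on both.

```python
import re

# N is the IUPAC code for any base (A, C, G or T).
# The lookahead makes the search report overlapping sites as well.
SITE = re.compile(r"(?=CC[ACGT]GG)")


def find_ccngg_sites(seq: str) -> list[int]:
    """Return the 0-based start position of every 5'-CCNGG match in seq."""
    return [m.start() for m in SITE.finditer(seq.upper())]


# Made-up sequence for illustration only (not from the P4 plasmid).
example = "ATCCAGGTTCCCGGGA"
print(find_ccngg_sites(example))  # [2, 9, 10]
```

The sites at positions 9 and 10 overlap (CCCGG and CCGGG), which is why the pattern uses a lookahead rather than a plain match.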