A cDNA clone for the extrinsic 30-kDa protein (OEC30) of photosystem II in Euglena gracilis Z was isolated and characterized. The open reading frame of the cDNA encoded a polypeptide of 338 amino acids, consisting of a long presequence of 93 amino acids and a mature polypeptide of 245 amino acids. Two hydrophobic domains were identified in the presequence, in contrast to the single hydrophobic domain found in the presequences of the corresponding proteins from higher plants. A signal-peptide-like sequence and a thylakoid-transfer domain were identified at the N- and C-terminal regions of the presequence, respectively. The presence of a long and unique presequence in the precursor to OEC30 is probably related to the complexity of the intracellular processes required for the synthesis and/or transport of the protein in Euglena.