cDNA clones encoding a human blood group Rh polypeptide were isolated from a human bone marrow cDNA library by using a polymerase chain reaction-amplified DNA fragment encoding the known common N-terminal region of the Rh proteins. The entire primary structure of the Rh polypeptide has been deduced from the nucleotide sequence of a 1384-base-pair-long cDNA clone. Translation of the open reading frame indicates that the Rh protein is composed of 417 amino acids, including the initiator methionine (which is removed in the mature protein), and that it lacks a cleavable N-terminal sequence and has no consensus site for potential N-glycosylation. The predicted molecular mass of the protein is 45,500 Da, whereas that estimated for the Rh protein analyzed in NaDodSO4/polyacrylamide gels is in the range of 30,000 to 32,000 Da. This discrepancy suggests either that the hydrophobic Rh protein migrates abnormally on NaDodSO4 gels or that the Rh mRNA encodes a precursor protein that is further processed by proteolytic cleavage of the C-terminal region of the polypeptide. Hydropathy analysis and secondary structure predictions suggest the presence of 13 membrane-spanning domains, indicating that the Rh polypeptide is highly hydrophobic and deeply buried within the phospholipid bilayer. In RNA blot-hybridization (Northern) analysis, the Rh cDNA probe detects a major 1.7-kilobase and a minor 3.5-kilobase mRNA species in adult erythroblasts, fetal liver, and erythroid (K562, HEL) and megakaryocytic (MEG01) leukemic cell lines, but not in adult liver and kidney tissues or in lymphoid (Jurkat) and promyelocytic (HL60) cell lines. These results suggest that expression of the Rh gene(s) might be restricted to tissues or cell lines expressing erythroid characteristics.
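
The hydropathy analysis mentioned above can be illustrated with a minimal sliding-window sketch. The text does not state which scale or parameters were used, so this example assumes the Kyte-Doolittle scale with a 19-residue window and a mean-hydropathy cutoff of 1.6, a commonly used setting for flagging candidate membrane-spanning segments. The function names and the toy sequence are hypothetical; the toy sequence is not the Rh polypeptide sequence.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
# Assumption: the source does not name the scale it used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_runs(profile, threshold=1.6):
    """Return (first, last) window indices of each run above the threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a hydrophobic run begins
        elif value <= threshold and start is not None:
            runs.append((start, i - 1))    # the run ended at the previous window
            start = None
    if start is not None:                  # run extends to the end of the profile
        runs.append((start, len(profile) - 1))
    return runs


# Hypothetical toy sequence: a leucine stretch flanked by charged residues.
toy = "K" * 10 + "L" * 25 + "D" * 10
profile = hydropathy_profile(toy)
print(hydrophobic_runs(profile))
```

Each run marks a stretch of windows whose average hydropathy exceeds the cutoff. Counting such runs along a real sequence gives only a rough estimate of the number of membrane-spanning domains, which is why predictions of this kind are usually combined with secondary structure analysis.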