The 47-kDa lipoprotein is an abundant integral membrane protein and dominant immunogen of Treponema pallidum subsp. pallidum. Previous DNA sequencing of the 47-kDa-lipoprotein gene did not reveal consensus features representative of other bacterial lipoprotein genes; this prompted further analyses with emphasis on elucidation of the N terminus of the molecule. To assist in localizing start signals for the protein, the transcription initiation site for the 47-kDa-antigen gene was determined. RNA isolated from both T. pallidum and recombinant Escherichia coli expressing the 47-kDa antigen was used as a template in reverse transcriptase primer extension. Upon analysis of cDNA products, transcription initiation was localized to one nucleotide in T. pallidum and to two adjacent nucleotides in E. coli. When various primers were used in DNA sequencing reactions for these analyses, a previously undetected nucleotide (G) was found in the purported 5' untranslated region; this altered the upstream reading frame to reveal plausible sites for ribosome binding (GGAGG), translation initiation (GTG start codon), and signal peptidase II processing (Val-Val-Gly-Cys). Discounting acylation, the molecular weight of the mature polypeptide is 45,756 (approximately 46,600 with acylation). To derive nonacylated 47-kDa antigen for further structure-function studies, the 47-kDa-antigen gene was subcloned without acylation signals as a genetic construct encoding a glutathione S-transferase fusion protein; following cleavage with thrombin, the nonacylated 47-kDa protein was hydrophilic rather than amphiphilic but retained its antigenicity when tested against 116 human serum samples from patients with various stages of syphilis.
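The reading-frame argument above can be illustrated with a minimal sketch. It assumes a made-up toy sequence (not the T. pallidum sequence), an arbitrary position for the missing G, and an assumed spacer of 5 to 10 nucleotides between the GGAGG ribosome-binding site and the GTG start codon. The helper names `translate_from_start` and `find_lipoprotein_start` are hypothetical. The sketch only shows how one undetected nucleotide can hide an in-frame Val-Val-Gly-Cys signal peptidase II site; it is not the method used in the study.

```python
import re
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_from_start(dna, start):
    """Translate from `start` until a stop codon or the end of the sequence.

    The initiator codon is reported as M, because a GTG start codon is
    read as methionine at translation initiation.
    """
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    if peptide:
        peptide[0] = "M"
    return "".join(peptide)


def find_lipoprotein_start(dna):
    """Return (start_index, peptide) for the first GGAGG...GTG pair whose
    translation contains Val-Val-Gly-Cys (VVGC), or None if there is none.

    The 5-10 nt spacer is an assumption made for this sketch.
    """
    for match in re.finditer(r"GGAGG[ACGT]{5,10}?GTG", dna):
        start = match.end() - 3  # index of the G that begins GTG
        peptide = translate_from_start(dna, start)
        if "VVGC" in peptide:
            return start, peptide
    return None


# Hypothetical toy sequence: GGAGG, a 7-nt spacer, GTG, then an ORF
# that encodes ...Val-Val-Gly-Cys... before a TAA stop codon.
TOY = "CAGGAGGATTCAACGTGAAACGTCTGGTTGTTGGTTGCAGCGAATAA"

# The same sequence with one G (index 21, chosen arbitrarily) left out,
# mimicking a nucleotide that an earlier sequencing run failed to detect.
MISSING_G = TOY[:21] + TOY[22:]

print(find_lipoprotein_start(TOY))        # (14, 'MKRLVVGCSE')
print(find_lipoprotein_start(MISSING_G))  # None
```

With the G present, the frame that begins at GTG reads through Val-Val-Gly-Cys; with the G omitted, the same start codon yields an unrelated peptide and the lipoprotein processing site is no longer visible.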