Probasin (PB) occurs as both a secreted and a nuclear protein and is abundantly expressed in the epithelial cells of the rat prostate. A 17.5 kb genomic clone of the probasin gene was isolated from a rat liver genomic library, and its analysis showed that the gene comprises seven exons whose splice donor/acceptor sites conform to the GT/AG consensus sequence. The exon number and sizes are remarkably similar to those of aphrodisin, rat alpha(2)-urinary globulin and major urinary protein, outlier members of the lipocalin superfamily. In addition, alignment of the deduced amino acid sequence showed that the probasin gene also encodes the glycine-X-tryptophan (G-X-W) motif, similar to that of human serum retinol-binding protein, which binds retinol, and the C-X-X-X-C motif also found in insect lipocalins that bind pheromones. The cysteine residues in exons 3 and 6 are conserved, and the sequence predicts a secondary structure of eight beta-sheets and the alpha-helix commonly seen in the lipocalin superfamily. Characteristics unique to PB include a large genomic size (17.5 kb, compared with the 3-5 kb seen in other lipocalin genes) and an isoelectric point (pI) of 11.5, which is very basic compared with those of the other, more acidic lipocalins. Functionally, PB gene expression is regulated by androgens and zinc in the epithelial cells of the rodent prostate. The 5'-flanking region of the probasin gene contains two androgen receptor binding sites that confer androgen-specific gene expression, as well as prostate-specific elements that target and maintain high levels of transgene expression in several PB transgenic mouse models.