We have isolated and sequenced four overlapping cDNA clones from a normal adult human colon library, which together gave the entire nucleotide sequence for biliary glycoprotein I (BGP I). BGP I is a member of the carcinoembryonic antigen (CEA) gene family, which is a subfamily in the immunoglobulin gene superfamily. The deduced amino acid sequence of the combined clones for BGP I revealed a 34-residue leader sequence followed by a 108-residue N-terminal domain, a 178-residue immunoglobulin-like domain, a 108-residue region specific to BGP I, a 24-residue transmembrane domain, and a 35-residue cytoplasmic domain. The nucleotide sequence of BGP I exhibited greater than 80% identity with CEA and nonspecific crossreacting antigen (NCA) in the leader peptide, N-terminal domain, and immunoglobulin-like domain. The BGP I-specific domain, designated A', was 56.7% and 55.8% identical at the nucleotide level and 42.6% and 39.6% identical at the amino acid level to the immunoglobulin-like domain of NCA and the first immunoglobulin-like domain of CEA, respectively. Beyond nucleotide position 1375, the 3' region of the BGP I cDNA was found to be specific for BGP I. Hybridization of a probe from this region to electrophoretic blots of RNAs from different human tissues showed a predominant 2.8-kilobase (kb) message accompanied by weaker bands of 4.1 and 2.1 kb. The same probe gave a single band in Southern blot analysis of restriction enzyme-digested total human DNA. Using a coding region probe from the BGP I domain A', we observed 4.1- and 2.1-kb messages. The absence of the 2.8-kb band with this probe suggested that different forms of BGP I may be generated by posttranscriptional modification of the same gene. We propose that BGP I diverged from NCA by acquiring an immunoglobulin-like domain substantially different from the domains found in NCA or CEA and also a new cytoplasmic domain. The latter feature should give BGP I a membrane anchorage mechanism substantially different from that of CEA, which lacks a cytoplasmic domain and is anchored via a phosphatidylinositol-glycan structure. Protein structural analysis of BGP I isolated from human bile revealed a blocked N terminus, 129 amino acids of internal sequence that are in agreement with the translated cDNA sequence, and five glycosylation sites in the peptides sequenced.