The nucleotide sequence of the entire beta-like globin gene cluster of rabbits has been determined. This sequence of a continuous stretch of 44.5 x 10(3) base-pairs (bp) starts about 6 x 10(3) bp upstream from epsilon (the 5'-most gene) and ends about 12 x 10(3) bp downstream from beta (the 3'-most gene). Analysis of the sequence reveals that: (1) the sequence is relatively A + T rich (about 60%); (2) regions with high G + C content are associated with OcC repeats, a short interspersed repeated DNA in rabbits; (3) the distribution of polypurines, polypyrimidines and alternating purine/pyrimidine tracts is not random within the cluster; (4) most open reading frames are associated with known globin coding regions, OcC repeats or long interspersed repeats (L1 repeats); (5) the most prominent open reading frames are found in the L1 repeats; (6) different strand asymmetries in base composition are associated with embryonic and adult genes as well as the tandem L1 repeats at the 3' end of the cluster; and (7) essentially all the repeats appear to have been inserted by a transposon mechanism. A comparison of the sequence with itself by a dot-plot analysis has revealed nine new members of the OcC family of repeats in addition to the six previously reported. The OcC repeats tend to be clustered, particularly in the epsilon-gamma and gamma-psi delta intergenic regions. Dot-plot comparisons between the rabbit and the human clusters have revealed extensive sequence matches. Homology starts about 6 x 10(3) bp 5' to epsilon, which is as far upstream as the rabbit sequence is available. It continues throughout the entire cluster and stops about 0.7 x 10(3) bp 3' to beta, at which point several repeats have inserted in both rabbits and humans. Throughout the gene cluster, the homology is interrupted mainly by insertions or deletions in either the rabbit or the human genome. Almost all of the insertions are of known short or long repeated DNAs.
The positions of the insertions are different in the two gene clusters, which indicates that both short and long repeats have continued to transpose throughout the genome during the time since the mammalian radiation. An alignment of rabbit and human sequences allows the calculation of the substitution rate around epsilon. Sequences far removed from the gene are evolving at a rate equivalent to the pseudogene rate, although some short regions show an apparently higher rate. (ABSTRACT TRUNCATED AT 400 WORDS)
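
The dot-plot comparisons and the A + T content mentioned above can be illustrated with a minimal Python sketch. It is not the method used in this work: the exact-word matching scheme, the function names `dot_plot_matches` and `at_fraction`, the word size, and the toy sequence are all assumptions chosen for illustration. In a self-comparison, points off the main diagonal mark repeated segments; in a two-sequence comparison, diagonal runs mark matching regions and breaks in the runs suggest insertions or deletions.

```python
from collections import defaultdict


def dot_plot_matches(seq_a, seq_b, word=4):
    """Return (i, j) pairs where seq_a[i:i+word] equals seq_b[j:j+word].

    Exact word matching is an illustrative simplification; the word
    size is a hypothetical parameter, not a value from the abstract.
    """
    # Index every word of seq_b by its start position.
    index = defaultdict(list)
    for j in range(len(seq_b) - word + 1):
        index[seq_b[j:j + word]].append(j)

    # Look up each word of seq_a in the index to collect the dots.
    matches = []
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], ()):
            matches.append((i, j))
    return matches


def at_fraction(seq):
    """Fraction of bases in seq that are A or T (upper-case input assumed)."""
    return sum(base in "AT" for base in seq) / len(seq)


if __name__ == "__main__":
    toy = "ACGTACGT"  # toy sequence, not real globin data
    dots = dot_plot_matches(toy, toy, word=4)
    # Off-diagonal dots (i != j) indicate an internal repeat.
    repeats = [(i, j) for i, j in dots if i != j]
    print(repeats)
    print(at_fraction("AATTGC"))
```

For sequences of tens of kilobases, as in the cluster described here, a longer word size and a match filter would normally be needed to keep the plot readable; those choices are left open in this sketch.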