Viral RNA was extracted from plasma samples collected from five individuals during the period of viremia before seroconversion in primary infection with human immunodeficiency virus type 1 (HIV-1) and amplified by polymerase chain reaction. Nucleotide sequence analysis of amplified DNA from the V3 and V4 hypervariable regions indicated that the initial virus population of each acutely infected individual was completely homogeneous in sequence. No intrasample variability was found among the 44,090 nucleotides sequenced in this region of env, contrasting with the high degree of variability normally found in seropositive individuals. Paradoxically, substantial sequence variability was found in the normally highly conserved gag gene (encoding p17) in most of the preseroconversion samples. The diversity of p17 sequences in samples that were homogeneous in V3 and V4 can most readily be explained by the existence of strong selection for specific env sequences either upon transmission or in the interval between exposure and seroconversion in the exposed individual. Evidence that localizes the selected region upon transmission to V3 is provided by the similarity or identity of V3 loop sequences in five individuals with epidemiologically unrelated HIV-1 infections, while regions flanking the V3 loop and the V4 hypervariable region were highly divergent. The actual V3 sequences were similar to those associated with macrophage tropism in primary isolates of HIV, irrespective of whether infection was acquired by sexual contact or parenterally through transfusion of contaminated factor VIII. Proviral DNA sequences in peripheral blood mononuclear cells remained homogeneous in the V3 and V4 regions (and variable in p17gag) for several months after seroconversion.
The persistence of HIV sequences in peripheral blood mononuclear cells identical to those found at primary infection in the absence of continued virus expression provides an explanation for the previously observed differences in the composition of circulating DNA and RNA populations in sequential samples from seropositive individuals.
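
As an illustration of what "intrasample variability" means in practice, the following minimal sketch computes the mean pairwise nucleotide difference per site among aligned clones from one sample. The function names and the toy sequences are hypothetical and are not taken from the study, which does not state the measure it used; the sketch only shows how a homogeneous set of clones (as reported for V3 and V4) differs numerically from a variable one (as reported for p17 gag).

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_diversity(seqs: list[str]) -> float:
    """Mean pairwise differences per site among aligned sequences.

    Returns 0.0 when fewer than two sequences are supplied.
    """
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    length = len(seqs[0])
    total = sum(hamming(a, b) for a, b in pairs)
    return total / (len(pairs) * length)


# Toy, made-up clones (assumption: already aligned, equal length).
env_like = ["ACGTACGT", "ACGTACGT", "ACGTACGT"]  # homogeneous sample
gag_like = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]  # variable sample

if __name__ == "__main__":
    print(mean_pairwise_diversity(env_like))
    print(mean_pairwise_diversity(gag_like))
```

A value of zero corresponds to the complete homogeneity described for the env regions, while any positive value indicates sequence variation within the sample.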