Overexpression of the epidermal growth factor receptor (egfr) gene is a common feature of breast cancer. We recently demonstrated that EGFR expression in breast cancer correlates strongly with the length of a CA simple sequence repeat (CA-SSR I) located within the first 2000 bases of intron 1 of the egfr gene (H. Buerger et al., Cancer Res., 60: 854-857, 2000). Using a standardized semiautomated method of microsatellite analysis for the detection of loss of heterozygosity, we identified an allelic imbalance (AI) at the egfr locus in 55 of 163 primary breast cancer cases. Fine mapping of the chromosomal region 7p12-15 around the egfr gene with 10 CA-SSR markers showed that mutations of egfr in breast cancer are frequently restricted to the first intron of the gene. CA-SSR I in intron 1 was affected in 84% of the patients with AI. Reverse transcription-PCR analysis of 23 breast cancer tissues with AI excluded the presence of in-frame deletions between exon 2 and exon 7. To further characterize the phenomenon underlying the AI detected by microsatellite analysis, we established a quantitative 5'-nuclease assay for CA-SSR I in intron 1. This assay showed amplifications of the sequence in breast cancer cases with AI. Kaplan-Meier analysis revealed a statistically significantly worse prognosis for patients with AI at the egfr locus in the cancer tissue than for patients without AI. Interestingly, 75% of the patients with AI of CA-SSR I in the tumor also showed AI in normal, nontumorous breast tissue. Our data strongly support the assumption that distinct amplifications in intronic sequences of the egfr gene, which enhance the basal transcriptional activity of the gene, represent one of the first steps in breast carcinogenesis. Furthermore, they point to the presence of prognosis-associated markers for breast cancer already in morphologically normal breast tissue.