We conducted studies to determine the magnitude and sources of variability in androgen assay results and to identify laboratories capable of performing such assays for large epidemiological studies. We studied androstanediol (ADIOL), androstanediol glucuronide (ADIOL G), androstenedione (ADION), androsterone glucuronide (ANDRO G), androsterone sulfate (ANDRO S), dehydroepiandrosterone (DHEA), dehydroepiandrosterone sulfate (DHEA S), dihydrotestosterone (DHT), and testosterone (TESTO). A single plasma sample was obtained from each of 15 women (five postmenopausal women, five premenopausal women in the midfollicular phase of the menstrual cycle, and five women in the midluteal phase), divided into aliquots, and stored at -70 °C. Four sets of two coded aliquots from each woman were then sent to the participating laboratories for analysis at monthly intervals over 4 months. Using the logarithm of the assay measurements, we estimated the components of variance and three measures of reproducibility. The usual coefficient of variation is a function of the variance components that are under the control of the laboratory. The intraclass correlation between measurements for a given individual is the proportion of the total variability that is associated with differences among individuals. The minimum detectable relative difference is important for evaluating study feasibility. The results suggest that a single sample assayed for ADIOL G, DHEA, DHEA S, or ANDRO G (with two laboratory replicates per sample) can be used to discriminate reliably among women in a given menstrual phase or menopausal status. The results for DHT, TESTO, ADION, and ANDRO S are more problematic and suggest that the present measurement techniques should be used with care, especially in midluteal-phase women. The results for ADIOL suggest that this assay is not yet ready for use in epidemiological studies.
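
To illustrate how two of the reproducibility measures relate to the variance components, the following is a minimal sketch and not the authors' actual computation. It assumes natural-log-transformed measurements, a three-way split of the variance (between women, between batches, within batch), and the standard log-normal conversion from log-scale variance to a coefficient of variation. The function name, the split, and all numeric values are hypothetical and are not taken from the study.

```python
import math


def reproducibility(var_between_women, var_between_batches, var_within_batch):
    """Summarise log-scale variance components (hypothetical inputs).

    Assumes the components were estimated from natural-log measurements.
    """
    # Components under the laboratory's control (assumed split).
    lab_var = var_between_batches + var_within_batch
    total_var = var_between_women + lab_var

    # Coefficient of variation on the original scale, using the
    # log-normal identity CV = sqrt(exp(sigma^2) - 1).
    cv = math.sqrt(math.exp(lab_var) - 1.0)

    # Intraclass correlation: share of total variability that is
    # associated with differences among individuals.
    icc = var_between_women / total_var
    return cv, icc


# Illustrative values only; these are not results from the study.
cv, icc = reproducibility(0.30, 0.02, 0.01)
print(f"CV = {cv:.1%}, ICC = {icc:.2f}")
```

Under these assumptions, a small laboratory variance relative to the between-women variance gives a low coefficient of variation and an intraclass correlation close to 1, which is the pattern that would support using a single sample per woman.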