A radiograph is considered to be of high quality when it allows a radiologist to identify abnormalities with high sensitivity and specificity. Although many methods for assessing image quality have been devised, it is not clear which is most meaningful or how well these methods correlate with one another. A pilot study was undertaken to compare five methods of evaluating mammographic image quality. Each method was used to form a separate ranking of 11 mammographic system configurations. In two of the methods, observers (three radiologists and three physicists) subjectively ranked the "image quality" of radiographs of phantoms obtained with each configuration. The third method ranked the systems according to contrast as measured densitometrically with an aluminum step wedge, and the fourth ranked them from lowest to highest mean glandular radiation dose to the breast. In the final method, observers based their rankings on mammograms of patients. The intra- and interobserver variabilities of each ranking method, as well as the correlations between methods, were assessed by using standard nonparametric statistical tests. Intraobserver consistency was high for all of the image quality ranking methods; however, image quality rankings based on either of the two phantoms provided better agreement among observers than did rankings based on images of patients. Surprisingly, no significant degree of correlation was found between any two image quality evaluation methods. Our work may have two implications for the American College of Radiology Mammography Accreditation Program: (1) small variations in phantom scores do not necessarily correlate with subjective variation in image quality in radiographs of patients, and (2) when small numbers of radiographs are used, the assessment of the quality of mammograms of patients may vary considerably among radiologists.
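
The abstract does not name the nonparametric tests that were used. As an illustration only, the sketch below shows two common choices for rank data of this kind: Spearman's rank correlation for comparing the rankings produced by two methods, and Kendall's coefficient of concordance (W) for agreement among several observers. The function names and the example rankings are hypothetical and are not data from the study; both functions assume complete rankings without ties.

```python
from typing import Sequence


def spearman_rho(rank_a: Sequence[int], rank_b: Sequence[int]) -> float:
    """Spearman rank correlation between two tie-free rankings of the same items."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have equal length of at least 2")
    # Sum of squared rank differences, item by item.
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def kendalls_w(rankings: Sequence[Sequence[int]]) -> float:
    """Kendall's coefficient of concordance for m observers ranking n items (no ties)."""
    m = len(rankings)
    n = len(rankings[0])
    if m < 2 or n < 2 or any(len(r) != n for r in rankings):
        raise ValueError("need at least 2 observers with equal-length rankings")
    # Total rank each item received across all observers.
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m * m * (n ** 3 - n))


# Hypothetical rankings of 11 system configurations (NOT study data).
phantom_ranking = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
patient_ranking = [4, 1, 7, 2, 9, 3, 11, 5, 10, 6, 8]

print("Spearman rho:", spearman_rho(phantom_ranking, patient_ranking))
print("Kendall's W:", kendalls_w([phantom_ranking, patient_ranking, phantom_ranking]))
```

A rho near +1 or -1 indicates that two methods order the configurations almost identically or almost oppositely, whereas a value near 0 indicates little monotonic relationship. W ranges from 0 (no agreement among observers) to 1 (identical rankings). Assessing significance would require an additional test, which this sketch omits.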