Using discharge abstract data, we analysed hospital mortality, comparing four different methods of risk adjustment. All patients discharged from the S. Giovanni Battista (Molinette) hospital in Turin (Italy) between January 1996 and June 1999 (n = 169,746) were classified with All Patient Refined Diagnosis Related Groups (APR-DRG). A first analysis evaluated the time trend of hospital mortality by semester. A second analysis compared hospital mortality during the last 12 months among eight units of internal medicine (n = 5592). All comparisons were made through logistic regression models. As the quality of discharge abstracts improved over time and varied among units with similar patients, all comparisons were repeated using four models, characterised by increasing predictive power and increasing sensitivity to the quality of data. In addition to crude comparisons (A), the other models included as risk factors: (B) age and emergency admission; (C) same as (B) plus expected mortality by APR-DRG; (D) same as (B) plus expected mortality by APR-DRG and risk of death subclass. When no risk factors were considered (A), hospital mortality showed an increasing trend, with an odds ratio (OR) of 1.02 per semester (95% confidence interval (CI) = 1.01-1.03). The association was weakened when age and mode of admission were taken into account (B) and disappeared when the APR-DRG expected mortality was also considered (C) (OR = 1.00; CI = 0.98-1.01). Finally, when the comparisons were also adjusted for the expected mortality by APR-DRG and risk of death subclass (D), a reversed trend appeared (OR = 0.95; CI = 0.94-0.97). The comparison among the units of internal medicine gave discordant results depending on the method used to adjust for confounders. The most striking variations were detected for the units with the best and the worst clinical data.
The unit with the poorest clinical data (average number of diagnoses per patient = 2.9) showed a crude OR of 1.38 (CI = 0.99-1.93) and an adjusted OR (D) of 1.71 (CI = 1.10-2.66); for the unit with the best quality of data (average number of diagnoses per patient = 4.4), the OR changed from 1.55 (CI = 1.06-2.26) (A) to 0.66 (CI = 0.37-1.17) (D). In conclusion, these results confirm the high sensitivity of the APR-DRG classification to the quality of data and, more generally, suggest caution when using powerful instruments like this to assess quality of care, especially if the quality of data among the units compared is less than optimal or not homogeneous.
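
As an illustration of how the reported odds ratios relate to logistic regression output, the following minimal Python sketch converts a coefficient and its standard error into an OR with a Wald 95% CI, and back. It is not the analysis code used in the study. The function names are hypothetical, the intervals are assumed to be symmetric Wald intervals on the log-odds scale, and the time trend is assumed to be log-linear across semesters. The figure of six steps (seven semesters between January 1996 and June 1999) is our own arithmetic, not a value reported above.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Turn a logistic-regression coefficient and its standard error
    into an odds ratio with a Wald confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_reported(or_, lower, upper, z=Z_95):
    """Recover an approximate coefficient and standard error from a
    reported OR and CI (assumes a Wald interval on the log-odds scale)."""
    beta = math.log(or_)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


def cumulative_or(or_per_step, steps):
    """Odds ratio implied over several steps of a log-linear trend."""
    return or_per_step ** steps


# Per-semester trend estimates quoted in the abstract for models A, C and D.
reported = {
    "A": (1.02, 1.01, 1.03),
    "C": (1.00, 0.98, 1.01),
    "D": (0.95, 0.94, 0.97),
}

for model, (or_, lo, hi) in reported.items():
    beta, se = coef_from_reported(or_, lo, hi)
    # Six steps separate the first and the last of seven semesters (assumption).
    print(model, round(beta, 4), round(se, 4), round(cumulative_or(or_, 6), 3))
```

Because the published values are rounded to two decimals, the recovered coefficients and standard errors are only approximate. The cumulative figures show how a per-semester OR close to 1 can still correspond to a noticeable difference in odds between the first and the last semester.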