We tested the hypothesis that the original surgeon-investigator classification of a fracture of the distal radius in a prospective cohort study would have moderate agreement with the final classification by the team performing the final analysis of the data. The initial post-injury radiographs of 621 patients with distal radius fractures from a multicenter international prospective cohort study were classified according to the Comprehensive Classification of Fractures, first by the treating surgeon-investigator and then by a research team analyzing the data. Correspondence between the original and revised classifications was evaluated using the kappa statistic at the type, group, and subgroup levels. Agreement between the initial and revised classifications decreased from type (moderate; κ(type) = 0.60) to group (moderate; κ(group) = 0.41) to subgroup (fair; κ(subgroup) = 0.33) classifications (all p < 0.05). There was only moderate agreement in the classification of fractures of the distal radius between surgeon-investigators and final evaluators in a prospective multicenter cohort study. Such variation might influence the interpretation and comparability of the data, and the lack of a reference standard for classification complicates efforts to reduce variability and improve consensus.
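For context, the kappa statistic corrects observed rater agreement for agreement expected by chance. The abstract does not state which variant was used; the common unweighted (Cohen) form for two raters is:

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
```

where \(p_o\) is the observed proportion of agreement and \(p_e\) is the agreement expected by chance from the raters' marginal classification frequencies. The qualitative labels above ("fair" for 0.21–0.40, "moderate" for 0.41–0.60) are consistent with the widely used Landis and Koch benchmarks.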