Poor inter-rater reliability is a major concern in clinical trials, contributing to error variance, which decreases statistical power and increases the risk of failed trials. This is particularly problematic with the Hamilton Depression Scale (HAMD), which lacks standardized questions and explicit scoring procedures. Standardized procedures for administering and scoring the HAMD are typically established at study initiation meetings; however, the format and time allotted are usually insufficient, and evaluation of trainees' ability to actually conduct a clinical interview is limited. To address this problem, we developed a web-based, interactive rater education program that delivers standardized training to geographically diverse sites in multi-center trials. The program includes both didactic training on scoring conventions and live, remote observation of trainees' applied interviewing skills. The program was pilot tested with nine raters from a single site. Didactic knowledge increased significantly from pre- to post-testing, with the mean number of incorrect answers on a 20-item exam decreasing from 6.5 (S.D.=1.64) to 1.3 (S.D.=1.03), t(5)=7.35, P=0.001. Seventy-five percent of the trainees' interviews were scored within two points of the trainer's score. Inter-rater reliability, based on trainees' actual interviews, was high (intraclass correlation=0.97, P<0.0001). These results support the feasibility of this methodology for improving rater training. An NIMH-funded study examining this methodology in a multi-site trial is currently underway.
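The two statistics reported above (a paired t-test on pre/post exam errors and an intraclass correlation across rated interviews) follow standard formulas. The sketch below is a minimal illustration only: the pre/post error counts and the ratings matrix are hypothetical placeholder numbers, not the study's data, and the use of a one-way random-effects ICC(1,1) is an assumption, since the abstract does not state which ICC form was computed.

```python
# Minimal sketch: paired t-test and one-way random-effects ICC(1,1).
# All numbers below are hypothetical placeholders, not the study's data.
import numpy as np
from scipy import stats

# Hypothetical pre/post counts of incorrect answers on a 20-item exam
pre_errors = np.array([8, 6, 7, 5, 6, 7])
post_errors = np.array([1, 2, 1, 0, 2, 1])

# Paired (dependent-samples) t-test on the change in errors
t_stat, p_value = stats.ttest_rel(pre_errors, post_errors)
print(f"t({len(pre_errors) - 1}) = {t_stat:.2f}, p = {p_value:.4f}")

def icc_1_1(ratings: np.ndarray) -> float:
    """One-way random-effects ICC(1,1).

    `ratings` has one row per rated interview (target) and one
    column per rater; this layout is assumed for illustration.
    """
    n, k = ratings.shape
    grand_mean = ratings.mean()
    row_means = ratings.mean(axis=1)
    # Between-targets and within-targets mean squares from a one-way ANOVA
    msb = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    msw = np.sum((ratings - row_means[:, None]) ** 2) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical HAMD total scores: rows = interviews, columns = raters
hamd_scores = np.array([
    [18, 19, 18],
    [24, 25, 24],
    [10, 11, 10],
    [30, 29, 31],
])
print(f"ICC(1,1) = {icc_1_1(hamd_scores):.2f}")
```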