Recent studies have demonstrated that per-beam planar intensity-modulated radiation therapy (IMRT) quality assurance (QA) passing rates may not predict clinically relevant patient dose errors. This work evaluates the effect of dose variations introduced in the dynamic multi-leaf collimator (DMLC) modeling and delivery processes on clinically relevant metrics for IMRT. Ten head and neck (HN) IMRT plans were randomly selected for this study. Conventional per-beam IMRT QA was performed for each plan by two different methods: (1) with a gantry angle of 0° (gantry pointing downward) for all IMRT fields and (2) with the gantry at the specific angles designed in the IMRT plan. For each patient, a batch analysis was done for each scenario and then imported into 3DVH (Sun Nuclear Corp.) for processing. A "corrected DVH" was generated and compared to the DVH from the treatment plan. Their differences represented errors introduced by the combination of the treatment planning system (TPS) dose calculation algorithm and beam delivery. The dose metrics from the two scenarios were compared with the corresponding calculated doses, and the differences were analyzed. Although all per-beam planar IMRT QA had high gamma passing rates, 99.3 ± 1.3% (92.3-100%) for the "2%/3 mm" criteria, there were significant errors in some of the calculated clinical dose metrics. For example, across all the plans studied, errors of up to 3.2%, 5.7%, 5.6%, 2.3%, 4.1%, and 23.8% were found in max cord dose, max brainstem dose, mean parotid dose, larynx dose, oral cavity dose, and PTV(D95) dose, respectively. The differences in errors for clinical metrics between the two scenarios (zero gantry angle vs. true gantry angles) can also be significant: max cord dose (2.9% vs. 0.2%), max brainstem dose (3.8% vs. 0.4%), mean parotid dose (2.3% vs. 4.5%), mean larynx dose (3.9% vs. 2.0%), mean oral cavity dose (1.6% vs. 3.9%), and PTV(D95) dose (-0.4% vs. -2.6%).
However, a strong and clear correlation was observed between the dose differences from the two scenarios for each of the organ structures. This study confirms that conventional IMRT QA performance metrics are not predictive of dose errors in the PTV and organs-at-risk. Clinically relevant dose QA allowed us to predict the patient dose-volume relationships.
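
The comparison described above (percent error of a QA-corrected dose metric relative to the planned value, then the correlation of those errors between the two gantry scenarios) can be illustrated with a minimal sketch. It is not the analysis code used in the study: the function names `percent_error` and `pearson_r` are hypothetical, the use of a Pearson coefficient is an assumption (the abstract does not name the correlation measure), and the dose values are invented purely for illustration.

```python
from math import sqrt


def percent_error(corrected, planned):
    """Percent difference of a QA-corrected dose metric relative to the planned value."""
    return 100.0 * (corrected - planned) / planned


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical max cord doses in Gy for four plans (illustrative values, not study data).
planned = [40.0, 42.0, 38.5, 44.0]
zero_gantry = [41.0, 42.8, 39.0, 45.2]   # corrected dose, all fields delivered at gantry 0
true_gantry = [40.6, 42.5, 38.8, 44.7]   # corrected dose, fields delivered at planned angles

# Per-plan percent errors for each QA scenario.
err_zero = [percent_error(c, p) for c, p in zip(zero_gantry, planned)]
err_true = [percent_error(c, p) for c, p in zip(true_gantry, planned)]

# Correlation of the errors between the two scenarios.
r = pearson_r(err_zero, err_true)
print(f"zero-gantry errors (%): {[round(e, 2) for e in err_zero]}")
print(f"true-gantry errors (%): {[round(e, 2) for e in err_true]}")
print(f"Pearson r between scenarios: {r:.3f}")
```

A coefficient near 1 would indicate that the two delivery scenarios rank the plans' dose errors similarly even when the error magnitudes differ, which is the kind of relationship the abstract reports.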