The nociceptive flexion reflex (NFR) has been used as a psychophysiological tool to study spinal nociceptive processes in numerous clinical and experimental studies. Despite widespread use of the NFR, few attempts have been made to empirically test and compare different scoring criteria for detecting the presence or absence of the reflex. The present studies were conducted to address this issue. Study 1 (N=56 healthy participants) evaluated the reliability of 15 different scoring criteria examined in a previous report. Study 2 (N=73 healthy participants) extended this work by examining normalized scoring criteria based on biceps femoris activity unrelated to noxious stimulation (reference contraction, maximal contraction). In both studies, receiver operating characteristic (ROC) analyses were used to evaluate and compare the scoring methods. The results indicate that a number of criteria were acceptable for defining an NFR threshold, judged by the area under the ROC curve and its statistical significance; however, the NFR Interval z score [(NFR Interval mean - baseline mean)/baseline SD] emerged as the scoring criterion with the greatest accuracy and with cut-points that are reliable across samples. These findings support the application of a common NFR scoring criterion to enhance direct comparison of results across different research laboratories and study samples.
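
As an illustration only, the NFR Interval z score and an area-under-the-ROC-curve comparison could be sketched as follows. The function names, the use of the sample standard deviation for the baseline, and the pairwise (rank-based) AUC estimate are assumptions made for this sketch; the abstract does not specify the EMG windows, the SD estimator, or the software used for the ROC analyses.

```python
from statistics import mean, stdev


def nfr_interval_z(baseline, nfr_interval):
    """(NFR Interval mean - baseline mean) / baseline SD.

    `baseline` and `nfr_interval` are sequences of rectified EMG samples
    from a pre-stimulus window and a post-stimulus window, respectively.
    The window boundaries are NOT given in the abstract (assumption:
    the caller has already extracted them).
    """
    base_sd = stdev(baseline)  # sample SD; the estimator is an assumption
    if base_sd == 0:
        raise ValueError("baseline SD is zero; z score is undefined")
    return (mean(nfr_interval) - mean(baseline)) / base_sd


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen reflex-present trial
    has a higher score than a randomly chosen reflex-absent trial,
    with ties counted as one half.
    """
    present = [s for s, y in zip(scores, labels) if y]
    absent = [s for s, y in zip(scores, labels) if not y]
    if not present or not absent:
        raise ValueError("need at least one trial of each class")
    wins = sum((p > a) + 0.5 * (p == a) for p in present for a in absent)
    return wins / (len(present) * len(absent))
```

Under these assumptions, each trial would be scored with `nfr_interval_z`, and `roc_auc` would then be computed against independent present/absent judgments of the reflex; a higher area indicates a criterion that separates the two classes more accurately.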