The fine-scale grading of the severity experienced by animals used in research is a key element of the 3Rs (replace, reduce, and refine) principles and a legal requirement under European Union Directive 2010/63/EU. In particular, the exact assessment of all signs of pain, suffering, and distress experienced by laboratory animals is a prerequisite for developing refinement strategies. However, minimally invasive and noninvasive methods for evidence-based severity assessment are scarce. We therefore investigated whether voluntary wheel running (VWR) provides an observer-independent, behaviour-centred approach to grading the severity experienced by C57BL/6J mice undergoing various treatments. In a mouse model of chemically induced acute colitis, VWR behaviour was directly related to colitis severity, whereas clinical scoring did not sensitively reflect severity and indicated only marginal signs of compromised welfare. Unsupervised cluster analysis of body weight and VWR data, based on the k-means algorithm, enabled the discrimination of cluster borders and distinct levels of severity. The validity of the cluster analysis was confirmed in a mouse model of acute restraint stress. The method was also able to uncover and grade the impact of serial blood sampling on animal welfare, an impact underlined by increased histological scores in the colitis model. To reflect the entirety of severity in a multidimensional model, the presented approach may have to be calibrated and validated in other animal models that require the integration of further parameters. In this experimental setup, however, the automated assessment of an emotionally/motivationally driven behaviour and the subsequent integration of the data into a mathematical model enabled unbiased individual severity grading in laboratory mice, thereby providing an essential contribution to the 3Rs principles.
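The cluster analysis described above can be illustrated with a minimal sketch. It is not the study's implementation: the data points are invented for illustration only, and the choice of three clusters, the expression of both parameters as a percentage of each animal's baseline, and the helper names `kmeans` and `grade_severity` are all assumptions. The sketch runs plain Lloyd-style k-means on pairs of body weight and VWR values and then orders the resulting clusters by their mean wheel-running performance, so that level 0 denotes the least affected group.

```python
import math


def kmeans(points, k, n_iter=100):
    """Plain Lloyd-style k-means on a list of equal-length numeric tuples.

    Returns (labels, centroids). Initialisation is deterministic: the points
    are sorted and k evenly spaced ones are used as starting centroids.
    """
    ordered = sorted(points)
    step = (len(ordered) - 1) / max(k - 1, 1)
    centroids = [ordered[round(i * step)] for i in range(k)]

    def assign(cents):
        # Index of the nearest centroid (Euclidean distance) for every point.
        return [min(range(k), key=lambda c: math.dist(p, cents[c])) for p in points]

    for _ in range(n_iter):
        labels = assign(centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Keep the old centroid if a cluster lost all its members.
                updated.append(centroids[c])
        if updated == centroids:  # converged: assignments no longer change
            break
        centroids = updated

    return assign(centroids), centroids


def grade_severity(labels, centroids, vwr_index=1):
    """Map cluster labels to severity levels (hypothetical ordering rule).

    Clusters are ranked by their mean VWR value: the cluster with the highest
    wheel-running performance becomes level 0, the lowest the highest level.
    """
    order = sorted(range(len(centroids)),
                   key=lambda c: centroids[c][vwr_index], reverse=True)
    level_of = {cluster: level for level, cluster in enumerate(order)}
    return [level_of[lab] for lab in labels]


# Invented example data, NOT taken from the study:
# (body weight in % of baseline, VWR in % of baseline) per animal and day.
points = [
    (100.0, 102.0), (99.0, 95.0), (101.0, 98.0),  # close to baseline
    (96.0, 62.0), (95.0, 55.0), (97.0, 58.0),     # moderately reduced VWR
    (88.0, 12.0), (86.0, 8.0), (90.0, 15.0),      # strongly reduced VWR
]

labels, centroids = kmeans(points, k=3)  # k=3 is an assumption
severity = grade_severity(labels, centroids)
print(severity)
```

Because both parameters are expressed on the same percentage scale here, no further standardisation is applied; with parameters in different units, scaling before clustering would be a reasonable additional step.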