Importance: Cutaneous chronic graft-vs-host disease (cGVHD) is common after allogeneic hematopoietic stem cell transplant and is often associated with poor patient outcomes. A reliable and practical method for assessing disease severity and response to therapy among these patients is urgently needed.
Objective: To evaluate the interrater agreement and reliability of skin-specific and range of motion (ROM) variables of the 2014 National Institutes of Health (NIH) response criteria for cGVHD and a skin sclerosis grading scale (SSG).
Design, setting, and participants: In this observational study performed at a single tertiary academic center, 6 academic blood and marrow transplant specialists and 4 medical dermatologists examined 8 patients with diagnosed cutaneous cGVHD on July 10, 2015. The patient cohort was enriched for patients with sclerotic features. Each patient was evaluated using the skin-specific and ROM components of the 2014 NIH response criteria for cGVHD and an SSG ranging from 0 to 3. Each patient was also asked to complete quality-of-life scoring instruments. Interrater agreement and reliability were estimated by calculating the Krippendorff α and Cohen κ statistics. Data were analyzed from September 29, 2015, through November 22, 2018.
Main outcomes and measures: Interrater agreement, estimated with agreement coefficients (Krippendorff α and Cohen κ statistics), for the skin-specific and ROM components of the 2014 NIH response criteria for cGVHD and for the SSG.
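As an illustration of the agreement statistics named above, the following is a minimal sketch of the unweighted Cohen κ for 2 raters, computed as (observed agreement − chance agreement) / (1 − chance agreement). The function name `cohen_kappa` and the example grades are hypothetical and are not data from this study; the abstract does not specify how κ was weighted or pooled across the 10 raters, so this sketch should not be read as the authors' exact procedure.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen kappa for two raters scoring the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of subjects")
    n = len(rater_a)
    # Observed agreement: fraction of subjects given identical scores.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal frequencies,
    # summed over every category used by rater A.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical sclerosis grades (0 to 3) for 8 patients; NOT study data.
grades_rater_1 = [0, 1, 2, 3, 2, 1, 3, 0]
grades_rater_2 = [0, 1, 2, 3, 1, 1, 3, 2]
kappa = cohen_kappa(grades_rater_1, grades_rater_2)
print(f"Cohen kappa: {kappa:.2f}")
```

In this made-up example the raters agree on 6 of 8 patients (observed agreement 0.75) against a chance agreement of 0.25. Because the SSG is an ordinal scale, a weighted κ or an ordinal Krippendorff α may be more appropriate in practice; the unweighted form is shown only for simplicity.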
Results: The median age of the patients evaluated was 54 years (range, 46-58 years). Patients were predominantly male (6 [75%]). Six of the 8 patients had a predominantly sclerotic cutaneous phenotype. Interrater agreement among all experts was acceptable for NIH skin feature score (0.68; 95% CI, 0.30-0.86) and good for NIH ROM scoring (0.80; 95% CI, 0.68-0.86). Dermatologists had acceptable agreement for NIH skin GVHD score (0.69; 95% CI, 0.25-0.82) and skin feature score (0.78; 95% CI, 0.17-0.98), good agreement in ROM grading (0.85; 95% CI, 0.69-0.90), and near-perfect agreement in identifying sclerosis (0.82; 95% CI, 0.27-0.97).
Conclusions and relevance: Dermatologists had acceptable agreement in NIH skin GVHD score and skin feature score and near-perfect agreement in identifying cutaneous sclerosis. However, better agreement in grading the severity of cutaneous cGVHD, especially in the intermediate grades, appears to be needed.