The lack of standardized criteria for measuring therapeutic response is a major obstacle to the development of new therapeutic agents for chronic graft-versus-host disease (cGVHD). National Institutes of Health (NIH) consensus criteria for evaluating therapeutic response were published in 2006. We report the results of 4 consecutive pilot trials evaluating the feasibility of these response criteria and estimating their interrater reliability and minimum detectable change. Hematology-oncology clinicians with limited experience in applying the NIH cGVHD response criteria (n = 34) participated in a 2.5-hour training session on response evaluation in cGVHD. Feasibility and interrater reliability between subspecialty cGVHD experts and this panel of clinician raters were then examined in a sample of 25 children and adults with cGVHD. The minimum detectable change was calculated using the standard error of measurement. Clinicians' impressions of the brief training session, the photo atlas, and the response criteria documentation tools were generally favorable. Performing and documenting the full set of response evaluations required a median of 21 minutes (range: 12-60 minutes) per rater. The Schirmer tear test required the most time of any single test (median: 9 minutes). Overall, interrater agreement for skin and oral manifestations was modest; in the third and fourth trials, however, agreement between clinicians and experts approached satisfactory values for all dimensions except movable sclerosis. In the final 2 trials, the threshold for change exceeding measurement error was 19% to 22% of body surface area (BSA) for erythema, 18% to 26% BSA for movable sclerosis, 17% to 21% BSA for nonmovable sclerosis, and 2.1 to 2.6 points on the 15-point NIH Oral cGVHD scale. Agreement between clinician-expert pairs was moderate to substantial for the measures of functional capacity and for the gastrointestinal and global cGVHD rating scales. These results suggest that the NIH response criteria are feasible to apply, and the reliability estimates are encouraging given that they were observed after a single 2.5-hour training session delivered at multiple transplant centers, with no opportunity for iterative training and calibration. Further research is needed to assess inter- and intrarater reliability in larger samples and to evaluate these response criteria as predictors of outcomes in clinical trials.
Published by Elsevier Inc.
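
For readers unfamiliar with the statistic, the minimum detectable change (MDC) reported above is conventionally derived from the standard error of measurement (SEM). The following is a minimal sketch assuming the common 95% confidence formulation; the 1.96 multiplier, the sqrt(2) factor for comparing two measurement occasions, and the use of an interrater reliability coefficient such as the ICC are assumptions, as the abstract does not specify which variant was used:

\[
\mathrm{SEM} = s\sqrt{1 - r}, \qquad
\mathrm{MDC}_{95} = 1.96 \cdot \sqrt{2} \cdot \mathrm{SEM}
\]

where \(s\) is the between-subject standard deviation of the measure and \(r\) is its reliability coefficient. As a purely hypothetical illustration (not the study's values), \(s = 20\%\) BSA and \(r = 0.85\) give \(\mathrm{SEM} \approx 7.7\%\) BSA and \(\mathrm{MDC}_{95} \approx 21\%\) BSA, on the order of the erythema thresholds reported above.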