Objective: To investigate whether the recently developed (statistically derived) "ASsessment in Ankylosing Spondylitis Working Group" improvement criteria (ASAS-IC) for ankylosing spondylitis (AS) reflect clinically relevant improvement according to the opinion of an expert panel.
Methods: The ASAS-IC consist of four domains: physical function, spinal pain, patient global assessment, and inflammation. Scores on these four domains for 55 patients with AS, who had participated in a non-steroidal anti-inflammatory drug efficacy trial, were presented to an international expert panel (consisting of patients with AS and members of the ASAS Working Group) in a three-round Delphi exercise. The number of (non-)responders according to the ASAS-IC was compared with the final consensus of the experts. The domains that the experts considered most important were identified, and the most important domains were also selected with discriminant analysis. A number of provisional criteria sets that best represented the consensus of the experts were defined. Using other datasets, these clinically derived criteria sets as well as the statistically derived ASAS-IC were then tested for discriminative properties and for agreement with the end-of-trial efficacy assessment by patient and doctor.
Results: Forty experts completed the three Delphi rounds. The experts considered twice as many patients to be responders as the ASAS-IC did (42 v 21). Overall agreement between experts and ASAS-IC was 62%. Spinal pain was considered the most important domain by most experts and was also selected as such by discriminant analysis. Provisional criteria sets with an agreement of ≥80% compared with the consensus of the experts showed high placebo response rates (27-42%), in contrast with the ASAS-IC with a predefined placebo response rate of 25%. All criteria sets and the ASAS-IC discriminated well between active and placebo treatment (χ²=36-45; p<0.001). Compared with the end of trial efficacy assessment, the provisional criteria sets showed an agreement of 71-82%, sensitivity of 67-83%, and specificity of 81-88%. The ASAS-IC showed an agreement of 70%, sensitivity of 62%, and specificity of 89%.
Conclusion: The ASAS-IC are strict in defining response, are highly specific, and consequently show lower sensitivity than the clinically derived criteria sets. However, those patients who are considered as responders by applying the ASAS-IC are acknowledged as such by the expert panel as well as by patients' and doctors' judgments, and are therefore likely to be true responders.
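Illustration: The agreement, sensitivity, and specificity figures in the Results are all summaries of a 2x2 cross-classification of one response criterion against a reference judgment. The sketch below is not the authors' analysis code. It shows how such summaries are computed and checks which cell counts are arithmetically compatible with the reported expert v ASAS-IC comparison (42 v 21 responders, 62% agreement). It assumes that all 55 patients were classified by both methods, which the abstract does not state explicitly; the function names are hypothetical.

```python
def two_by_two(n, ref_pos, test_pos, both_pos):
    """Cell counts of a 2x2 table from its margins and the overlap.

    ref_pos  : responders according to the reference (here, the expert panel)
    test_pos : responders according to the criterion under test (here, ASAS-IC)
    both_pos : patients called responders by both (not reported in the abstract)
    """
    tp = both_pos
    fp = test_pos - both_pos      # criterion says responder, reference does not
    fn = ref_pos - both_pos       # reference says responder, criterion does not
    tn = n - tp - fp - fn
    return tp, fp, fn, tn


def summarize(tp, fp, fn, tn):
    """Overall agreement, sensitivity, and specificity against the reference."""
    n = tp + fp + fn + tn
    return {
        "agreement": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def consistent_overlaps(n, ref_pos, test_pos, agreement_pct):
    """Overlap counts whose agreement rounds to the reported whole percentage."""
    hits = []
    for both in range(min(ref_pos, test_pos) + 1):
        tp, fp, fn, tn = two_by_two(n, ref_pos, test_pos, both)
        if min(tp, fp, fn, tn) < 0:
            continue  # impossible table for these margins
        if round(100 * (tp + tn) / n) == agreement_pct:
            hits.append(both)
    return hits


# Assumption: n = 55 patients, all classified by both experts and ASAS-IC.
overlaps = consistent_overlaps(n=55, ref_pos=42, test_pos=21, agreement_pct=62)
for both in overlaps:
    cells = two_by_two(55, 42, 21, both)
    print(both, cells, summarize(*cells))
```

Under the stated assumption, the only overlap compatible with the reported margins and 62% agreement is the one in which every ASAS-IC responder was also a responder by expert consensus, which fits the Conclusion's statement that ASAS-IC responders are acknowledged as such by the expert panel.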