Goodness-of-fit statistics for age-specific reference intervals

Stat Med. 2000 Nov 15;19(21):2943-62. doi: 10.1002/1097-0258(20001115)19:21<2943::aid-sim559>3.0.co;2-5.

Abstract

The age-specific reference interval is a commonly used screening tool in medicine. It involves estimation of extreme quantile curves (such as the 5th and 95th centiles) of a reference distribution of clinically normal individuals. It is crucial that models used to estimate such intervals fit the data extremely well. However, few procedures to assess goodness-of-fit have been proposed in the literature, and even fewer have been evaluated systematically. Here we consider procedures based on the distribution of the Z-scores (standardized residuals) from a model and on Pearson χ² statistics for observed and expected counts in groups defined by age and the estimated reference centile curves. Two of the procedures (Q and grid tests) are mainly inferential, whereas the third (permutation bands and B-tests) is essentially graphical. We obtain approximations to the null distributions of several relevant test statistics and examine their size and power for a range of models based on real data sets. We recommend Q-tests in all situations where Z-scores are available since they are general, simple to calculate and usually have the highest power among the three classes of test considered. For the cases considered the grid tests are always inferior to the Q- and B-tests.
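
As a rough illustration of the counting idea behind a Pearson χ² statistic on groups defined by age and reference centile curves, the Python sketch below cross-classifies observations by age group and by the centile band their Z-score falls in, then compares observed with expected counts. It is not the authors' grid test. The function name `pearson_grid_statistic`, the age cut points, and the assumption that Z-scores are standard normal under a well-fitting model are all introduced here for illustration only. The sketch stops at the statistic itself: the abstract says the null distributions had to be approximated and does not state them, so no p-value is computed.

```python
from bisect import bisect_right
from statistics import NormalDist


def pearson_grid_statistic(ages, z_scores, age_edges, centiles=(0.05, 0.95)):
    """Pearson X^2 over a grid of age groups by centile bands.

    Hypothetical helper, not taken from the paper.
    ages, z_scores : equal-length sequences, one entry per individual
    age_edges      : sorted cut points splitting age into groups (assumed)
    centiles       : sorted reference centiles defining the Z-score bands
    """
    norm = NormalDist()  # assumption: Z ~ N(0, 1) if the model fits
    z_cuts = [norm.inv_cdf(p) for p in centiles]

    # Probability of landing in each band under the assumed model,
    # e.g. 0.05, 0.90, 0.05 for the 5th and 95th centiles.
    cumulative = [0.0, *centiles, 1.0]
    band_probs = [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]

    # Observed counts: rows are age groups, columns are centile bands.
    observed = [[0] * len(band_probs) for _ in range(len(age_edges) + 1)]
    for age, z in zip(ages, z_scores):
        row = bisect_right(age_edges, age)
        col = bisect_right(z_cuts, z)
        observed[row][col] += 1

    # Sum (O - E)^2 / E, with E = (age-group size) * (band probability).
    statistic = 0.0
    for row in observed:
        n_row = sum(row)
        for count, prob in zip(row, band_probs):
            expected = n_row * prob
            if expected > 0:
                statistic += (count - expected) ** 2 / expected
    return statistic, observed
```

Large values of the statistic indicate that some age groups have too many or too few observations outside the estimated centile curves, which is the kind of misfit that matters most for a reference interval.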

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Age Factors
  • Fetus / anatomy & histology
  • Gestational Age
  • Humans
  • Humerus / anatomy & histology
  • Likelihood Functions
  • Linear Models
  • Models, Statistical*
  • Regression Analysis
  • Statistical Distributions