Rank intraclass correlation for clustered data

Stat Med. 2023 Oct 30;42(24):4333-4348. doi: 10.1002/sim.9864. Epub 2023 Aug 7.

Abstract

Clustered data are common in biomedical research. Observations in the same cluster are often more similar to each other than to observations from other clusters. The intraclass correlation coefficient (ICC), first introduced by R. A. Fisher, is frequently used to measure this degree of similarity. However, the ICC is sensitive to extreme values and skewed distributions, and depends on the scale of the data. It is also not applicable to ordered categorical data. We define the rank ICC as a natural extension of Fisher's ICC to the rank scale, and describe its corresponding population parameter. The rank ICC has a simple interpretation: it is the rank correlation between a random pair of observations from the same cluster. We also extend the definition to settings in which the underlying distribution has more than two levels of hierarchy. We describe estimation and inference procedures, show the asymptotic properties of our estimator, conduct simulations to evaluate its performance, and illustrate our method in three real data examples with skewed data, count data, and three-level ordered categorical data.
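The interpretation above (a rank correlation between a random pair of observations from the same cluster) can be illustrated with a minimal sketch. This is not the estimator developed in the paper: the function names `midranks` and `rank_icc_sketch` are hypothetical, the rank scale is approximated by pooled midranks, and every ordered within-cluster pair is weighted equally, which is an assumption about cluster weighting made only for illustration. It applies to two-level data only and provides no inference.

```python
from collections import defaultdict
from itertools import permutations


def midranks(values):
    """Return 1-based average ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def rank_icc_sketch(values, clusters):
    """Illustrative rank ICC: Pearson correlation of pooled midranks over all
    ordered pairs of distinct observations within the same cluster.

    Assumption (not from the paper): each ordered within-cluster pair gets
    equal weight, so larger clusters contribute more pairs.
    """
    ranks = midranks(values)
    by_cluster = defaultdict(list)
    for r, c in zip(ranks, clusters):
        by_cluster[c].append(r)

    xs, ys = [], []
    for members in by_cluster.values():
        # Ordered pairs (a, b) and (b, a) are both included, as in Fisher's
        # pairwise construction, so the two margins are identical.
        for a, b in permutations(members, 2):
            xs.append(a)
            ys.append(b)
    if not xs:
        raise ValueError("need at least one cluster with two or more observations")

    n = len(xs)
    mean = sum(xs) / n  # equals the mean of ys by symmetry
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("paired ranks have zero variance")
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    return cov / var


# Toy data (made up for illustration): three clusters of size two.
values = [1.0, 2.0, 10.0, 11.0, 20.0, 21.0]
clusters = ["a", "a", "b", "b", "c", "c"]
print(rank_icc_sketch(values, clusters))
```

Because only ranks enter the calculation, applying a strictly increasing transformation to `values` (for example, cubing them) leaves the result unchanged, which reflects the scale-independence motivating the rank ICC.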

Keywords: clustered data; intraclass correlation; rank association measures.

MeSH terms

  • Biomedical Research / statistics & numerical data
  • Cluster Analysis
  • Computer Simulation*
  • Data Interpretation, Statistical
  • Humans
  • Models, Statistical*