Methods for analysis of skewed data distributions in psychiatric clinical studies: working with many zero values

Am J Psychiatry. 2004 Jul;161(7):1159-68. doi: 10.1176/appi.ajp.161.7.1159.

Abstract

Objective: Psychiatric clinical studies, including those in drug abuse research, often provide data that are challenging to analyze and use for hypothesis testing because they are heavily skewed and marked by an abundance of zero values. The authors consider methods of analyzing data with those characteristics.

Method: The possible meaning of zero values and the statistical methods that are appropriate for analyzing data with many zero values in both cross-sectional and longitudinal designs are reviewed. The authors illustrate the application of these alternative methods using sample data collected with the Addiction Severity Index.

Results: Data that include many zeros, if the zero value is considered the lowest value on a scale that measures severity, may be analyzed with several methods other than standard parametric tests. If zero values are considered an indication of a case without a problem, for which a measure of severity is not meaningful, analyses should include separate statistical models for the zero values and for the nonzero values. Tests linking the separate models are available.
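As a minimal sketch of the second approach, the code below compares two groups in two parts: a z statistic for the proportion of nonzero values, and a rank-sum z statistic for severity among the nonzero values only. It then links the two parts by summing the squared z statistics and referring the sum to a chi-square distribution with 2 degrees of freedom. The abstract does not name a specific linking test, so this particular two-part statistic is an assumption, and the function name `two_part_test` and the sample scores are hypothetical.

```python
import math


def two_part_test(group_a, group_b):
    """Two-part comparison of two groups whose scores contain many zeros.

    Part 1 compares the proportion of nonzero scores (problem present or not).
    Part 2 compares severity among the nonzero scores only.
    """
    nz_a = [x for x in group_a if x != 0]
    nz_b = [x for x in group_b if x != 0]
    n_a, n_b = len(group_a), len(group_b)

    # Part 1: pooled two-proportion z statistic for "any nonzero score".
    p_a, p_b = len(nz_a) / n_a, len(nz_b) / n_b
    p_pool = (len(nz_a) + len(nz_b)) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z_zero = (p_a - p_b) / se if se > 0 else 0.0

    # Part 2: Mann-Whitney U on the nonzero scores, normal approximation.
    # Ties count as 0.5; no tie correction is applied to the variance.
    m, k = len(nz_a), len(nz_b)
    if m == 0 or k == 0:
        z_severity = 0.0
    else:
        u = sum(
            1.0 if a > b else 0.5 if a == b else 0.0
            for a in nz_a
            for b in nz_b
        )
        mean_u = m * k / 2
        sd_u = math.sqrt(m * k * (m + k + 1) / 12)
        z_severity = (u - mean_u) / sd_u

    # Link the two parts: sum of squared z statistics, chi-square with 2 df.
    # For 2 df the upper-tail probability is exp(-x / 2).
    # The 2-df reference assumes both parts could actually be estimated.
    chi2 = z_zero ** 2 + z_severity ** 2
    p_value = math.exp(-chi2 / 2)

    return {
        "z_zero": z_zero,
        "z_severity": z_severity,
        "chi2": chi2,
        "p_value": p_value,
    }


# Hypothetical severity scores, for illustration only.
group_a = [0, 0, 0, 0, 0, 0, 1, 2, 3, 4]
group_b = [0, 0, 5, 6, 7, 8, 9, 10, 11, 12]
result = two_part_test(group_a, group_b)
print(result)
```

Under the first interpretation, where zero is simply the lowest point on the severity scale, the same rank-based comparison could instead be applied to all scores, zeros included, rather than to the nonzero subset.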

Conclusions: Standard methods, such as t tests and analyses of variance, may be poor choices for data that are heavily skewed and contain many zero values. The use of statistical methods suited to such data leads to more meaningful study results and conclusions.

Publication types

  • Research Support, U.S. Gov't, P.H.S.
  • Review

MeSH terms

  • Cross-Sectional Studies
  • Data Interpretation, Statistical
  • Humans
  • Longitudinal Studies
  • Models, Statistical
  • Psychiatric Status Rating Scales / statistics & numerical data
  • Psychiatry / methods
  • Psychiatry / statistics & numerical data*
  • Research Design
  • Severity of Illness Index
  • Statistics as Topic*
  • Statistics, Nonparametric
  • Substance-Related Disorders / diagnosis