In alcohol studies, drinking outcomes such as the number of days of any alcohol drinking (DAD) over a period of time do not precisely capture the differences among subjects in a study population of interest. For example, a value of 0 on DAD could mean either that the subject was continually abstinent from drinking, such as a lifetime abstainer, or that the subject was an alcoholic who happened not to use any alcohol during the period of interest. In statistics, zeros of the first kind are called structural zeros, to distinguish them from zeros of the second kind, which are called sampling zeros. As the example indicates, structural and sampling zeros represent two groups of subjects with quite different psychosocial outcomes. In the literature on alcohol use, although many recent studies have begun to explicitly account for the differences between the two types of zeros when modeling drinking variables as a response, none has acknowledged the implications of the different types of zeros when such drinking variables are used as a predictor. This paper serves as the first attempt to tackle the latter issue and illustrate the importance of disentangling structural and sampling zeros, using simulated as well as real study data.
Keywords: NHANES; number of days of drinking alcohol; structural zero; zero-inflated count data; zero-inflated models for count data.