Fitting models to incomplete categorical data requires more care than fitting models to their complete-data counterparts, not only when the data are non-randomly missing, but even in the familiar missing at random setting. Various aspects of this point of view have been considered in the literature. We review them using data from a multi-centre trial on the relief of psychiatric symptoms. First, it is shown that the usual expected information matrix (referred to as the naive information) is biased even under a missing at random mechanism. Second, issues that arise under non-random missingness assumptions are illustrated. It is argued that at least some of these problems can be avoided by using contextual information.