A flexible ratio regression approach for zero-truncated capture-recapture counts

Biometrics. 2016 Sep;72(3):697-706. doi: 10.1111/biom.12485. Epub 2016 Feb 10.

Abstract

Capture-recapture methods are used to estimate the size of a population of interest that is only partially observed. In such studies, each member of the population carries a count of the number of times it has been identified during the observational period. In real-life applications, only positive counts are recorded, so the observed distribution is truncated at zero, and this truncated count distribution must be used to estimate the number of unobserved units. We consider ratios of neighboring count probabilities, which can be estimated by ratios of observed frequencies whether the distribution is zero-truncated or untruncated. Rocchetti et al. (2011) have shown that, for densities in the Katz family, these ratios can be modeled by a regression approach, and Rocchetti et al. (2014) have specialized the approach to the beta-binomial distribution. Once the regression model has been estimated, the unobserved frequency of zero counts can be derived directly from it. The guiding principle is that it is often easier to find an appropriate regression model than a proper model for the count distribution. However, a full analysis of the connection between the regression model and the associated count distribution has been missing. In this manuscript, we fill this gap and show that the regression model approach leads, under general conditions, to a valid count distribution; we also consider a wider class of regression models, based on fractional polynomials. The proposed approach is illustrated by analyzing various empirical applications and by means of a simulation study.
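
As a rough illustration of the ratio idea described in the abstract, the following Python sketch regresses the log of the observed ratios r_x = (x+1) f_{x+1} / f_x on x and extrapolates to x = 0, where r_0 = f_1 / f_0 gives f_0 = f_1 / r_0. The function name `ratio_regression_f0`, the log-linear form, and the use of unweighted least squares are assumptions made for this example only; it does not reproduce the paper's fractional-polynomial models or any weighting scheme, and the frequencies in the usage lines are synthetic.

```python
import math

def ratio_regression_f0(freqs):
    """Estimate the unobserved zero frequency f_0 from zero-truncated counts.

    freqs: dict mapping each count x >= 1 to its observed frequency f_x.
    Hypothetical helper: fits log r_x = a + b*x by ordinary least squares,
    where r_x = (x + 1) * f_{x+1} / f_x, then extrapolates to x = 0.
    Returns (f0_hat, N_hat) with N_hat = observed total + f0_hat.
    """
    if freqs.get(1, 0) <= 0:
        raise ValueError("a positive frequency of ones (f_1) is required")

    # Build (x, log r_x) pairs from neighboring, strictly positive frequencies.
    points = []
    for x in sorted(freqs):
        fx, fx1 = freqs[x], freqs.get(x + 1, 0)
        if fx > 0 and fx1 > 0:
            points.append((x, math.log((x + 1) * fx1 / fx)))
    if len(points) < 2:
        raise ValueError("at least two observed ratios are needed")

    # Ordinary least squares for intercept a and slope b.
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    intercept = mean_y - slope * mean_x

    # At x = 0 the fitted ratio is exp(a) = f_1 / f_0, so f_0 = f_1 / exp(a).
    f0_hat = freqs[1] / math.exp(intercept)
    n_hat = sum(freqs.values()) + f0_hat
    return f0_hat, n_hat

# Synthetic frequencies proportional to a Poisson(2) distribution:
# f_x = 1000 * 2**x / x!, so every ratio r_x equals 2.
synthetic = {x: 1000 * 2 ** x / math.factorial(x) for x in range(1, 6)}
f0_hat, n_hat = ratio_regression_f0(synthetic)
```

With these synthetic Poisson-shaped frequencies the ratios are constant, so the fitted slope is essentially zero and the extrapolated ratio at x = 0 equals the common ratio. Real data would typically call for weighting and a richer regression form, which is the setting the abstract addresses.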

Keywords: Capture-recapture; Mixed binomial distributions; Ratio regression estimator; Zero-truncated model.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Simulation
  • Data Interpretation, Statistical
  • Humans
  • Ill-Housed Persons / statistics & numerical data
  • Intestinal Neoplasms / diagnosis
  • Models, Biological*
  • Models, Statistical*
  • Population Density*
  • Regression Analysis*
  • Statistical Distributions