Nonparametric maximum likelihood estimation of population size based on the counting distribution
The paper discusses the estimation of an unknown population size n. Suppose that an identification mechanism identifies n_obs cases. The Horvitz–Thompson estimator of n divides this number by 1 − p_0, where p_0 is the probability that a case is not identified. When repeated counts of identifying the same case are available, the counting distribution can be used to estimate p_0. Frequently, the Poisson distribution is used and, more recently, mixtures of Poisson distributions. Maximum likelihood estimation is discussed by means of the EM algorithm. For truncated Poisson mixtures, a nested EM algorithm is suggested and illustrated for several application cases. The algorithmic principles are used to show an inequality: the Horvitz–Thompson estimator of n under the mixed Poisson model is always at least as large as the estimator under a homogeneous Poisson model. Consequently, if the homogeneous Poisson model is misspecified, it can underestimate the true population size, potentially strongly. Examples from various areas illustrate this finding.
Keywords: Capture–recapture; Completeness-of-disease registration; Counting distribution model; Missing species problem; Nonparametric maximum likelihood estimator; Poisson mixture model; Problem of ‘sleepers’; Zero truncation
Document Type: Research Article
Publication date: 2005-08-01
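As an illustration of the simplest case mentioned in the abstract, the following minimal sketch fits a homogeneous zero-truncated Poisson model by EM and then applies the Horvitz–Thompson adjustment. It is not the paper's nested EM algorithm for mixtures. The function names, the starting value, the stopping rule, and the frequency table are all assumptions made for this example; the frequencies are invented toy numbers, not data from the paper.

```python
import math


def fit_truncated_poisson_em(freqs, tol=1e-10, max_iter=10_000):
    """EM estimate of the Poisson rate from zero-truncated count data.

    freqs maps a count x >= 1 to the number of cases identified exactly
    x times (hypothetical input format chosen for this sketch).
    """
    n_obs = sum(freqs.values())                    # cases seen at least once
    total = sum(x * f for x, f in freqs.items())   # total identifications
    lam = total / n_obs                            # start at the truncated mean
    for _ in range(max_iter):
        # E-step: expected number of unseen cases given the current rate
        p0 = math.exp(-lam)
        n0 = n_obs * p0 / (1.0 - p0)
        # M-step: Poisson MLE on the completed sample (unseen cases add zeros)
        new_lam = total / (n_obs + n0)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam


def horvitz_thompson(n_obs, p0):
    """Population size estimate: observed cases divided by 1 - p0."""
    return n_obs / (1.0 - p0)


# Toy frequencies (illustrative only): 50 cases seen once, 30 twice, ...
toy_freqs = {1: 50, 2: 30, 3: 15, 4: 5}
lam_hat = fit_truncated_poisson_em(toy_freqs)
n_hat = horvitz_thompson(sum(toy_freqs.values()), math.exp(-lam_hat))
```

At convergence the rate satisfies lam / (1 - exp(-lam)) = mean of the observed counts, which is the likelihood equation of the zero-truncated Poisson. According to the abstract, replacing this homogeneous model with a Poisson mixture can only keep the resulting estimate of n the same or increase it.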