The University of Southampton
University of Southampton Institutional Repository

A bagging-based correction for the mixture model estimator


Kuhnert, Ronny, Del Rio Vilas, Victor Javier, Gallagher, James and Böhning, Dankmar (2008) A bagging-based correction for the mixture model estimator. Biometrical Journal, 50 (6), 993-1005. (doi:10.1002/bimj.200810485). (PMID:19089886)

Record type: Article

Abstract

Estimating the size of a population by capture-recapture techniques is an important problem in many areas of the life and social sciences. We consider the frequencies-of-frequencies situation, where a count variable summarizes how often a unit has been identified in the target population of interest. The distribution of this count variable is zero-truncated, since units with zero identifications do not occur in the sample. As an application we consider the surveillance of scrapie in Great Britain. In this case study, holdings with scrapie that are not identified (zero counts) do not enter the surveillance database. The count variable of interest is the number of scrapie cases per holding. A common model for count distributions is the Poisson distribution and, to adjust for potential heterogeneity, a discrete mixture of Poisson distributions is used. Mixtures of Poissons usually provide an excellent fit, as will be demonstrated in the application of interest. However, as has recently been demonstrated, mixtures also suffer from the so-called boundary problem, which results in overestimation of the population size. It is suggested here to select the mixture model on the basis of the Bayesian Information Criterion. This strategy is further refined by employing a bagging procedure, which leads to a series of estimates of the population size. Using the median of this series avoids highly influential size estimates. Limited simulation studies show that the procedure leads to estimates with remarkably small bias.
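
The steps named in the abstract (zero-truncated Poisson mixture, model selection by BIC, bagging, median) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the EM fitting routine, the floor `LAM_MIN` on the component means, the Horvitz-Thompson-type estimator n · Σ w_j / (1 − exp(−λ_j)), the cap of three components, the number of bootstrap resamples, and all function names are assumptions made for illustration. `TOY_FREQ` is made-up data, not the scrapie surveillance counts.

```python
import math
import random
import statistics
from collections import Counter

LAM_MIN = 1e-3  # assumed floor that keeps a component off the boundary lambda -> 0

# Made-up frequencies of frequencies {count: number of units}; NOT the scrapie data.
TOY_FREQ = {1: 84, 2: 15, 3: 7, 4: 5, 5: 2, 6: 2, 7: 1, 8: 1}


def ztp_logpmf(x, lam):
    """Log pmf of a zero-truncated Poisson at x >= 1."""
    return x * math.log(lam) - math.lgamma(x + 1) - math.log(math.expm1(lam))


def truncated_mean(lam):
    """Mean of a zero-truncated Poisson: lam / (1 - exp(-lam))."""
    return lam / -math.expm1(-lam)


def solve_lambda(mean):
    """Invert truncated_mean by bisection (it is increasing in lam)."""
    lo, hi = LAM_MIN, max(mean, LAM_MIN)
    if truncated_mean(lo) >= mean:
        return lo
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid) < mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_mixture(freq, k, iters=100):
    """EM fit of a k-component zero-truncated Poisson mixture.

    Returns (weights, lambdas, BIC).
    """
    n = sum(freq.values())
    lam0 = solve_lambda(sum(x * f for x, f in freq.items()) / n)
    lams = [max(LAM_MIN, lam0 * 2 * (j + 1) / (k + 1)) for j in range(k)]
    ws = [1.0 / k] * k
    for _ in range(iters):
        sw, sx = [0.0] * k, [0.0] * k
        for x, f in freq.items():
            dens = [w * math.exp(ztp_logpmf(x, lam)) for w, lam in zip(ws, lams)]
            total = sum(dens)
            for j in range(k):
                r = f * dens[j] / total  # E-step: expected units with count x in component j
                sw[j] += r
                sx[j] += r * x
        for j in range(k):  # M-step: update weights and component parameters
            ws[j] = sw[j] / n
            if sw[j] > 0:
                lams[j] = solve_lambda(sx[j] / sw[j])
    loglik = sum(
        f * math.log(sum(w * math.exp(ztp_logpmf(x, lam)) for w, lam in zip(ws, lams)))
        for x, f in freq.items()
    )
    return ws, lams, -2.0 * loglik + (2 * k - 1) * math.log(n)


def population_estimate(freq, max_k=3):
    """Select k by minimum BIC, then apply a Horvitz-Thompson-type estimator."""
    n = sum(freq.values())
    fits = (fit_mixture(freq, k) for k in range(1, max_k + 1))
    ws, lams, _ = min(fits, key=lambda fit: fit[2])
    return n * sum(w / -math.expm1(-lam) for w, lam in zip(ws, lams))


def bagged_estimate(freq, n_boot=50, seed=0):
    """Median of the estimates over bootstrap resamples of the observed counts."""
    rng = random.Random(seed)
    data = [x for x, f in freq.items() for _ in range(f)]
    estimates = []
    for _ in range(n_boot):
        resample = Counter(rng.choices(data, k=len(data)))
        estimates.append(population_estimate(resample))
    return statistics.median(estimates)
```

Calling `bagged_estimate(TOY_FREQ)` returns the median of the bootstrap series. In this sketch a resample whose fitted mixture has a component near `LAM_MIN` produces a very large single estimate, and taking the median rather than the mean limits the influence of such resamples.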

This record has no associated files available for download.

More information

Published date: December 2008
Organisations: Statistics, Statistical Sciences Research Institute

Identifiers

Local EPrints ID: 210487
URI: http://eprints.soton.ac.uk/id/eprint/210487
ISSN: 0323-3847
PURE UUID: cb909107-f4df-4fde-886f-c58285d1db3b
ORCID for Dankmar Böhning: orcid.org/0000-0003-0638-7106

Catalogue record

Date deposited: 09 Feb 2012 13:55
Last modified: 15 Mar 2024 03:39

Contributors

Author: Ronny Kuhnert
Author: Victor Javier Del Rio Vilas
Author: James Gallagher
Author: Dankmar Böhning
