The University of Southampton
University of Southampton Institutional Repository

On capture-recapture with validation information


Azevedo, Carla (2019) On capture-recapture with validation information. University of Southampton, Doctoral Thesis, 157pp.

Record type: Thesis (Doctoral)

Abstract

This work shows how capture-recapture modelling can be performed in the presence of a validation set: a sample that includes all the counts, in particular the zero counts that are not observed in typical capture-recapture settings. We start with the simple homogeneous case, estimating the Binomial and the Poisson distribution using the EM algorithm. A flexible non-parametric mixture model approach, fitted by means of a nested EM algorithm using validation information, was then used to allow for heterogeneity in the data through more components in the target population. The estimate of the total population size can be obtained by jointly fitting a zero-truncated distribution to the truncated data and an untruncated distribution of the same class to the untruncated data by means of the EM algorithm. Simulation studies demonstrated the value of including validation information in the modelling to estimate the total size of the population. This was also done following a ratio regression approach, which is explained in detail in this work.

To illustrate the major ideas, these methods were applied to public health scenarios related to Salmonella infection in poultry, Bowel Cancer and the transmissible diseases Brucellosis and Syphilis. A community study on the number of Heroin users in Bangkok was also considered. The main goal of the present study is to adjust for the undercount of disease/drug use occurrence in the UK farms/people during a period of study. Three models that seemed relevant for the data situation were considered for the ratio regression approach. Situations of zero-inflated counts were also discussed for the case in which the first ratio is markedly lower than the other ratios, indicating the potential presence of zero-inflation. This work also introduces simulation studies that help to understand the role of the validation sample in the estimation process, showing that we can rely more confidently on the estimate of the population size when using that additional information.
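
The joint fitting idea described in the abstract can be illustrated for the homogeneous Poisson case. The following is a minimal sketch under stated assumptions, not the thesis's own implementation: it assumes the truncated sample and the validation sample share one Poisson rate, and the function name `em_poisson_with_validation` and its parameters are hypothetical. The E-step imputes the expected number of unobserved zeros in the truncated sample, the M-step updates the rate from the completed data, and the population size is then estimated as the observed count plus the imputed zeros.

```python
import math


def em_poisson_with_validation(truncated, validation, tol=1e-10, max_iter=1000):
    """EM sketch for a common Poisson rate (hypothetical helper, assumption).

    truncated  : counts >= 1 (zeros are unobserved in this sample)
    validation : counts >= 0 (zeros are observed in this sample)
    Returns (rate estimate, population size estimate for the truncated sample).
    """
    n = len(truncated)
    m = len(validation)
    if n == 0 or any(x < 1 for x in truncated):
        raise ValueError("truncated sample must be non-empty with counts >= 1")
    if any(y < 0 for y in validation):
        raise ValueError("validation counts must be non-negative")

    total = sum(truncated) + sum(validation)
    lam = total / (n + m)  # starting value that ignores the hidden zeros

    for _ in range(max_iter):
        p0 = math.exp(-lam)
        # E-step: expected number of unobserved zeros in the truncated sample
        f0 = n * p0 / (1.0 - p0)
        # M-step: Poisson mean over completed truncated data plus validation data
        new_lam = total / (n + f0 + m)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break

    p0 = math.exp(-lam)
    population_size = n / (1.0 - p0)  # observed units plus expected hidden zeros
    return lam, population_size


if __name__ == "__main__":
    # Made-up toy counts, for illustration only
    rate, size = em_poisson_with_validation([1, 1, 2, 3, 1, 2], [0, 0, 1, 2, 0, 1])
    print(rate, size)
```

In this sketch the validation sample enters only through the M-step, where its observed zeros contribute directly to the denominator.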

Text
Carla Azevedo thesis - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: May 2019

Identifiers

Local EPrints ID: 437709
URI: http://eprints.soton.ac.uk/id/eprint/437709
PURE UUID: efa336ee-ef4d-4360-8ba8-d45e2925248b
ORCID for Dankmar Bohning: orcid.org/0000-0003-0638-7106

Catalogue record

Date deposited: 12 Feb 2020 17:32
Last modified: 17 Mar 2024 03:25

Contributors

Author: Carla Azevedo
Thesis advisor: Dankmar Bohning
