On capture-recapture with validation information
This work shows how capture-recapture modelling can be performed in the presence of a validation set: a sample that includes all the counts, in particular the zero counts that are not observed in typical capture-recapture settings. We start with the simple homogeneous case, estimating the Binomial and the Poisson distribution using the EM algorithm. To allow for heterogeneity in the data, and hence for more components in the target population, a flexible non-parametric mixture model approach was fitted by means of a nested EM algorithm that uses the validation information. The estimate of the total population size can be obtained by jointly fitting, by means of the EM algorithm, a zero-truncated distribution to the truncated data and an untruncated distribution of the same class to the untruncated data. Simulation studies demonstrated the value of including validation information in the modelling when estimating the total size of the population. Population size was also estimated following a ratio regression approach, which is explained in detail throughout this work.
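As a rough illustration of the joint fit described above, the following minimal sketch treats only the homogeneous Poisson case. It is not taken from the thesis: the function name, the toy data, and the use of a Horvitz-Thompson-type estimator for the population size are assumptions made for this example. The E-step imputes the expected number of unobserved zeros in the truncated sample, and the M-step updates the Poisson mean from the completed data together with the validation sample.

```python
import math

def em_poisson_with_validation(truncated_counts, validation_counts,
                               tol=1e-10, max_iter=1000):
    """Hypothetical helper: joint EM fit of a Poisson mean.

    truncated_counts  : counts >= 1 from the capture-recapture sample
                        (zeros are unobserved)
    validation_counts : counts >= 0 from the validation sample
                        (zeros are observed)
    Returns (lam, n_hat), where n_hat is a Horvitz-Thompson-type
    estimate of the size of the population behind the truncated sample.
    """
    n = len(truncated_counts)
    m = len(validation_counts)
    total = sum(truncated_counts) + sum(validation_counts)

    # Starting value: pooled mean that ignores the truncation.
    lam = total / (n + m)
    for _ in range(max_iter):
        p0 = math.exp(-lam)
        # E-step: expected number of missing zeros in the truncated sample.
        f0 = n * p0 / (1.0 - p0)
        # M-step: Poisson mean from completed data plus validation data.
        new_lam = total / (n + f0 + m)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break

    n_hat = n / (1.0 - math.exp(-lam))
    return lam, n_hat

# Toy data, invented for illustration only.
observed = [1, 1, 1, 2, 2, 3, 1, 1, 2, 4]
validation = [0, 0, 1, 0, 2, 1, 0, 1]
lam_hat, n_hat = em_poisson_with_validation(observed, validation)
print(lam_hat, n_hat)
```

At convergence the estimate satisfies the joint likelihood equation, so the validation sample enters the fit through both its total count and its size.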
To illustrate the major ideas, these methods were applied to public health scenarios related to Salmonella infection in poultry, Bowel Cancer, and the transmittable diseases Brucellosis and Syphilis. A community study on the number of Heroin users in Bangkok was also considered. The main goal of the present study is to adjust for the undercount of disease or drug-use occurrence among farms or people in the UK during a period of study. Three models that seemed relevant for the data situation were considered for the ratio regression approach. Situations of zero-inflated counts were also discussed for the case in which the first ratio is markedly lower than the other ratios, indicating the potential presence of zero-inflation. This work also introduces simulation studies that help to understand the role of the validation sample in the estimation process, showing that the estimate of the population size can be relied on more confidently when that additional information is used.
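The comparison of the first ratio with the other ratios can be sketched as follows. This is an illustrative assumption rather than the thesis's own code: it uses the standard Poisson-type ratio r_x = (x + 1) f_{x+1} / f_x of neighbouring frequencies, which is roughly constant in x under a homogeneous Poisson model. The function name and the toy counts are hypothetical.

```python
from collections import Counter

def ratio_table(counts):
    """Hypothetical helper: empirical ratios r_x = (x + 1) * f_{x+1} / f_x.

    counts is a list of non-negative integer counts, for example from a
    validation sample in which zero counts are observed. A ratio is
    reported only when both f_x and f_{x+1} are positive.
    """
    freq = Counter(counts)
    ratios = {}
    for x in sorted(freq):
        if freq.get(x + 1, 0) > 0:
            ratios[x] = (x + 1) * freq[x + 1] / freq[x]
    return ratios

# Toy validation counts with many zeros, invented for illustration only.
validation = [0] * 12 + [1] * 4 + [2] * 3 + [3] * 1
for x, r in ratio_table(validation).items():
    print(x, round(r, 3))
```

A first ratio r_0 that sits well below the remaining ratios is the pattern the abstract associates with potential zero-inflation.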
University of Southampton
Azevedo, Carla
May 2019
Bohning, Dankmar
Azevedo, Carla (2019) On capture-recapture with validation information. University of Southampton, Doctoral Thesis, 157pp.
Record type: Thesis (Doctoral)
Text: Carla Azevedo thesis - Version of Record
More information
Published date: May 2019
Identifiers
Local EPrints ID: 437709
URI: http://eprints.soton.ac.uk/id/eprint/437709
PURE UUID: efa336ee-ef4d-4360-8ba8-d45e2925248b
Catalogue record
Date deposited: 12 Feb 2020 17:32
Last modified: 17 Mar 2024 03:25
Contributors
Author: Carla Azevedo