Imputing unobserved values with the EM algorithm under left and right-truncation, and interval censoring for estimating the size of hidden populations
Capture–recapture techniques have been used for a considerable time to predict population size. Estimators usually rely on frequency counts for numbers of trappings; however, these may not be available for a particular problem, for example if the original data set has been lost and only a summary table is available. Here, we investigate techniques for specific examples; the motivating example is an epidemiology study by Mosley et al., which focussed on a cholera outbreak in East Pakistan. To demonstrate the wider range of the technique, we also look at a study for predicting the long-term outlook of the AIDS epidemic using information on the number of sexual partners. A new estimator is developed here which uses the EM algorithm to impute unobserved values and then uses these values in a similar way to the existing estimators. The results show that a truncated approach, mimicking the Chao lower bound approach, gives an improved estimate when population homogeneity is violated.
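To make the two ingredients named in the abstract concrete, the following is a minimal sketch and not the estimator developed in the paper. It assumes a plain zero-truncated Poisson model with complete frequency counts, whereas the paper deals with left and right truncation and interval censoring. The function names and the frequency table in the usage note are hypothetical and chosen only for illustration.

```python
import math


def chao_lower_bound(n, f1, f2):
    """Chao's lower bound: n observed units plus f1^2 / (2 * f2),
    where f1 and f2 are the numbers of units seen exactly once and twice."""
    return n + f1 ** 2 / (2.0 * f2)


def em_zero_truncated_poisson(freqs, tol=1e-10, max_iter=10000):
    """EM for a zero-truncated Poisson model (illustrative assumption).

    freqs maps a count x >= 1 to the number of units observed exactly x times.
    Returns (lam, f0, n_hat): the Poisson rate, the imputed number of
    unobserved units, and the estimated population size n + f0.
    """
    n = sum(freqs.values())                        # observed units
    total = sum(x * fx for x, fx in freqs.items())  # total number of captures
    lam = total / n                                # start from the truncated mean
    for _ in range(max_iter):
        p0 = math.exp(-lam)
        f0 = n * p0 / (1.0 - p0)       # E-step: expected number of zero counts
        new_lam = total / (n + f0)     # M-step: Poisson MLE on completed data
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    p0 = math.exp(-lam)
    f0 = n * p0 / (1.0 - p0)           # imputed zeros at the final rate
    return lam, f0, n + f0
```

With a hypothetical table such as `{1: 50, 2: 30, 3: 15, 4: 5}`, calling `em_zero_truncated_poisson` imputes the unobserved zero class, while `chao_lower_bound(100, 50, 30)` uses only the singleton and doubleton counts. Relying on the lowest counts alone is the idea the abstract describes as "mimicking the Chao lower bound approach".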
75-87
Robb, Matthew L.
4923310c-1c2d-4f37-9fe1-f7e30eb9504a
Böhning, Dankmar
1df635d4-e3dc-44d0-b61d-5fd11f6434e1
February 2011
Robb, Matthew L. and Böhning, Dankmar (2011) Imputing unobserved values with the EM algorithm under left and right-truncation, and interval censoring for estimating the size of hidden populations. Biometrical Journal, 53 (1), 75-87. (doi:10.1002/bimj.201000004).
More information
e-pub ahead of print date: 24 January 2011
Published date: February 2011
Organisations:
Statistics, Statistical Sciences Research Institute
Identifiers
Local EPrints ID: 210471
URI: http://eprints.soton.ac.uk/id/eprint/210471
ISSN: 0323-3847
PURE UUID: b972e440-ecb7-4dfa-be22-012c0847b6ed
Catalogue record
Date deposited: 09 Feb 2012 11:45
Last modified: 15 Mar 2024 03:39
Contributors
Author:
Matthew L. Robb
Author:
Dankmar Böhning