Different methods to complete datasets used for capture-recapture estimation. Estimating the number of usual residents in the Netherlands
613-627
Germitse, S.
Bakker, B.F.M.
van der Heijden, P.
1 January 2015
Germitse, S., Bakker, B.F.M. and van der Heijden, P. (2015) Different methods to complete datasets used for capture-recapture estimation. Estimating the number of usual residents in the Netherlands. Statistical Journal of the International Association for Official Statistics, 31 (4), 613-627. (doi:10.3233/SJI-150938).
Abstract
We are interested in estimating the number of usual residents in the Netherlands. Capture-recapture estimation with three registers enables us to estimate the size of the total population, of which the usual residents are a part. However, usual residence cannot be used as a covariate because it is not available in one of the registers. We approach this as a missing data problem. Of the methods available to handle missing data, this manuscript uses the Expectation Maximization (EM) algorithm and Predictive Mean Matching (PMM). The EM algorithm is often used in categorical data analysis, but PMM has the advantage of flexibility in choosing which part of the observed data is used to impute the missing data. Four scenarios are identified in which the missing data are completed via either the EM algorithm or PMM imputation. The scenarios lead to different population size estimates for usual residents; even small changes in the completed data change the estimates. In this study PMM imputation performs best in terms of flexibility and is theoretically better motivated.
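As an illustration of the PMM idea named in the abstract, the following is a minimal sketch in Python, not the authors' implementation. It assumes a single numeric predictor, a plain least-squares prediction model, and a donor pool of the k nearest observed cases; it omits the random draw of regression parameters that full PMM implementations usually include. The function name `pmm_impute` and its parameters are hypothetical.

```python
import numpy as np


def pmm_impute(x, y, k=5, seed=0):
    """Fill NaNs in y by a simplified predictive mean matching on x.

    Hypothetical helper for illustration only: one predictor, ordinary
    least squares, and no parameter uncertainty step.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    observed = ~missing

    # Least-squares fit of y on x (with intercept), observed cases only.
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design[observed], y[observed], rcond=None)
    predicted = design @ beta

    completed = y.copy()
    donor_values = y[observed]
    donor_predictions = predicted[observed]
    n_donors = min(k, donor_values.size)
    for i in np.flatnonzero(missing):
        # Donors are the observed cases whose predicted means are closest
        # to the predicted mean of the case with the missing value.
        distance = np.abs(donor_predictions - predicted[i])
        nearest = np.argsort(distance)[:n_donors]
        # Copy the observed value of one randomly chosen donor.
        completed[i] = donor_values[rng.choice(nearest)]
    return completed
```

Because each imputed value is copied from an observed donor, the completed variable only takes values that actually occur in the data, which is why the approach can be applied to a binary indicator such as usual residence. Restricting the donor pool to a chosen part of the observed data is one way to read the flexibility the abstract attributes to PMM.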
Text
sji_2015_31-4_sji-31-4-sji938_sji-31-sji938
- Version of Record
More information
Submitted date: 2015
Published date: 1 January 2015
Organisations:
Social Statistics & Demography
Identifiers
Local EPrints ID: 381205
URI: http://eprints.soton.ac.uk/id/eprint/381205
ISSN: 1874-7655
PURE UUID: 5ecb2b6f-43ca-46e1-bf89-aec3bc205385
Catalogue record
Date deposited: 28 Sep 2015 13:17
Last modified: 15 Mar 2024 03:46
Contributors
Author: S. Germitse
Author: B.F.M. Bakker
Author: P. van der Heijden