An overview of population size estimation where linking registers results in incomplete covariates, with an application to mode of transport of serious road casualties
239-263
Van Der Heijden, Peter
85157917-3b33-4683-81be-713f987fd612
Smith, Paul
a2548525-4f99-4baf-a4d0-2b216cce059c
Cruyff, Maarten
68bcfa19-3d85-4b0f-a6a4-6e148b265f19
Bakker, Bart
75cc130a-157a-4b06-a5ea-92a6457d806f
1 March 2018
Van Der Heijden, Peter, Smith, Paul, Cruyff, Maarten and Bakker, Bart (2018) An overview of population size estimation where linking registers results in incomplete covariates, with an application to mode of transport of serious road casualties. Journal of Official Statistics, 34 (1), 239-263. (doi:10.1515/jos-2018-0011).
Abstract
We consider the linkage of two or more registers in the situation where the registers do not cover the whole target population, and relevant categorical auxiliary variables (unique to one of the registers; although different variables could be present on each register) are available in addition to the usual matching variable(s). The linked registers therefore do not contain full information on either the observations (often individuals) or the variables. By treating this as a missing data problem it is possible to construct a linked data set, adjusted to estimate the part of the population missed by both registers, and containing completed covariate information for all the registers. This is achieved using an Expectation-Maximization (EM)-algorithm. We elucidate the properties of this approach where the model is appropriate and in situations corresponding with real applications in official statistics, and also where the model conditions are violated. The approach is applied to data on road accidents in the Netherlands, where the cause of the accident is denoted by the police and by the hospital. Here the cause of the accident denoted by the police is considered as missing information for the statistical units only registered by the hospital, and the other way around. The method needs to be widely applied to give a better impression of the range of problems where it can be beneficial.
Text: traffic 20170721 JOS Final Clean - Manuscript (Author's Original; restricted to repository staff only)
Text: Van der Heijden et al JOS accepted 2017 (Accepted Manuscript)
Text: An Overview of Population Size Estimation where Linking Registers Results in Incomplete Covariates, with an Application to Mode of Transport of Serious Road Casualties (Version of Record)
More information
Accepted/In Press date: 1 September 2017
e-pub ahead of print date: 1 March 2018
Published date: 1 March 2018
Identifiers
Local EPrints ID: 414761
URI: http://eprints.soton.ac.uk/id/eprint/414761
ISSN: 0282-423X
PURE UUID: 29c1528e-5bfe-4fdc-9960-8550f79a3c6c
Catalogue record
Date deposited: 10 Oct 2017 16:31
Last modified: 16 Apr 2024 04:01
Contributors
Author: Peter Van Der Heijden
Author: Paul Smith
Author: Maarten Cruyff
Author: Bart Bakker