Calibrated imputation of numerical data under linear edit restrictions
A common problem faced by statistical offices is that data may be missing from collected data sets. The typical way to overcome this problem is to impute the missing data. The problem of imputing missing data is complicated by the fact that statistical data often have to satisfy certain edit rules and that values of variables sometimes have to sum up to known totals. Standard imputation methods for numerical data as described in the literature generally do not take such edit rules and totals into account. In the paper we describe algorithms for imputation of missing numerical data that do take edit restrictions into account and that ensure that sums are calibrated to known totals. The methods sequentially impute the missing data, i.e. the variables with missing values are imputed one by one. To assess the performance of the imputation methods a simulation study is carried out as well as an evaluation study based on a real dataset.
Southampton Statistical Sciences Research Institute, University of Southampton
Pannekoek, Jeroen
Shlomo, Natalie
De Waal, Ton
23 October 2009
Pannekoek, Jeroen, Shlomo, Natalie and De Waal, Ton (2009) Calibrated imputation of numerical data under linear edit restrictions (S3RI Methodology Working Papers, M09/17). Southampton, UK: Southampton Statistical Sciences Research Institute, University of Southampton. 25pp.
Record type: Monograph (Working Paper)
Text: s3ri-workingpaper-M09-17.pdf
More information
Published date: 23 October 2009
Identifiers
Local EPrints ID: 69194
URI: http://eprints.soton.ac.uk/id/eprint/69194
PURE UUID: 8c483d23-2ca4-473c-b021-5db6c074d235
Catalogue record
Date deposited: 23 Oct 2009
Last modified: 20 Feb 2024 03:17
Contributors
Author: Jeroen Pannekoek
Author: Natalie Shlomo
Author: Ton De Waal