Protection of micro-data subject to edit constraints against statistical disclosure
Before releasing statistical outputs, data suppliers must assess whether the privacy of the statistical units is endangered and, if necessary, apply Statistical Disclosure Control (SDC) methods. SDC methods perturb, modify or summarize the data, depending on whether the data are released as micro-data or as tabular data. The goal is to choose a method that keeps disclosure risk below a tolerable threshold while ensuring statistical data of high utility and quality. In this article we first give an overview of several SDC methods for continuous and categorical micro-data. All of these methods perturb the data in some way. Changing values, however, causes fully edited records in the micro-data to fail edit constraints (i.e., logical rules or edits), resulting in data of low utility. Moreover, an inconsistent record signals that it has been perturbed for disclosure control, and attempts can then be made to unmask the data. To deal with these problems, we develop new implementation methods for the perturbation that minimize record-level edit failures as well as overall measures of information loss and utility. This is done by perturbing within control strata and imputing for failed edits, ensuring additivity constraints, and preserving totals, means and covariance matrices.
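The closing idea of the abstract (perturbing within control strata while keeping totals, means and additivity edits intact) can be illustrated with a small sketch. This is not the authors' implementation: the function name `perturb_within_strata`, the field names (`region`, `wages`, `other`, `total`), the noise scale `sd_frac` and the sample records are all hypothetical, and the approach shown (zero-sum Gaussian noise per stratum, followed by recomputing the total) is only one simple way to satisfy a single additivity edit. Other edits, such as non-negativity, are not checked here.

```python
import random
from collections import defaultdict


def perturb_within_strata(records, components, total_key, stratum_key,
                          sd_frac=0.1, seed=0):
    """Add recentred Gaussian noise to each component within each stratum.

    Assumptions (illustrative only): every record satisfies the edit
    total = sum(components) before perturbation, and sd_frac scales the
    noise relative to the within-stratum standard deviation.
    """
    rng = random.Random(seed)

    # Group record positions by control stratum.
    by_stratum = defaultdict(list)
    for idx, rec in enumerate(records):
        by_stratum[rec[stratum_key]].append(idx)

    out = [dict(rec) for rec in records]  # copy, leave the input untouched
    for idxs in by_stratum.values():
        for comp in components:
            values = [records[i][comp] for i in idxs]
            mean = sum(values) / len(values)
            sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
            noise = [rng.gauss(0.0, sd_frac * sd) for _ in idxs]
            # Subtract the average noise so the noise sums to zero in the
            # stratum; this preserves the stratum total and mean.
            shift = sum(noise) / len(noise)
            for i, e in zip(idxs, noise):
                out[i][comp] = records[i][comp] + e - shift

    # Restore the additivity edit by recomputing the total on every record.
    for rec in out:
        rec[total_key] = sum(rec[c] for c in components)
    return out


# Hypothetical micro-data: total income = wages + other income.
records = [
    {"region": "A", "wages": 30000.0, "other": 2000.0, "total": 32000.0},
    {"region": "A", "wages": 42000.0, "other": 5000.0, "total": 47000.0},
    {"region": "A", "wages": 36000.0, "other": 1000.0, "total": 37000.0},
    {"region": "B", "wages": 50000.0, "other": 8000.0, "total": 58000.0},
    {"region": "B", "wages": 58000.0, "other": 4000.0, "total": 62000.0},
]
masked = perturb_within_strata(records, ["wages", "other"], "total", "region")
```

Because the noise in each stratum sums to zero for every component, stratum totals and means of the components are unchanged up to floating-point error, and since the total is rebuilt from the perturbed components, the additivity edit holds on every record. Covariance matrices are not preserved by this simple sketch.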
Keywords: additive noise, information loss, micro-aggregation, post-randomization method, rank swapping, rounding, statistical disclosure control
Southampton Statistical Sciences Research Institute, University of Southampton
Shlomo, Natalie
de Waal, Ton
12 October 2006
Shlomo, Natalie and de Waal, Ton (2006) Protection of micro-data subject to edit constraints against statistical disclosure (S3RI Methodology Working Papers, M06/16). Southampton, UK: Southampton Statistical Sciences Research Institute, University of Southampton, 36pp.
Record type: Monograph (Working Paper)
Text
41859-01.pdf
- Author's Original
More information
Published date: 12 October 2006
Identifiers
Local EPrints ID: 41859
URI: http://eprints.soton.ac.uk/id/eprint/41859
PURE UUID: 7a4eddfb-846b-45f4-8ff8-6bd1ab4cbcd0
Catalogue record
Date deposited: 12 Oct 2006
Last modified: 20 Feb 2024 03:21
Contributors
Author:
Natalie Shlomo
Author:
Ton de Waal