Bayesian disclosure risk assessment: predicting small frequencies in contingency tables
We propose an approach for assessing the risk of individual identification in the release of categorical data. This requires the accurate calculation of predictive probabilities for those cells in a contingency table which have small sample frequencies, making the problem somewhat different from usual contingency table estimation, where interest is generally focussed on regions of high probability. Our approach is Bayesian and provides posterior predictive probabilities of identification risk. By incorporating model uncertainty into our analysis, we can provide more realistic estimates of disclosure risk for individual cell counts than are provided by methods which ignore the multivariate structure of the data set.
University of Southampton
Forster, Jonathan J.
Webb, Emily L.
28 February 2007
Forster, Jonathan J. and Webb, Emily L. (2007) Bayesian disclosure risk assessment: predicting small frequencies in contingency tables (S3RI Methodology Working Papers, M07/05). Southampton, GB. University of Southampton, 21pp.
Record type: Monograph (Working Paper)
Text
44611-01.pdf
- Author's Original
More information
Published date: 28 February 2007
Identifiers
Local EPrints ID: 44611
URI: http://eprints.soton.ac.uk/id/eprint/44611
PURE UUID: 077c4831-08ab-4748-9177-bf0179ea804c
Catalogue record
Date deposited: 28 Feb 2007
Last modified: 16 Mar 2024 02:45
Contributors
Author: Jonathan J. Forster
Author: Emily L. Webb