Bayesian disclosure risk assessment: predicting small frequencies in contingency tables
551-570
Forster, Jonathan J.
e3c534ad-fa69-42f5-b67b-11617bc84879
Webb, Emily L.
4b686dfe-80d6-4074-ba1b-2adacd253b8b
November 2007
Forster, Jonathan J. and Webb, Emily L. (2007) Bayesian disclosure risk assessment: predicting small frequencies in contingency tables. Journal of the Royal Statistical Society: Series C (Applied Statistics), 56 (5), 551-570. (doi:10.1111/j.1467-9876.2007.00591.x).
Abstract
We propose an approach for assessing the risk of individual identification in the release of categorical data. This requires the accurate calculation of predictive probabilities for those cells in a contingency table which have small sample frequencies, making the problem somewhat different from usual contingency table estimation, where interest is generally focused on regions of high probability. Our approach is Bayesian and provides posterior predictive probabilities of identification risk. By incorporating model uncertainty in our analysis, we can provide more realistic estimates of disclosure risk for individual cell counts than are provided by methods which ignore the multivariate structure of the data set.
More information
Published date: November 2007
Organisations: Statistics, Southampton Statistical Research Inst.
Identifiers
Local EPrints ID: 46339
URI: http://eprints.soton.ac.uk/id/eprint/46339
ISSN: 0035-9254
PURE UUID: ebd3f836-9251-4c6f-93ec-62f4d8928563
Catalogue record
Date deposited: 19 Jun 2007
Last modified: 16 Mar 2024 02:45
Contributors
Author: Jonathan J. Forster
Author: Emily L. Webb