University of Southampton Institutional Repository

Assessing identification risk in survey microdata using log-linear models

Skinner, Chris and Shlomo, Natalie (2006) Assessing identification risk in survey microdata using log-linear models. Southampton, UK: University of Southampton, Southampton Statistical Sciences Research Institute, 36pp. (S3RI Methodology Working Papers, M06/14).

Record type: Monograph (Working Paper)

Abstract

This article considers the assessment of the risk of identification of respondents in survey microdata, in the context of applications at the United Kingdom (UK) Office for National Statistics (ONS). The threat comes from the matching of categorical 'key' variables between microdata records and external data sources and from the use of log-linear models to facilitate matching. While the potential use of such statistical models is well established in the literature, little consideration has been given to model specification or to the sensitivity of risk assessment to this specification. In this article we develop new criteria for assessing the specification of a log-linear model in relation to the accuracy of risk estimates. We find that, within a class of 'reasonable' models, risk estimates tend to decrease as the complexity of the model increases. We develop criteria to detect 'underfitting' (associated with overestimation of the risk). The criteria may also reveal 'overfitting' (associated with underestimation), although less clearly, so we suggest employing a forward model selection approach. We show how our approach may be used for both file-level and record-level measures of risk. We evaluate the proposed procedures using samples drawn from the 2001 UK Census, where the true risks can be determined. We also apply our approach to a large survey dataset.
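
As a rough illustration of how a log-linear model can feed record-level and file-level risk measures, the sketch below fits the simplest such model (main effects only, i.e. independence) to a two-way table of sample counts and derives risk estimates for sample-unique records. It is not taken from the paper: the Poisson assumptions (population cell counts are Poisson and the sample is drawn with a known, equal sampling fraction), the two file-level measures (expected number of sample uniques that are population unique, and expected number of correct matches), and all function and variable names are assumptions made for this example.

```python
import math
from collections import Counter


def independence_fit(sample_counts):
    """Fitted cell means under the main-effects (independence) log-linear
    model for a two-way table: mu_ab = n_a+ * n_+b / n."""
    n = sum(sample_counts.values())
    row, col = Counter(), Counter()
    for (a, b), count in sample_counts.items():
        row[a] += count
        col[b] += count
    return {(a, b): row[a] * col[b] / n for a in row for b in col}


def estimated_risk(sample_keys, sampling_fraction):
    """Return (tau1, tau2, record_risk) for the sample-unique cells.

    Assumption (not from the source): population counts F_k are Poisson and
    the sample is a Bernoulli draw with the given sampling fraction, so the
    unsampled count F_k - f_k is Poisson with mean mu_k * (1 - pi) / pi,
    where mu_k is the fitted sample mean of cell k.
    """
    if not 0 < sampling_fraction <= 1:
        raise ValueError("sampling_fraction must be in (0, 1]")
    f = Counter(sample_keys)
    mu = independence_fit(f)
    tau1 = 0.0  # expected number of sample uniques that are population unique
    tau2 = 0.0  # expected number of correct matches for sample uniques
    record_risk = {}
    for cell, count in f.items():
        if count != 1:
            continue  # only sample-unique cells contribute
        rest = mu[cell] * (1 - sampling_fraction) / sampling_fraction
        p_unique = math.exp(-rest)  # P(F_k = 1 | f_k = 1)
        # E(1 / F_k | f_k = 1); equals 1 when nothing is left unsampled
        match_prob = (1 - math.exp(-rest)) / rest if rest > 0 else 1.0
        record_risk[cell] = match_prob
        tau1 += p_unique
        tau2 += match_prob
    return tau1, tau2, record_risk


# Hypothetical sample with two key variables (sex, age group), 50% sampling.
sample = [("M", "young"), ("M", "young"), ("M", "old"), ("F", "young")]
tau1, tau2, record_risk = estimated_risk(sample, sampling_fraction=0.5)
```

A richer model (for example, one adding interaction terms) would change the fitted means and hence the risk estimates, which is the sensitivity to model specification that the abstract describes.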

PDF 41842-01.pdf - Author's Original
Download (662kB)

More information

Published date: 6 October 2006
Keywords: confidentiality, disclosure, key variable, matching, model specification

Identifiers

Local EPrints ID: 41842
URI: http://eprints.soton.ac.uk/id/eprint/41842
PURE UUID: 496fbaa8-5594-489d-b45c-02b83fb2bf37

Catalogue record

Date deposited: 06 Oct 2006
Last modified: 17 Jul 2017 15:27

Contributors

Author: Chris Skinner
Author: Natalie Shlomo
