The University of Southampton
University of Southampton Institutional Repository

Zone design for statistical disclosure control in administrative and linked microdata

Robards, James
4c79fa72-e722-4a2a-a289-1d2bad2c2343
Martin, David
e5c52473-e9f0-4f09-b64c-fa32194b162f
Gale, Chris
5e6578ce-b9cf-4173-aad8-4c5cbd6c3696

Robards, James, Martin, David and Gale, Chris (2016) Zone design for statistical disclosure control in administrative and linked microdata. At 2016 British Society for Population Studies Conference, United Kingdom. 12-14 Sep 2016.

Record type: Conference or Workshop Item (Other)

Abstract

The increase in spatially-referenced administrative and linked datasets presents growing challenges for statistical disclosure control. Such new forms of data typically combine detailed attributes with large data volumes, which increases the risk that individuals can be identified and information about them disclosed. Detailed spatial information may be important to the researcher but also increases that risk. This paper is concerned with the application of automated zone design tools to protect record-level datasets in a way that might be implemented by a data provider. Implementation could facilitate the release of richer data to researchers, preserving small-area geographical associations while not revealing actual locations. Using a synthetic microdataset of individual records with locality-level (MSOA) geography codes for England and Wales (variables: age, gender, economic activity, marital status, occupation, number of hours worked and general health), we synthesize address-level locations with reference to 2011 Census headcount data. These synthetic locations are then associated with a range of spatial measures and indicators (e.g. distance to GP). Implementation of the AZTool zone design software enables a bespoke, non-disclosive zone design solution, providing area codes that can be added to the research data without revealing actual locations to the researcher. Results will describe the spatial characteristics of the new synthetic dataset (which may have broader utility) and show how disclosure risk and utility change when records are coded to spatial units at different scales and aggregations. Because the dataset is synthetic, the approach can be demonstrated for a variety of linked and administrative data without any disclosure risk.
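The abstract's point that disclosure risk changes with the scale of the spatial units can be illustrated with a small sketch. It is not the authors' method and does not use AZTool. It assumes one simple risk measure, the share of records that are unique on their zone code plus a set of key attributes. The function name `unique_share`, the field names `fine` and `coarse`, and the six records are all hypothetical.

```python
from collections import Counter


def unique_share(records, zone_field, key_fields):
    """Share of records whose (zone, key attributes) combination is unique.

    A unique combination is treated here as a simple proxy for
    disclosure risk; this is an illustrative assumption only.
    """
    if not records:
        return 0.0
    keys = [(r[zone_field],) + tuple(r[f] for f in key_fields) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(records)


# Hypothetical synthetic records, each coded to a fine and a coarse zone.
records = [
    {"fine": "A1", "coarse": "A", "age": 34, "gender": "F"},
    {"fine": "A2", "coarse": "A", "age": 34, "gender": "F"},
    {"fine": "A1", "coarse": "A", "age": 51, "gender": "M"},
    {"fine": "B1", "coarse": "B", "age": 51, "gender": "M"},
    {"fine": "B1", "coarse": "B", "age": 51, "gender": "M"},
    {"fine": "B2", "coarse": "B", "age": 27, "gender": "F"},
]

# Compare the proxy risk when the same records are coded at two scales.
fine_risk = unique_share(records, "fine", ["age", "gender"])
coarse_risk = unique_share(records, "coarse", ["age", "gender"])
print(f"fine zones: {fine_risk:.2f}, coarse zones: {coarse_risk:.2f}")
```

In this toy example, coding to the coarser zones leaves fewer records unique, at the cost of less geographical detail. That is the risk and utility trade-off the abstract refers to.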

Full text not available from this repository.

More information

Submitted date: 11 April 2016
e-pub ahead of print date: 14 September 2016
Venue - Dates: 2016 British Society for Population Studies Conference, United Kingdom, 2016-09-12 - 2016-09-14
Organisations: Social Statistics & Demography

Identifiers

Local EPrints ID: 403266
URI: https://eprints.soton.ac.uk/id/eprint/403266
PURE UUID: b35a13a7-809c-4d3d-8a15-5dda9559b97c
ORCID for James Robards: orcid.org/0000-0003-4784-5679
ORCID for David Martin: orcid.org/0000-0003-0397-0769

Catalogue record

Date deposited: 29 Nov 2016 13:43
Last modified: 17 Jul 2017 17:43

Contributors

Author: James Robards
Author: David Martin
Author: Chris Gale
