University of Southampton Institutional Repository

Secure big data collection and processing: framework, means and opportunities

Zhang, Li-Chun and Haraldsen, Gustav (2022) Secure big data collection and processing: framework, means and opportunities. Journal of the Royal Statistical Society: Series A (Statistics in Society), 185 (4), 1541–1559. (doi:10.1111/rssa.12836).

Record type: Article

Abstract

Statistical disclosure control is important for the dissemination of statistical outputs. There is an increasing need for greater confidentiality protection during data collection and processing by National Statistical Offices. In particular, various transactions and remote sensing signals are examples of useful but very detailed big data that can be highly sensitive. Moreover, possible conflicts of interest may arise for data suppliers who operate commercially. In this paper, we formulate statistical disclosure control for data collection and processing as an optimisation problem. Even when it is difficult to specify and solve the problem unequivocally, the formulation can still provide the basis for comparing different disclosure control methods. We develop a general compartmented system that adapts and implements non-perturbative methods in the related fields of linking sensitive data and secure computation. We illustrate how the system can be configured to yield variously required tables and microdata sets with sufficiently low disclosure risks.

Text
secureCollectionProcessing-ZhangHaraldsen-Accepted - Accepted Manuscript

More information

Accepted/In Press date: 14 February 2022
e-pub ahead of print date: 25 March 2022
Published date: 1 October 2022
Keywords: Non-survey big data, statistical disclosure control, confidentiality protection, trusted execution environment

Identifiers

Local EPrints ID: 454993
URI: http://eprints.soton.ac.uk/id/eprint/454993
ISSN: 0964-1998
PURE UUID: ee70669c-61b9-4ddc-b659-d6f6be729846
ORCID for Li-Chun Zhang: orcid.org/0000-0002-3944-9484

Catalogue record

Date deposited: 03 Mar 2022 17:37
Last modified: 17 Mar 2024 07:08

Contributors

Author: Li-Chun Zhang
Author: Gustav Haraldsen
