University of Southampton Institutional Repository

Pseudonymization risk analysis in distributed systems


Neumann, Geoffrey, Grace, Paul, Burns, Daniel and Surridge, Michael (2019) Pseudonymization risk analysis in distributed systems. Journal of Internet Services and Applications, 10 (1). (doi:10.1186/s13174-018-0098-z).

Record type: Article

Abstract

In an era of big data, online services are becoming increasingly data-centric; they collect, process, analyze and anonymously disclose growing amounts of personal data in the form of pseudonymized data sets. It is crucial that such systems are engineered both to protect individual user (data subject) privacy and to give control of personal data back to the user. For pseudonymized data, this means that unwanted individuals should not be able to deduce sensitive information about the user. However, the plethora of pseudonymization algorithms and tunable parameters that currently exist makes it difficult for a non-expert developer (data controller) to understand and realize strong privacy guarantees. In this paper we propose a principled Model-Driven Engineering (MDE) framework to model data services in terms of their pseudonymization strategies and to identify the risks of breaches of user privacy. A developer can explore alternative pseudonymization strategies and determine their effectiveness in terms of quantifiable metrics: (i) violations of privacy requirements for every user in the current data set; and (ii) the trade-off between conforming to these requirements and the usefulness of the data for its intended purposes. We demonstrate through an experimental evaluation that the information provided by the framework is useful, particularly in complex situations where privacy requirements differ between users, and can inform decisions to optimize a chosen strategy, as opposed to simply applying an off-the-shelf algorithm.
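The two metrics named in the abstract can be made concrete with a small illustrative sketch. This is not the paper's implementation: it assumes k-anonymity-style requirements that may differ per user, and a crude record-retention measure of utility; every function, field and data value below is hypothetical.

# Illustrative sketch only (not the framework from the paper): assumes
# k-anonymity-style per-user requirements and counts retained records as
# a utility proxy. All names (fields, functions, data) are hypothetical.
from collections import Counter

def privacy_violations(records, quasi_identifiers, required_k):
    # Metric (i): users whose individual k-requirement is not met, i.e.
    # their quasi-identifier combination appears in fewer than
    # required_k[user] records of the pseudonymized data set.
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return [r["user_id"] for r in records
            if counts[tuple(r[q] for q in quasi_identifiers)] < required_k[r["user_id"]]]

def utility(original_records, released_records):
    # Metric (ii), crudely: fraction of records that survive suppression.
    return len(released_records) / len(original_records) if original_records else 0.0

# Toy pseudonymized data set: generalized age band and postcode prefix act
# as quasi-identifiers; "condition" is the sensitive attribute.
data = [
    {"user_id": "u1", "age": "30-39", "postcode": "SO17", "condition": "asthma"},
    {"user_id": "u2", "age": "30-39", "postcode": "SO17", "condition": "diabetes"},
    {"user_id": "u3", "age": "40-49", "postcode": "SO16", "condition": "asthma"},
]
per_user_k = {"u1": 2, "u2": 2, "u3": 2}  # requirements may differ per user

print(privacy_violations(data, ["age", "postcode"], per_user_k))  # ['u3'] is at risk
print(utility(data, data[:2]))  # ~0.67 if u3's record is suppressed

A framework such as the one described would let a developer compare alternative strategies (e.g. coarser age bands versus suppression) by recomputing both metrics for each candidate.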

Text
jisa_iti_2018_v2 - Accepted Manuscript
Available under License Creative Commons Attribution.
Download (720kB)

More information

Accepted/In Press date: 24 October 2018
e-pub ahead of print date: 8 January 2019
Keywords: Privacy, Risk Analysis, pseudonymisation

Identifiers

Local EPrints ID: 425740
URI: http://eprints.soton.ac.uk/id/eprint/425740
ISSN: 1867-4828
PURE UUID: fd49fdbe-f89c-4121-8277-54e488097ddd
ORCID for Paul Grace: orcid.org/0000-0003-2363-0630
ORCID for Daniel Burns: orcid.org/0000-0001-6976-1068
ORCID for Michael Surridge: orcid.org/0000-0003-1485-7024

Catalogue record

Date deposited: 02 Nov 2018 17:30
Last modified: 26 Aug 2024 01:32

Contributors

Author: Geoffrey Neumann
Author: Paul Grace
Author: Daniel Burns
Author: Michael Surridge
