The University of Southampton
University of Southampton Institutional Repository

Kernel density estimation under masking of geolocations with applications to DHS data

2026/3
Freie Universität Berlin
Gril, Lorena
Hossain, Md Jamal
Tzavidis, Nikos
Rendtel, Ulrich

Gril, Lorena, Hossain, Md Jamal, Tzavidis, Nikos and Rendtel, Ulrich (2026) Kernel density estimation under masking of geolocations with applications to DHS data. Freie Universität Berlin, 37pp. (doi:10.17169/refubium-51278).

Record type: Monograph (Discussion Paper)

Abstract

Geocoordinates offer valuable insight into spatial patterns of economic, demographic and health outcomes. However, disclosing the exact geolocation of statistical units to secondary analysts contravenes the responsible use of data, so anonymisation methods are used to protect privacy. A commonly applied method is the one used by the Demographic and Health Surveys (DHS), which first aggregates data within small spatial units and then applies a random (donut) displacement to the geocoordinates. Secondary analysts may reasonably be concerned about how anonymisation affects their analyses. In this paper, the DHS anonymisation scheme serves as the basis for studying how anonymisation affects kernel density estimation. We propose a methodology that accounts for this effect by deriving the distribution of the true coordinates given the observed (anonymised) coordinates. Density estimation is then implemented using this theoretical distribution and an iterative algorithm that accounts for both aggregation and displacement. The aim is to approximate the original population density using generated pseudo-coordinates, under the assumption that the anonymisation process is known. The method is illustrated with DHS data from the Rajshahi Division in Bangladesh, where it is used to estimate the density of households below the poverty line. The results show that accounting for the measurement error introduced by anonymisation gives a more accurate picture of the spatial distribution of poverty.
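For readers unfamiliar with the two-step masking the abstract describes, the following is a minimal sketch of aggregation followed by donut displacement. It is an illustration only and not the authors' method or code: the function name `anonymise`, the radii `r_min` and `r_max`, the uniform draw of the displacement distance, the planar coordinates in kilometres, and the simulated data are all assumptions made for this example. It does not attempt the paper's correction algorithm.

```python
import numpy as np

def anonymise(coords, cluster_ids, r_min, r_max, rng):
    """Aggregate points to cluster centroids, then displace each centroid.

    Hypothetical helper: radii and the uniform distance draw are assumptions.
    """
    coords = np.asarray(coords, dtype=float)
    labels, inverse = np.unique(np.asarray(cluster_ids), return_inverse=True)
    inverse = inverse.ravel()
    # Step 1 (aggregation): compute the centroid of every cluster.
    counts = np.bincount(inverse)
    centroids = np.column_stack(
        [np.bincount(inverse, weights=coords[:, k]) / counts for k in range(2)]
    )
    # Step 2 (donut displacement): random direction, distance in [r_min, r_max).
    angle = rng.uniform(0.0, 2.0 * np.pi, size=len(labels))
    dist = rng.uniform(r_min, r_max, size=len(labels))
    offset = np.column_stack([dist * np.cos(angle), dist * np.sin(angle)])
    return centroids, centroids + offset, inverse

rng = np.random.default_rng(0)
true_xy = rng.normal(size=(200, 2)) * 10.0   # simulated household locations (km)
clusters = rng.integers(0, 20, size=200)     # simulated cluster labels
centroids, displaced, inverse = anonymise(true_xy, clusters, 1.0, 5.0, rng)

# Every household is released with its cluster's displaced point.
observed_xy = displaced[inverse]
# Distance between each centroid and its displaced position.
shift = np.linalg.norm(displaced - centroids, axis=1)
```

In this sketch the analyst would see only `observed_xy`, so a kernel density estimate computed directly from it reflects both the collapse of households onto cluster points and the random shift; this is the measurement error the abstract refers to.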

Text
discpaper2026_3 - Version of Record
Available under License Creative Commons Attribution.
Download (19MB)

More information

Published date: 13 February 2026

Identifiers

Local EPrints ID: 511534
URI: http://eprints.soton.ac.uk/id/eprint/511534
PURE UUID: 5b835728-a4e4-4e98-aa05-2b7e58011ae5
ORCID for Md Jamal Hossain: orcid.org/0000-0002-2728-1055
ORCID for Nikos Tzavidis: orcid.org/0000-0002-8413-8095

Catalogue record

Date deposited: 19 May 2026 16:45
Last modified: 21 May 2026 02:06

Contributors

Author: Lorena Gril
Author: Md Jamal Hossain
Author: Nikos Tzavidis
Author: Ulrich Rendtel
