The University of Southampton
University of Southampton Institutional Repository

On bandwidth choice for spatial data density estimation


Jiang, Zhenyu, Ling, Nengxiang, Lu, Zudi, Tjøstheim, Dag and Zhang, Qiang (2020) On bandwidth choice for spatial data density estimation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 82 (3), 817-840. (doi:10.1111/rssb.12367).

Record type: Article

Abstract

Bandwidth choice is crucial in spatial kernel estimation for exploring non-Gaussian complex spatial data. This paper investigates the choice of adaptive and non-adaptive bandwidths for density estimation given data on a spatial lattice. An adaptive bandwidth depends on local data and hence conforms to local features of the spatial data. We propose a spatial cross-validation (SCV) choice of a global bandwidth. This is done first with a pilot density involved in the expression for the adaptive bandwidth. The optimality of the procedure is established, and it is shown that a non-adaptive bandwidth choice comes out as a special case. Although the CV idea has been popular for choosing a non-adaptive bandwidth in data-driven smoothing of independent and time series data, its theory and application have not been much investigated for spatial data. For the adaptive case, there is little theory even for independent data. Conditions that ensure asymptotic optimality of the SCV-selected bandwidth are derived, which also extend the optimality results for time series and independent data. Further, for the adaptive bandwidth with an estimated pilot density, oracle properties of the resultant density estimator are obtained asymptotically, as if the true pilot were known. Numerical simulations show that the SCV adaptive bandwidth choice performs rather well in finite samples. It outperforms existing R routines such as the 'rule of thumb' and the so-called 'second-generation' Sheather-Jones bandwidths for moderate and big data. An empirical application to a set of spatial soil data is further implemented, with non-Gaussian features significantly identified.
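
For readers unfamiliar with cross-validated bandwidth choice, the following is a minimal sketch of the classical least-squares cross-validation criterion for a fixed (non-adaptive) bandwidth with a Gaussian kernel. It is not the SCV procedure of the paper: it treats the observations as independent, ignores the lattice structure, and uses no pilot density. The function names (`gauss`, `lscv_score`, `select_bandwidth`), the simulated sample, and the bandwidth grid are all illustrative assumptions.

```python
import numpy as np

def gauss(u, s):
    """Density of N(0, s^2) evaluated at u."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def lscv_score(x, h):
    """Least-squares CV score: integral of fhat^2 minus twice the
    mean leave-one-out density at the observations (i.i.d. setting)."""
    n = x.size
    d = x[:, None] - x[None, :]          # pairwise differences
    # For Gaussian kernels the integral of fhat^2 has a closed form:
    # the convolution of two N(0, h^2) kernels is N(0, 2 h^2).
    int_f2 = gauss(d, h * np.sqrt(2.0)).sum() / n**2
    k = gauss(d, h)
    # Leave-one-out estimate at x[i]: drop the diagonal term K_h(0).
    loo = (k.sum(axis=1) - gauss(0.0, h)) / (n - 1)
    return int_f2 - 2.0 * loo.mean()

def select_bandwidth(x, grid):
    """Return the grid value minimising the CV score, plus all scores."""
    scores = np.array([lscv_score(x, h) for h in grid])
    return grid[int(np.argmin(scores))], scores

# Illustrative data only: a simulated i.i.d. standard normal sample.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
grid = np.linspace(0.05, 1.5, 30)
h_cv, scores = select_bandwidth(x, grid)
```

A grid search is used here for transparency; a numerical optimiser could replace it. An adaptive version would let the bandwidth at each observation depend on a pilot density estimate, with the global scale chosen by a criterion of this kind.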

Text
JLLcv-22 - Accepted Manuscript

More information

Accepted/In Press date: 12 February 2020
e-pub ahead of print date: 21 April 2020
Published date: 1 July 2020
Keywords: Cross-validation, Kernel density estimation, Optimal bandwidth, Spatial lattice data, Spatially adaptive bandwidth choice

Identifiers

Local EPrints ID: 438434
URI: http://eprints.soton.ac.uk/id/eprint/438434
ISSN: 1467-9868
PURE UUID: b6e0fdf9-1085-405d-83c1-818ef56fdd5f
ORCID for Zudi Lu: orcid.org/0000-0003-0893-832X

Catalogue record

Date deposited: 10 Mar 2020 17:30
Last modified: 17 Mar 2024 05:23

Contributors

Author: Zhenyu Jiang
Author: Nengxiang Ling
Author: Zudi Lu
Author: Dag Tjøstheim
Author: Qiang Zhang
