The University of Southampton
University of Southampton Institutional Repository

Should data be partitioned spatially before building large-scale distribution models?


Osborne, P.E. and Suarez-Seoane, S. (2002) Should data be partitioned spatially before building large-scale distribution models? Ecological Modelling, 157 (2), 249-259. (doi:10.1016/S0304-3800(02)00198-9).

Record type: Article

Abstract

There is growing interest in building predictive models of species distributions over large geographic areas. As larger areas are modelled, however, it is highly likely that heterogeneity in the predictor variables increases and that areas are included where animals respond to habitats in different ways, for example, due to social status. These effects (spatial non-stationarity) may weaken model performance. This paper explores whether data partitioning prior to analysis can improve the fit of models and provide ecological insight into distribution patterns. Data on three bird species were modelled for the whole of Spain at 1 km² resolution using logistic regression analysis. Data were partitioned into geographic quarters, into concentric rings around the centroid of the distribution, and into random samples for comparison. In all cases, data partitioning produced better models, as assessed by area under the Receiver Operating Characteristic curve (AUC) statistics, than analysis of the global data set. Inclusion of latitude and longitude improved the global models only when added as smoothed splines, but produced different probabilities from the partitioned data. Geographic partitioning is a very crude local modelling approach, and we suggest that some form of geographically-weighted regression could offer the best solution to large-scale modelling but is computationally intensive on Geographical Information Systems (GIS) data. It is concluded that simple partitioning by geographic quarters may detect spatial non-stationarity and alert the modeller to possible problems; that partitioning into more novel arrangements may be used to test ecological hypotheses; and that data should not be partitioned spatially to build and test models if non-stationarity is suspected.
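The partitioning schemes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the split of quarters at the centroid, the equal-width rings, and all function names (`quarter_label`, `ring_label`, `partition`, `auc`) are assumptions made here for illustration. The `auc` helper uses the standard rank interpretation of AUC, the probability that a randomly chosen presence record scores higher than a randomly chosen absence record.

```python
import math
from collections import defaultdict


def centroid(points):
    """Mean x and y of a list of (x, y) coordinates."""
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n


def quarter_label(x, y, cx, cy):
    """Assumed scheme: split into NW, NE, SW, SE at the centroid (cx, cy)."""
    return ("N" if y >= cy else "S") + ("E" if x >= cx else "W")


def ring_label(x, y, cx, cy, ring_width):
    """Assumed scheme: equal-width concentric rings, numbered outward from 0."""
    return int(math.hypot(x - cx, y - cy) // ring_width)


def partition(points, label_fn):
    """Group point indices by the label that label_fn assigns to each point."""
    groups = defaultdict(list)
    for i, (x, y) in enumerate(points):
        groups[label_fn(x, y)].append(i)
    return dict(groups)


def auc(scores, labels):
    """Rank-based AUC: P(presence score > absence score), ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical coordinates (km) used only to exercise the functions.
points = [(0, 0), (10, 0), (0, 10), (10, 10)]
cx, cy = centroid(points)
quarters = partition(points, lambda x, y: quarter_label(x, y, cx, cy))
rings = partition(points, lambda x, y: ring_label(x, y, cx, cy, ring_width=5))
```

In a comparison of the kind the abstract describes, a separate logistic regression would be fitted to the records in each group, and the resulting AUC values compared with that of a single model fitted to the whole data set.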

This record has no associated files available for download.

More information

Published date: 2002
Keywords: distribution models, spatial non-stationarity, logistic regression, spatial heterogeneity, birds, Spain

Identifiers

Local EPrints ID: 39469
URI: http://eprints.soton.ac.uk/id/eprint/39469
ISSN: 0304-3800
PURE UUID: 2dfb88b3-4b1e-4b83-9fc2-0c7c02873330
ORCID for P.E. Osborne: orcid.org/0000-0001-8919-5710

Catalogue record

Date deposited: 28 Jun 2006
Last modified: 16 Mar 2024 03:42

Contributors

Author: P.E. Osborne
Author: S. Suarez-Seoane
