Should data be partitioned spatially before building large-scale distribution models?
There is growing interest in building predictive models of species distributions over large geographic areas. As larger areas are modelled, however, it is highly likely that heterogeneity in the predictor variables increases and that areas are included where animals respond to habitats in different ways, for example, due to social status. These effects (spatial non-stationarity) may weaken model performance. This paper explores whether data partitioning prior to analysis can improve the fit of models and provide ecological insight into distribution patterns. Data on three bird species were modelled for the whole of Spain at 1 km² resolution using logistic regression analysis. Data were partitioned into geographic quarters, concentric rings around the centroid of the distribution, and into random samples for comparison. In all cases, data partitioning produced better models, as assessed by area under the Receiver Operating Characteristic curve (AUC) statistics, than analysis of the global data set. Inclusion of latitude and longitude improved the global models only when added as smoothed splines, but produced different probabilities from the partitioned data. Geographic partitioning is a very crude local modelling approach and we suggest that some form of geographically-weighted regression could offer the best solution to large-scale modelling, but is computationally intensive on Geographical Information Systems (GIS) data. It is concluded that simple partitioning by geographic quarters may detect spatial non-stationarity and alert the modeller to possible problems; that partitioning into more novel arrangements may be used to test ecological hypotheses; and that data should not be partitioned spatially to build and test models if non-stationarity is suspected.
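The partitioning schemes and the AUC statistic mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`centroid`, `quarter_label`, `ring_label`, `auc`), the choice to split quarters at the centroid, and the use of equal-width rings are all assumptions made for illustration, since the record does not say how the quarters or rings were defined.

```python
import math
from statistics import mean


def centroid(points):
    """Mean x and mean y of a list of (x, y) coordinates."""
    return mean(p[0] for p in points), mean(p[1] for p in points)


def quarter_label(point, centre):
    """Label a point NE, NW, SE or SW relative to the centre.

    Assumption: quarters are split at the centroid of the data.
    """
    ns = "N" if point[1] >= centre[1] else "S"
    ew = "E" if point[0] >= centre[0] else "W"
    return ns + ew


def ring_label(point, centre, ring_width):
    """Index of the concentric ring (0 = innermost) containing the point.

    Assumption: rings have equal width, in the same units as the coordinates.
    """
    return int(math.dist(point, centre) // ring_width)


def auc(labels, scores):
    """Area under the ROC curve for presence (1) / absence (0) labels.

    Computed as the proportion of presence-absence pairs in which the
    presence has the higher predicted score; ties count as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one presence and one absence")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical coordinates and predicted probabilities, for illustration only.
points = [(0, 0), (10, 0), (0, 10), (10, 10)]
centre = centroid(points)
quarters = [quarter_label(p, centre) for p in points]
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In a comparison of the kind the abstract describes, a separate model would be fitted within each partition and its AUC set against that of a single model fitted to the whole data set.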
distribution models, spatial non-stationarity, logistic regression, spatial heterogeneity, birds, Spain
249-259
Osborne, P.E.
c4d4261d-557c-4179-a24e-cdd7a98fb2b8
Suarez-Seoane, S.
a712fb65-42d9-4fa8-a163-9bdcfa4fd052
2002
Osborne, P.E. and Suarez-Seoane, S.
(2002)
Should data be partitioned spatially before building large-scale distribution models?
Ecological Modelling, 157 (2), 249-259.
(doi:10.1016/S0304-3800(02)00198-9).
More information
Published date: 2002
Identifiers
Local EPrints ID: 39469
URI: http://eprints.soton.ac.uk/id/eprint/39469
ISSN: 0304-3800
PURE UUID: 2dfb88b3-4b1e-4b83-9fc2-0c7c02873330
Catalogue record
Date deposited: 28 Jun 2006
Last modified: 16 Mar 2024 03:42
Contributors
Author:
S. Suarez-Seoane