The University of Southampton
University of Southampton Institutional Repository

Disaggregating census data for population mapping using a Bayesian Additive Regression Tree model

Yankey, Ortis, Utazi, Chigozie E., Nnanatu, Christopher C., Gadiaga, Assane N., Abbot, Thomas, Lazar, Attila N. and Tatem, Andrew J. (2024) Disaggregating census data for population mapping using a Bayesian Additive Regression Tree model. Applied Geography, 172, [103416]. (doi:10.1016/j.apgeog.2024.103416).

Record type: Article

Abstract

Population data is crucial for policy decisions, but fine-scale population numbers are often lacking due to the challenge of sharing sensitive data. Different approaches, such as the Random Forest (RF) model, have been used to disaggregate census data from higher administrative units to small area scales. A major limitation of the RF model is its inability to quantify the uncertainties associated with the predicted populations, which can be important for policy decisions. In this study, we applied a Bayesian Additive Regression Tree (BART) model for population disaggregation and compared its results with those of an RF model using both simulated data and the 2021 census data for Ghana. The BART model consistently outperforms the RF model in out-of-sample predictions on all metrics considered, including bias, mean squared error (MSE), and root mean squared error (RMSE). The BART model also addresses this limitation of the RF model by providing uncertainty estimates around the predicted populations. Overall, the study demonstrates the superiority of the BART model over the RF model in disaggregating population data and highlights its potential for gridded population estimates.
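
As a rough illustration of the ideas summarised in the abstract, the following Python sketch redistributes one administrative-unit total across grid cells in proportion to model-derived weights and then computes bias, MSE, and RMSE. It is not the authors' implementation: the function names, the toy numbers, the definition of bias as the mean of predicted minus observed, and the use of gamma-distributed draws as a stand-in for BART posterior samples are all assumptions made for this example.

```python
import numpy as np

def disaggregate(unit_total, weights):
    """Split one unit total across cells in proportion to non-negative weights."""
    weights = np.asarray(weights, dtype=float)
    return unit_total * weights / weights.sum()

def disaggregate_draws(unit_total, weight_draws):
    """Apply the same split to each row of an (n_draws, n_cells) weight array."""
    draws = np.asarray(weight_draws, dtype=float)
    return unit_total * draws / draws.sum(axis=1, keepdims=True)

def error_metrics(observed, predicted):
    """Bias (assumed here: mean of predicted - observed), MSE, and RMSE."""
    residuals = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    mse = (residuals ** 2).mean()
    return {"bias": residuals.mean(), "mse": mse, "rmse": np.sqrt(mse)}

# Hypothetical inputs: one unit of 1000 people and three cells whose
# weights would come from a fitted model (RF or BART) in practice.
unit_total = 1000.0
cells = disaggregate(unit_total, [1.0, 3.0, 6.0])
metrics = error_metrics([120.0, 280.0, 600.0], cells)

# Stand-in for posterior draws of the cell weights (assumption: gamma noise).
rng = np.random.default_rng(0)
weight_draws = rng.gamma(shape=[2.0, 6.0, 12.0], scale=0.5, size=(500, 3))
cell_draws = disaggregate_draws(unit_total, weight_draws)
lower, upper = np.percentile(cell_draws, [2.5, 97.5], axis=0)
```

Because every draw is rescaled to the same unit total, each set of cell estimates still sums to the census count, while the spread across draws gives a per-cell interval of the kind a single point-prediction model does not provide.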

Text
1-s2.0-S0143622824002212-main - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 7 September 2024
e-pub ahead of print date: 14 September 2024
Published date: 14 September 2024
Keywords: Bayesian Additive Regression Tree, Bayesian dasymetric population, Ghana, Population disaggregation, Population modelling, Random forest, WorldPop

Identifiers

Local EPrints ID: 494801
URI: http://eprints.soton.ac.uk/id/eprint/494801
ISSN: 0143-6228
PURE UUID: a12a3857-c1e2-455d-b156-083df4ea756a
ORCID for Ortis Yankey: orcid.org/0000-0002-0808-884X
ORCID for Christopher C. Nnanatu: orcid.org/0000-0002-5841-3700
ORCID for Attila N. Lazar: orcid.org/0000-0003-2033-2013
ORCID for Andrew J. Tatem: orcid.org/0000-0002-7270-941X

Catalogue record

Date deposited: 15 Oct 2024 17:02
Last modified: 16 Oct 2024 02:08

Contributors

Author: Ortis Yankey
Author: Chigozie E. Utazi
Author: Christopher C. Nnanatu
Author: Assane N. Gadiaga
Author: Thomas Abbot
Author: Attila N. Lazar
Author: Andrew J. Tatem
