Disaggregating census data for population mapping using a Bayesian Additive Regression Tree model
Population data is crucial for policy decisions, but fine-scale population numbers are often lacking due to the challenge of sharing sensitive data. Different approaches, such as the use of the Random Forest (RF) model, have been used to disaggregate census data from higher administrative units to small area scales. A major limitation of the RF model is its inability to quantify the uncertainties associated with the predicted populations, which can be important for policy decisions. In this study, we applied a Bayesian Additive Regression Tree (BART) model for population disaggregation and compared its results with those of an RF model using both simulated data and the 2021 census data for Ghana. The BART model consistently outperforms the RF model in out-of-sample predictions on all metrics, including bias, mean squared error (MSE), and root mean squared error (RMSE). The BART model also addresses this limitation of the RF model by providing uncertainty estimates around the predicted population. Overall, the study demonstrates the superiority of the BART model over the RF model in disaggregating population data and highlights its potential for gridded population estimates.
Keywords: Bayesian Additive Regression Tree, Bayesian dasymetric population, Ghana, Population disaggregation, Population modelling, Random forest, WorldPop
Yankey, Ortis
9965d053-8afb-462f-b7fe-b270e21f2ec1
Utazi, Chigozie E.
4781b507-0fb8-45f3-a121-71003a2a2670
Nnanatu, Christopher C.
24be7c1b-a677-4086-91b4-a9d9b1efa5a3
Gadiaga, Assane N.
eada3464-b0a2-4aaa-b594-eff8182c2aee
Abbot, Thomas
d908bcc5-f3e5-41eb-8857-4cb4213d3679
Lazar, Attila N.
d7f835e7-1e3d-4742-b366-af19cf5fc881
Tatem, Andrew J.
6c6de104-a5f9-46e0-bb93-a1a7c980513e
14 September 2024
Yankey, Ortis, Utazi, Chigozie E., Nnanatu, Christopher C., Gadiaga, Assane N., Abbot, Thomas, Lazar, Attila N. and Tatem, Andrew J. (2024) Disaggregating census data for population mapping using a Bayesian Additive Regression Tree model. Applied Geography, 172, [103416]. (doi:10.1016/j.apgeog.2024.103416).
Text
1-s2.0-S0143622824002212-main
- Version of Record
More information
Accepted/In Press date: 7 September 2024
e-pub ahead of print date: 14 September 2024
Published date: 14 September 2024
Identifiers
Local EPrints ID: 494801
URI: http://eprints.soton.ac.uk/id/eprint/494801
ISSN: 0143-6228
PURE UUID: a12a3857-c1e2-455d-b156-083df4ea756a
Catalogue record
Date deposited: 15 Oct 2024 17:02
Last modified: 16 Oct 2024 02:08
Contributors
Author:
Ortis Yankey
Author:
Chigozie E. Utazi
Author:
Christopher C. Nnanatu
Author:
Assane N. Gadiaga
Author:
Thomas Abbot