The University of Southampton
University of Southampton Institutional Repository

Random forests and mixed effects random forests for small area estimation of general parameters: a poverty mapping case study in Mozambique


Krennmair, Patrick, Wurz, Nora, Schmid, Timo and Tzavidis, Nikos (2026) Random forests and mixed effects random forests for small area estimation of general parameters: a poverty mapping case study in Mozambique. The Annals of Applied Statistics, 20 (1), 809-832. (doi:10.1214/25-AOAS2126).

Record type: Article

Abstract

Use of standard random forests may not guarantee reliable small area estimates unless a rich source of predictors explains the between-area heterogeneity. We propose mixed effects random forests with area random effects for small area estimation of general parameters. A new fitting algorithm with an embedded bootstrap-bias correction for the random forest residual variance is presented. Point estimators of small area parameters are constructed using a smearing estimator of the area-specific distribution function. Nonparametric block bootstrap is used for MSE estimation. The methodology is evaluated using household consumption data from Mozambique to derive district estimates of head count ratio and poverty gap. Comparisons to the empirical best predictor under a linear mixed model and to a synthetic estimator under the random forest are presented. Estimates are further contrasted to 2023 World Bank estimates and to design-unbiased direct estimates. The results show: (a) the advantages from including random effects in random forests, (b) the importance of data transformations for machine learning methods, (c) robustness properties of random forest-type methods, and (d) the importance of bias correcting the naive estimator of the random forest residual variance. Our conclusions demonstrate that a black-box approach to using machine learning methods should be avoided.
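
As a minimal illustration of the two target indicators named in the abstract, the sketch below computes the head count ratio and the poverty gap from a vector of household consumption values, assuming the standard Foster-Greer-Thorbecke definitions. The function names, the strict inequality at the poverty line, and the toy numbers are assumptions made for illustration; they are not taken from the paper, and the sketch does not implement the mixed effects random forest or the smearing estimator.

```python
import numpy as np


def fgt(consumption, poverty_line, alpha):
    """Foster-Greer-Thorbecke poverty measure (hypothetical helper).

    alpha = 0 gives the head count ratio, alpha = 1 the poverty gap.
    Households strictly below the poverty line are counted as poor
    (an assumption; some definitions use <=).
    """
    y = np.asarray(consumption, dtype=float)
    poor = y < poverty_line
    if alpha == 0:
        # Share of households below the poverty line.
        return float(poor.mean())
    # Relative shortfall from the line, zero for non-poor households.
    gap = np.where(poor, (poverty_line - y) / poverty_line, 0.0)
    return float((gap ** alpha).mean())


def head_count_ratio(consumption, poverty_line):
    return fgt(consumption, poverty_line, alpha=0)


def poverty_gap(consumption, poverty_line):
    return fgt(consumption, poverty_line, alpha=1)


# Toy district with four households (made-up values, not Mozambique data).
district_consumption = [50.0, 80.0, 120.0, 200.0]
line = 100.0
hcr = head_count_ratio(district_consumption, line)
pg = poverty_gap(district_consumption, line)
print(hcr, pg)
```

In a small area setting, functions of this kind would be applied to predicted or simulated consumption values within each district rather than to observed survey values alone.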

Text: Annals_of_Applied_Statistics___MERF_R_R - Accepted Manuscript (1MB)
Text: Annals_of_Applied_Statistics___MERF_Supplementary_material (restricted to repository staff only)
Text: Krennmair et al - AoAS - Random forests and mixed effects random forests for small area estimation of general parameters (restricted to repository staff only)

More information

Accepted/In Press date: 3 December 2025
Published date: 1 March 2026

Identifiers

Local EPrints ID: 507380
URI: http://eprints.soton.ac.uk/id/eprint/507380
ISSN: 1932-6157
PURE UUID: 18dcc3c9-b417-42fb-a78b-e6535b6dd856
ORCID for Nikos Tzavidis: orcid.org/0000-0002-8413-8095

Catalogue record

Date deposited: 08 Dec 2025 17:37
Last modified: 28 Apr 2026 04:01

Contributors

Author: Patrick Krennmair
Author: Nora Wurz
Author: Timo Schmid
Author: Nikos Tzavidis
