Multilevel Dirichlet process mixture analysis of railway grade crossing crash data
This article introduces a flexible Bayesian semiparametric approach to analyzing crash data that are hierarchical or multilevel in nature. We extend the traditional varying intercept (random effects) multilevel model by relaxing its standard parametric distributional assumption. While accounting for unobserved cross-group heterogeneity in the data through the intercept, the proposed method identifies latent subpopulations (and consequently outliers) in the data based on a Dirichlet process mixture. It also estimates the number of latent subpopulations from the data through the mathematical structure of the Dirichlet process, instead of prespecifying this number arbitrarily as in conventional latent class or finite mixture models. We evaluate our method on two recent railway grade crossing crash datasets from Canada, at the province and municipality levels, for the years 2008-2013. We use cross-validation predictive densities and the pseudo-Bayes factor for Bayesian model selection. While confirming the need for a multilevel modeling approach for both datasets, the results reveal the inadequacy of the standard parametric assumption in the varying intercept model for the municipality-level dataset; for that dataset, the proposed method improves model fit significantly. In a fully probabilistic framework, we also identify the expected number of latent clusters that share similar unidentified features among Canadian provinces and municipalities. It is thus possible to investigate further the reasons for such similarities and dissimilarities, which can have important policy implications for various safety management programs.
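The abstract's point that the number of latent subpopulations is estimated rather than prespecified can be illustrated with a minimal sketch of the Chinese restaurant process, one standard representation of the partition implied by a Dirichlet process. This is not the authors' implementation: the function names, the concentration parameter value `alpha = 1.0`, and the group count of 10 are illustrative assumptions, and the sketch shows only the prior behaviour of the cluster count, not posterior inference on crash data.

```python
import random

def crp_assignments(n_groups, alpha, rng):
    """Draw one partition of n_groups items from a Chinese restaurant process."""
    counts = []   # counts[k] = number of items already assigned to cluster k
    labels = []
    for _ in range(n_groups):
        # An existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[k] += 1        # join an existing cluster
        labels.append(k)
    return labels

def expected_clusters(n_groups, alpha):
    """Prior mean of the number of clusters: sum over i of alpha / (alpha + i)."""
    return sum(alpha / (alpha + i) for i in range(n_groups))

rng = random.Random(0)
n_groups, alpha = 10, 1.0   # hypothetical values chosen for illustration only
draws = [len(set(crp_assignments(n_groups, alpha, rng))) for _ in range(5000)]
simulated_mean = sum(draws) / len(draws)
print(simulated_mean, expected_clusters(n_groups, alpha))
```

In this sketch the number of clusters varies from draw to draw and is governed by `alpha`, which is why a Dirichlet process mixture does not need a fixed number of classes in the way a finite mixture model does.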
Keywords: Dirichlet process mixture models, Finite mixture models, Latent subpopulations, Random effects models, Spatial/regional multilevel models, Unobserved heterogeneity
Pages: 27-43
Heydari, Shahram
0d12a583-a4e8-4888-9e51-a50d312be1e9
Fu, Liping
5a8cfcc4-d76e-4456-b4e0-7877de2a0eb1
Lord, Dominique
968f94d4-a988-4bb2-a3f2-613c487a3f3a
Mallick, Bani K.
95794104-bc97-46a9-88ef-ec582c9cde24
1 March 2016
Heydari, Shahram, Fu, Liping, Lord, Dominique and Mallick, Bani K. (2016) Multilevel Dirichlet process mixture analysis of railway grade crossing crash data. Analytic Methods in Accident Research, 9, 27-43. (doi:10.1016/j.amar.2016.02.001).
More information
Accepted/In Press date: 5 February 2016
e-pub ahead of print date: 27 February 2016
Published date: 1 March 2016
Identifiers
Local EPrints ID: 424169
URI: http://eprints.soton.ac.uk/id/eprint/424169
ISSN: 2213-6657
PURE UUID: 7d68c7d8-40d7-44ff-90dd-12f63187221e
Catalogue record
Date deposited: 05 Oct 2018 11:31
Last modified: 17 Mar 2024 12:11
Contributors
Author: Shahram Heydari
Author: Liping Fu
Author: Dominique Lord
Author: Bani K. Mallick