University of Southampton Institutional Repository

Multilevel Dirichlet process mixture analysis of railway grade crossing crash data


Heydari, Shahram, Fu, Liping, Lord, Dominique and Mallick, Bani K. (2016) Multilevel Dirichlet process mixture analysis of railway grade crossing crash data. Analytic Methods in Accident Research, 9, 27-43. (doi:10.1016/j.amar.2016.02.001).

Record type: Article

Abstract

This article introduces a flexible Bayesian semiparametric approach to analyzing crash data that are hierarchical or multilevel in nature. We extend the traditional varying intercept (random effects) multilevel model by relaxing its standard parametric distributional assumption. While accounting for unobserved cross-group heterogeneity in the data through the intercept, the proposed method allows latent subpopulations (and consequently outliers) in the data to be identified based on a Dirichlet process mixture. It also allows the number of latent subpopulations to be estimated using an elegant mathematical structure instead of being prespecified arbitrarily as in conventional latent class or finite mixture models. In this paper, we evaluate our method on two recent railway grade crossing crash datasets from Canada, at province and municipality levels, for the years 2008-2013. We use cross-validation predictive densities and the pseudo-Bayes factor for Bayesian model selection. While confirming the need for a multilevel modeling approach for both datasets, the results reveal the inadequacy of the standard parametric assumption in the varying intercept model for the municipality-level dataset. In fact, our proposed method is shown to improve model fit significantly for the latter data. In a fully probabilistic framework, we also identify the expected number of latent clusters that share similar unidentified features among Canadian provinces and municipalities. It is thus possible to further investigate the reasons for such similarities and dissimilarities, which can have important policy implications for various safety management programs.
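
The abstract names cross-validation predictive densities and the pseudo-Bayes factor as the model selection tools but gives no formulas. As a rough illustration only, the sketch below uses the standard textbook definitions: the conditional predictive ordinate (CPO) of an observation is the harmonic mean of its likelihood over posterior draws, and the log pseudo-Bayes factor is the difference between the summed log CPO values of two models. The function names and the toy inputs are hypothetical, and this is not the authors' implementation.

```python
import math


def log_cpo(log_liks):
    """Log CPO of one observation from its per-draw log-likelihoods.

    CPO is the harmonic mean of the likelihood over S posterior draws:
    log CPO = log S - logsumexp(-log_lik_s), computed stably.
    """
    neg = [-v for v in log_liks]
    m = max(neg)
    lse = m + math.log(sum(math.exp(v - m) for v in neg))
    return math.log(len(log_liks)) - lse


def lpml(log_lik_matrix):
    """Sum of log CPO values (rows = observations, columns = draws)."""
    return sum(log_cpo(row) for row in log_lik_matrix)


def log_pseudo_bayes_factor(log_lik_a, log_lik_b):
    """Log pseudo-Bayes factor of model A against model B."""
    return lpml(log_lik_a) - lpml(log_lik_b)


# Hypothetical toy input: 2 observations, 2 posterior draws per model.
model_a = [[-1.0, -1.0], [-2.0, -2.0]]
model_b = [[-1.5, -1.5], [-2.5, -2.5]]
log_pbf = log_pseudo_bayes_factor(model_a, model_b)
```

A positive value favours model A on predictive grounds. In practice the log-likelihood matrices would come from MCMC output of the competing models, for example a parametric varying intercept model and its Dirichlet process mixture extension.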

This record has no associated files available for download.

More information

Accepted/In Press date: 5 February 2016
e-pub ahead of print date: 27 February 2016
Published date: 1 March 2016
Keywords: Dirichlet process mixture models, Finite mixture models, Latent subpopulations, Random effects models, Spatial/regional multilevel models, Unobserved heterogeneity

Identifiers

Local EPrints ID: 424169
URI: http://eprints.soton.ac.uk/id/eprint/424169
ISSN: 2213-6657
PURE UUID: 7d68c7d8-40d7-44ff-90dd-12f63187221e

Catalogue record

Date deposited: 05 Oct 2018 11:31
Last modified: 17 Mar 2024 12:11

Contributors

Author: Shahram Heydari
Author: Liping Fu
Author: Dominique Lord
Author: Bani K. Mallick
