Extracting mobile behavioral patterns with the distant N-Gram topic model
Mining patterns of human behavior from large-scale mobile phone data has the potential to help us understand certain phenomena in society. The study of such massive human-centric datasets requires new mathematical models. In this paper, we propose a probabilistic topic model, which we call the distant n-gram topic model (DNTM), to address the problem of learning long-duration human location sequences. The DNTM is based on Latent Dirichlet Allocation (LDA). We define the generative process for the model, derive the inference procedure, and evaluate the model on real mobile data. We consider two real-life datasets of human locations collected with mobile phones, the first consisting of GPS locations and the second of cell tower connections. The DNTM successfully discovers topics on both datasets. Finally, we compare the DNTM to LDA in terms of log-likelihood on unseen data, which measures the predictive power of each model. We find that the DNTM consistently outperforms LDA as the sequence length increases.
Farrahi, Katayoun
Gatica-Perez, Daniel
June 2012
Farrahi, Katayoun and Gatica-Perez, Daniel (2012) Extracting mobile behavioral patterns with the distant N-Gram topic model. In 2012 16th International Symposium on Wearable Computers (ISWC). IEEE. 8 pp.
Record type: Conference or Workshop Item (Paper)
More information
Published date: June 2012
Identifiers
Local EPrints ID: 420661
URI: http://eprints.soton.ac.uk/id/eprint/420661
PURE UUID: 7ab38661-8186-4506-b335-d8a4b0c4366c
Catalogue record
Date deposited: 11 May 2018 16:30
Last modified: 16 Mar 2024 04:31
Contributors
Author: Katayoun Farrahi
Author: Daniel Gatica-Perez