The University of Southampton
University of Southampton Institutional Repository

Extracting mobile behavioral patterns with the distant N-Gram topic model

IEEE
Farrahi, Katayoun
bc848b9c-fc32-475c-b241-f6ade8babacb
Gatica-Perez, Daniel
583e99b0-abef-4d2a-b54f-70ab7b498975

Farrahi, Katayoun and Gatica-Perez, Daniel (2012) Extracting mobile behavioral patterns with the distant N-Gram topic model. In 2012 16th International Symposium on Wearable Computers (ISWC). IEEE. 8 pp.

Record type: Conference or Workshop Item (Paper)

Abstract

Mining patterns of human behavior from large-scale mobile phone data has the potential to improve our understanding of certain phenomena in society. The study of such massive human-centric datasets requires new mathematical models. In this paper, we propose a probabilistic topic model, which we call the distant n-gram topic model (DNTM), to address the problem of learning long-duration human location sequences. The DNTM is based on Latent Dirichlet Allocation (LDA). We define the generative process for the model, derive the inference procedure, and evaluate the model on real mobile data. We consider two real-life datasets of human location traces collected by mobile phones, the first consisting of GPS locations and the second of cell tower connections. The DNTM successfully discovers topics on both datasets. Finally, we compare the DNTM to LDA in terms of log-likelihood on unseen data, a measure of the model's predictive power. We find that the DNTM consistently outperforms LDA as the sequence length increases.

This record has no associated files available for download.

More information

Published date: June 2012

Identifiers

Local EPrints ID: 420661
URI: http://eprints.soton.ac.uk/id/eprint/420661
PURE UUID: 7ab38661-8186-4506-b335-d8a4b0c4366c
ORCID for Katayoun Farrahi: orcid.org/0000-0001-6775-127X

Catalogue record

Date deposited: 11 May 2018 16:30
Last modified: 16 Mar 2024 04:31

Contributors

Author: Katayoun Farrahi
Author: Daniel Gatica-Perez
