User profiling using machine learning
The goal of the Instant Knowledge (IK) project was to design a system to facilitate the sharing of knowledge and expertise within a distributed mobile environment. This system automatically builds profiles of experts' interests and recommends experts based on context and social networking information. This thesis describes my contributions to the IK project, which involve profiling users and making recommendations using machine learning techniques.
Recommender systems are information filtering systems which recommend items to users based on a model of their preferences. Recommenders suffer from a number of problems: they do not make use of contextual information, so recommendations may be untimely or inappropriate; they often use a centralised architecture, which makes it difficult to react to the changing needs of users; and they are often implemented in an ad hoc fashion, which makes it difficult to make principled improvements or add extra information.
In this thesis I present a probabilistic recommender based on Bayes' theorem. Rating behaviour is modelled using a Bayesian prior to improve performance in conditions of data sparsity. The best results are obtained using a Gaussian model for user ratings, and a Gaussian-gamma model for co-rating behaviour. The use of a probabilistic framework should make it easier to add context information to the recommendation process.
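As a rough illustration of why a Bayesian prior helps when ratings are sparse, the sketch below computes the posterior over an item's mean rating under a Gaussian likelihood with known noise variance and a Gaussian prior. This is a simplified textbook conjugate model, not the Gaussian or Gaussian-gamma models developed in the thesis; the function name, the parameter names, and the numeric values are all assumptions made for the example.

```python
def posterior_mean_rating(ratings, prior_mean, prior_var, noise_var):
    """Posterior mean and variance of an item's mean rating.

    Assumes ratings ~ Normal(mu, noise_var) with noise_var known,
    and a prior mu ~ Normal(prior_mean, prior_var).
    """
    n = len(ratings)
    prior_precision = 1.0 / prior_var
    data_precision = n / noise_var
    # Precisions add; the posterior mean is a precision-weighted average
    # of the prior mean and the observed ratings.
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + sum(ratings) / noise_var)
    return post_mean, post_var


# Hypothetical values: a global mean rating of 3 on a 1-5 scale.
few_mean, few_var = posterior_mean_rating([5, 5], 3.0, 1.0, 1.0)
many_mean, many_var = posterior_mean_rating([5] * 50, 3.0, 1.0, 1.0)
none_mean, none_var = posterior_mean_rating([], 3.0, 1.0, 1.0)
print(few_mean, many_mean, none_mean)
```

With only two ratings the estimate is pulled towards the prior mean, and with many ratings it approaches the sample mean, which is the behaviour that makes a prior useful under data sparsity.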
Generating profiles automatically carries a privacy risk: private information may be accidentally incorporated into experts' profiles and then discovered by querying the Instant Knowledge system. I present a framework for evaluating the effect of this contamination on performance, and the ability of filtering techniques to preserve privacy. Several filtering techniques are tested, and I show that supervised and semi-supervised naive Bayes classifiers can help to preserve privacy.
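To make the filtering idea concrete, here is a minimal sketch of a supervised multinomial naive Bayes classifier with add-one smoothing that labels a text as private or work-related before it would be added to a profile. It is a generic illustration only: the labels, the toy training texts, the whitespace tokenisation, and the function names are assumptions, and the thesis's own features, data, and semi-supervised variant are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(documents, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    vocab_size = len(vocabulary)
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed log likelihood of the word.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data.
docs = [
    "doctor appointment bank statement",
    "wireless network routing protocol",
    "bank password family holiday",
    "antenna design wireless channel",
]
tags = ["private", "work", "private", "work"]
model = train_naive_bayes(docs, tags)
print(classify("bank holiday", model), classify("wireless antenna", model))
```

In a filtering setting, texts classified as private would be excluded from the profile; how strict that decision should be depends on the cost of leaking private information versus losing useful profile content.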
Barnard, Thomas Charles (2012) User profiling using machine learning. University of Southampton, Faculty of Physical and Applied Sciences, Doctoral Thesis, 145pp.
Record type: Thesis (Doctoral)
More information
Published date: August 2012
Organisations: University of Southampton, Southampton Wireless Group
Identifiers
Local EPrints ID: 344922
URI: http://eprints.soton.ac.uk/id/eprint/344922
PURE UUID: ceafb61d-7b61-4099-a0b7-4befbf6b9a2a
Catalogue record
Date deposited: 25 Feb 2013 14:15
Last modified: 14 Mar 2024 12:20
Contributors
Author: Thomas Charles Barnard
Thesis advisor: Adam Prugel-Bennett