The University of Southampton
University of Southampton Institutional Repository

User profiling using machine learning

The goal of the Instant Knowledge (IK) project was to design a system to facilitate the sharing of knowledge and expertise within a distributed mobile environment. This system automatically builds profiles of experts' interests, and automatically recommends them based on context and social networking information. This thesis describes my contributions to the IK project, which involve profiling users and making recommendations using machine learning techniques.

Recommender systems are information filtering systems which recommend items to users based on a model of their preferences. Recommenders suffer from a number of problems: they do not make use of contextual information, so recommendations may be untimely or inappropriate; they often use a centralised architecture, which makes it difficult to react to the changing needs of users; and they are often implemented in an ad hoc fashion, making it difficult to make principled improvements or add extra information.

In this thesis I present a probabilistic recommender based on Bayes' theorem. Rating behaviour is modelled with a Bayesian prior in order to improve performance when data are sparse. The best results are obtained using a Gaussian model for user ratings and a Gaussian-gamma model for co-rating behaviour. The use of a probabilistic framework should make it easier to add context information to the recommendation process.
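
As a rough illustration of why a Bayesian prior helps under data sparsity, the sketch below performs the standard conjugate Gaussian-gamma (normal-gamma) update for a Gaussian with unknown mean and precision. It is a generic textbook update, not the model from the thesis; the class name `NormalGamma`, the function `update`, and the prior values are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalGamma:
    """Hypothetical container for normal-gamma hyperparameters."""
    mu: float     # prior mean of the rating
    kappa: float  # strength of belief in mu (pseudo-observations)
    alpha: float  # shape of the gamma prior on the precision
    beta: float   # rate of the gamma prior on the precision


def update(prior: NormalGamma, ratings: list[float]) -> NormalGamma:
    """Return the posterior after observing `ratings` (conjugate update)."""
    n = len(ratings)
    if n == 0:
        return prior
    mean = sum(ratings) / n
    sum_sq = sum((r - mean) ** 2 for r in ratings)
    kappa = prior.kappa + n
    mu = (prior.kappa * prior.mu + n * mean) / kappa
    alpha = prior.alpha + n / 2
    beta = (prior.beta + 0.5 * sum_sq
            + prior.kappa * n * (mean - prior.mu) ** 2 / (2 * kappa))
    return NormalGamma(mu, kappa, alpha, beta)


if __name__ == "__main__":
    # Assumed prior: ratings centred on 3.0 with the weight of two ratings.
    prior = NormalGamma(mu=3.0, kappa=2.0, alpha=1.0, beta=1.0)
    sparse = update(prior, [5.0])        # one rating: estimate stays near prior
    dense = update(prior, [5.0] * 20)    # many ratings: data dominate
    print(sparse.mu, dense.mu)
```

With a single observed rating the posterior mean stays close to the prior mean, while with many ratings it moves towards the observed average; this shrinkage is what makes predictions more stable when few ratings are available.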

Generating profiles automatically carries a privacy risk: private information may be accidentally incorporated into experts' profiles and then discovered by querying the Instant Knowledge system. I present a framework for evaluating the effect of contamination on performance and the ability of filtering techniques to preserve privacy. Several filtering techniques are tested, and I show that supervised and semi-supervised naive Bayes classifiers can help to preserve privacy.
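
To make the filtering idea concrete, here is a minimal sketch of a supervised multinomial naive Bayes classifier with Laplace smoothing, used to drop documents flagged as private before they reach a profile. It covers only the supervised case and is not the thesis's implementation; the labels `"private"` and `"work"`, the function names, and the toy training documents are all hypothetical.

```python
import math
from collections import Counter


def train(docs):
    """Fit count tables from a list of (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log posterior for `tokens`."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)  # Laplace smoothing
        score = math.log(n_docs / total_docs)
        for token in tokens:
            if token in vocab:  # ignore words never seen in training
                score += math.log((counts[token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def filter_profile(model, documents, private_label="private"):
    """Keep only the documents that are not classified as private."""
    return [d for d in documents if predict(model, d) != private_label]


if __name__ == "__main__":
    # Toy, made-up training data for illustration only.
    training = [
        (["doctor", "appointment", "bank", "statement"], "private"),
        (["bank", "password", "reset"], "private"),
        (["antenna", "design", "meeting"], "work"),
        (["wireless", "network", "design", "review"], "work"),
    ]
    model = train(training)
    candidates = [["bank", "statement"], ["antenna", "design"]]
    print(filter_profile(model, candidates))
```

A semi-supervised variant would additionally use unlabelled documents (for example by iteratively re-estimating the counts from the classifier's own predictions), which this sketch does not attempt.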
Barnard, Thomas Charles
Prugel-Bennett, Adam

Barnard, Thomas Charles (2012) User profiling using machine learning. University of Southampton, Faculty of Physical and Applied Sciences, Doctoral Thesis, 145pp.

Record type: Thesis (Doctoral)


Text
BarnardThomasThesis.pdf - Other
Download (3MB)

More information

Published date: August 2012
Organisations: University of Southampton, Southampton Wireless Group

Identifiers

Local EPrints ID: 344922
URI: http://eprints.soton.ac.uk/id/eprint/344922
PURE UUID: ceafb61d-7b61-4099-a0b7-4befbf6b9a2a

Catalogue record

Date deposited: 25 Feb 2013 14:15
Last modified: 14 Mar 2024 12:20

Contributors

Author: Thomas Charles Barnard
Thesis advisor: Adam Prugel-Bennett
