The University of Southampton
University of Southampton Institutional Repository

How to recommend music to film buffs: enabling the provision of recommendations from multiple domains

In broad terms, Recommender Systems use machine learning techniques to process historical data about their users' interests, encoded in user profiles. Once the algorithms have been trained on these profiles, their output is used to compile, for each profile, a ranked list of all resources available for recommendation.

Collaborative Filtering is the most widespread method of carrying this out, building on the intuition that similar people will be interested in the same things. The approach fails, however, because similarity can only be assessed between users who have expressed their preferences on a common set of resources. This requirement prohibits the sharing of preference data across different systems, and causes additional problems when new resources become available for recommendation or when new users subscribe to the system.
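
The limitation described above can be illustrated with a minimal sketch. It is not taken from the thesis: the rating format, the resource identifiers, and the choice of cosine similarity are all assumptions made here for illustration. The point it shows is that a user-to-user similarity computed over co-rated resources is simply undefined when two users share no resources, as happens when their profiles come from different domains.

```python
from math import sqrt
from typing import Dict, Optional

# Hypothetical profile format: resource identifier -> rating.
Ratings = Dict[str, float]


def cosine_on_common(a: Ratings, b: Ratings) -> Optional[float]:
    """Cosine similarity over co-rated resources; None if it is undefined."""
    common = a.keys() & b.keys()
    if not common:
        return None  # no shared resources: the users cannot be compared
    dot = sum(a[r] * b[r] for r in common)
    norm_a = sqrt(sum(a[r] ** 2 for r in common))
    norm_b = sqrt(sum(b[r] ** 2 for r in common))
    if norm_a == 0 or norm_b == 0:
        return None
    return dot / (norm_a * norm_b)


# Toy profiles (invented for this example).
film_fan = {"film:Alien": 5.0, "film:Heat": 4.0}
film_fan_2 = {"film:Alien": 4.0, "film:Heat": 5.0}
music_fan = {"album:Kind of Blue": 5.0}

print(cosine_on_common(film_fan, film_fan_2))  # a number: both rated the same films
print(cosine_on_common(film_fan, music_fan))   # None: no common resources
```

The second call returns `None` however similar the two people's tastes may actually be, which is the gap that relationships between the resources themselves are intended to close.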

I propose that this difficulty can be overcome by identifying and exploiting semantic relationships between the resources available for recommendation. Moreover, systems that can assess the strength of the relationship between any two resources can provide recommendations from multiple domains. For example, music recommendations can be made based on a person's film taste if strong semantic relationships can be identified between certain films and the music that person listens to.

The contributions made by this dissertation can be summarised as follows:

1. Facilitating the comparison of heterogeneous resources

The use of Wikipedia is proposed for this purpose, under the assumption that hyperlinks between articles in Wikipedia convey latent semantic relationships between the concepts they describe. Thus, a methodology for projecting domain resources onto Wikipedia has been developed. The assumption is then validated by evidence that the projections retain similarity between domain resources in three independent domains.

2. Enabling the provision of recommendations from multiple domains

The aforementioned projections encode the links present in Wikipedia articles that are found to correspond to domain resources, and can be viewed collectively as a graph. In addition, the Internet is populated with social networks of people who express their preferences on a given set of resources in the form of ratings. Members of such communities are included as nodes in the graph, and their ratings of domain resources are represented as edges. A reversible Markov chain model was implemented to describe the probabilities associated with the traversal of edges in the integrated graph. Nodes that represent resources and other concepts the user is known to be interested in are then identified in the graph. Using these nodes as a starting point, the resource nodes most likely to be reached after an arbitrarily large number of edge traversals are considered the most relevant to the user and are recommended. Experimental results show that the framework is successful in predicting user preferences in domains different from those of the input.
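
The graph-based ranking described above can be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the thesis's implementation: the node names, edge weights, and function names are hypothetical, and the sketch swaps in a random walk with restart so that the ranking stays dependent on the starting nodes. A plain random walk on an undirected weighted graph is a reversible Markov chain, but the abstract does not say how the model is parameterised, so the restart step is an assumption made here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical edge format: (node, node, weight). Edges are undirected.
Edge = Tuple[str, str, float]
Graph = Dict[str, Dict[str, float]]


def build_graph(edges: Iterable[Edge]) -> Graph:
    """Store each edge in both directions with the same positive weight."""
    graph: Dict[str, Dict[str, float]] = defaultdict(dict)
    for u, v, w in edges:
        graph[u][v] = w
        graph[v][u] = w
    return dict(graph)


def walk_with_restart(graph: Graph, seeds: List[str],
                      restart: float = 0.15, steps: int = 100) -> Dict[str, float]:
    """Approximate visit probabilities of a walk that restarts at the seeds."""
    seeds = [s for s in seeds if s in graph]
    if not seeds:
        raise ValueError("no seed node is present in the graph")
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in graph}
    prob = dict(start)
    for _ in range(steps):
        nxt = {n: restart * start[n] for n in graph}
        for u, nbrs in graph.items():
            total = sum(nbrs.values())
            for v, w in nbrs.items():
                # Traversal probability of edge u -> v is proportional to its weight.
                nxt[v] += (1.0 - restart) * prob[u] * w / total
        prob = nxt
    return prob


def recommend(graph: Graph, seeds: List[str], prefix: str, top_n: int = 5) -> List[str]:
    """Rank nodes whose name starts with `prefix` by visit probability."""
    prob = walk_with_restart(graph, seeds)
    candidates = [n for n in graph if n.startswith(prefix) and n not in seeds]
    return sorted(candidates, key=lambda n: prob[n], reverse=True)[:top_n]


# Toy integrated graph (invented): Wikipedia-style links plus rating edges.
graph = build_graph([
    ("film:Blade Runner", "wiki:Vangelis", 1.0),          # link edge
    ("album:Chariots of Fire", "wiki:Vangelis", 1.0),     # link edge
    ("album:Thriller", "wiki:Michael Jackson", 1.0),      # link edge
    ("user:bob", "album:Thriller", 1.0),                  # rating edge
    ("user:bob", "film:Blade Runner", 0.2),               # weak rating edge
])

print(recommend(graph, ["film:Blade Runner"], "album:"))
```

On this toy graph the input is a film and the output is a ranking of albums; the album that shares a Wikipedia neighbour with the seed film is ranked ahead of the one reachable only through another user's weak rating.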
Loizou, Antonis
Dasmahapatra, Srinandan
Lewis, Paul

Loizou, Antonis (2009) How to recommend music to film buffs: enabling the provision of recommendations from multiple domains. University of Southampton, School of Electronics and Computer Science, Doctoral Thesis, 141pp.

Record type: Thesis (Doctoral)

More information

Published date: May 2009
Organisations: University of Southampton

Identifiers

Local EPrints ID: 66281
URI: http://eprints.soton.ac.uk/id/eprint/66281
PURE UUID: ae93bb7a-30b4-47ef-94d8-18feecef5a18

Catalogue record

Date deposited: 28 May 2009
Last modified: 13 Mar 2024 18:16

Contributors

Author: Antonis Loizou
Thesis advisor: Srinandan Dasmahapatra
Thesis advisor: Paul Lewis
