The University of Southampton
University of Southampton Institutional Repository

Semantic Models for Machine Learning

Hardoon, David R (2006) Semantic Models for Machine Learning. University of Southampton, School of Electronics and Computer Science, Doctoral Thesis.

Record type: Thesis (Doctoral)


In this thesis we present approaches to creating and using semantic models by analysing how data are spread in the feature space. We aim to introduce the general notion of using feature selection techniques in machine learning applications. The approaches applied obtain new feature directions from the data, with the aim of improving the performance of machine learning applications. We review three principal methods that are used throughout the thesis. The first is Canonical Correlation Analysis (CCA), a method for finding correlated linear relationships between two multidimensional variables. CCA can be seen as using complex labels to guide feature selection towards the underlying semantics: it uses two views of the same semantic object to extract a representation of the semantics. The second is Partial Least Squares (PLS), a method similar to CCA. It selects feature directions that are useful for the task at hand, but it uses only one view of an object, with the label as the corresponding pair. PLS can be thought of as a method that looks for directions that are good for distinguishing between the different labels. The third method is the Fisher kernel, which aims to extract more information from a generative model than its output probabilities alone provide. The aim is to analyse how the Fisher score depends on the model and which aspects of the model are important in determining the Fisher score. We focus our theoretical investigation primarily on CCA and its kernel variant, providing a theoretical analysis of the method's stability using Rademacher complexity and hence deriving an error bound for new data. We conclude the thesis by applying the described approaches to problems in image, text and music applications and in medical analysis, describing several novel applications on relevant real-world data.
The aim of the thesis is to provide a theoretical understanding of semantic models, while also providing a sound foundation for how these models can be used in practice.
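
As an illustration of the two-view idea behind CCA described in the abstract, the following is a minimal sketch of regularised linear CCA in NumPy. It is not taken from the thesis: the function name linear_cca, the regularisation constant reg, and the synthetic two-view data are all assumptions made for this example. It whitens each view with a Cholesky factor and reads the canonical correlations off a singular value decomposition.

```python
import numpy as np


def linear_cca(X, Y, reg=1e-6):
    """Regularised linear CCA (illustrative sketch, not the thesis code).

    X: (n, p) array, Y: (n, q) array, rows are paired observations.
    Returns the canonical correlations and projection matrices Wx, Wy.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariance blocks; `reg` is an assumed small ridge term
    # that keeps the within-view covariances invertible.
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    # Whiten each view: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)

    # M = Lx^{-1} Cxy Ly^{-T}; its singular values are the correlations.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    U, corrs, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U)
    Wy = np.linalg.solve(Ly.T, Vt.T)
    return corrs, Wx, Wy


# Synthetic data (hypothetical): two noisy views of one shared latent signal.
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 1))
X = z @ np.array([[1.0, -0.5, 0.3]]) + 0.1 * rng.standard_normal((500, 3))
Y = z @ np.array([[0.8, 0.6]]) + 0.1 * rng.standard_normal((500, 2))

corrs, Wx, Wy = linear_cca(X, Y)
print("canonical correlations:", corrs)
```

Because both synthetic views share a single latent variable, the first canonical correlation should be close to one and the second should be small. The kernel variant mentioned in the abstract replaces the covariance blocks with kernel (Gram) matrices and is not shown here.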


More information

Published date: February 2006
Keywords: Kernel Methods, SVM, Neural Networks, Bayesian Networks, Fisher Kernels, CCA, KCCA, PLS, KPLS, Gram-Schmidt, fMRI, Content-Based Retrieval, Music Score Generation, Composer and Performer Identification, Medical Analysis
Organisations: University of Southampton, Electronics & Computer Science


Local EPrints ID: 262019
PURE UUID: f967313b-bf4a-48a3-933c-dbb714203a92

Catalogue record

Date deposited: 22 Feb 2006
Last modified: 18 Jul 2017 08:56


Author: David R Hardoon
