The University of Southampton
University of Southampton Institutional Repository

Online machine learning for combinatorial data


Ghosh, Shaona (2016) Online machine learning for combinatorial data. University of Southampton, Doctoral Thesis, 138pp.

Record type: Thesis (Doctoral)

Abstract

With an ever-increasing demand on large-scale data, processing and using the available information is difficult. In particular, making decisions from sequentially acquired data, when only limited information is known at the outset, is an important problem. The input data in such problems often have a complex combinatorial structure. Consider, for example, an internet advertising system that manages advertisement placement over a network of websites: placing m different advertisements on n websites with replacement gives m^n possible combinations, a number that grows exponentially in n. Because the problem is combinatorial, the data can be represented as a graph, a frequently occurring computational object, so that its structure can be exploited for intelligent automatic processing.

Traditionally, machine learning techniques require a separate initial training phase before they can make predictions on unseen data. However, the sequential nature of some problems necessitates real-time prediction, which makes many existing techniques unsuitable. Online learning is a field of machine learning comprising algorithms that learn from sequential streaming data, where the learner cannot control or influence the data collection procedure. Although existing online methods have theoretical performance guarantees, they are not yet fully mature when applied to the combinatorial complexity of graphical structures.

This thesis presents a series of algorithms that attempt to overcome the shortcomings of existing online algorithms. A discrete graphical model, the Ising model, is explored to develop online approximation algorithms for label prediction. A deterministic approximation algorithm with a sequential guarantee is developed by capturing the persistent structures of maximum flows and minimum cuts in the network and by efficiently enumerating all label-consistent minimum cuts. Novel mistake bounds are provided that match or improve on previous performance bounds in the literature. Additionally, a variational technique based on the mean field approximation is built for online multi-class label prediction on the Ising model. An online sequential action selection algorithm for the limited feedback (bandit feedback) setting with side information is developed using a linear programming relaxation of the classic maximal flow problem. Finally, the multiple objective optimization problem with conflicting objectives and full feedback is studied, and an online algorithm is built that outperforms traditional approaches under similar assumptions.
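To make two ideas from the abstract concrete, the following is a minimal illustrative sketch in Python. It is not the thesis's algorithm: it counts the m^n placements from the advertising example, and it predicts a binary label on a tiny graph by brute-force search for the smallest cut (the number of edges whose endpoints disagree) among labellings consistent with the observed labels. The function names, the toy graph, and the tie-breaking rule (ties go to label 0) are assumptions made only for illustration. The brute-force search visits all 2^n labellings, which is exactly the exponential cost that the flow-based methods described above are meant to avoid.

```python
from itertools import product


def count_placements(m, n):
    """Ways to assign one of m advertisements to each of n websites (with replacement)."""
    return m ** n


def cut_size(edges, labels):
    """Number of edges whose two endpoints carry different labels."""
    return sum(1 for u, v in edges if labels[u] != labels[v])


def predict_by_min_cut(n_vertices, edges, observed, query):
    """Hypothetical helper: brute-force minimum-cut label prediction.

    For each candidate label of `query`, find the smallest cut over all binary
    labellings that agree with `observed`, then predict the label whose best
    cut is smaller (ties go to label 0, an arbitrary choice for this sketch).
    """
    if query in observed:
        raise ValueError("query vertex already has an observed label")
    best = {0: None, 1: None}
    for labels in product((0, 1), repeat=n_vertices):
        # Skip labellings that contradict any observed label.
        if any(labels[v] != y for v, y in observed.items()):
            continue
        c = cut_size(edges, labels)
        y = labels[query]
        if best[y] is None or c < best[y]:
            best[y] = c
    prediction = 0 if best[0] <= best[1] else 1
    return prediction, best


# Toy graph: a triangle on vertices 0, 1, 2 with a path 2 - 3 - 4 attached.
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
OBSERVED = {0: 0, 4: 1}  # vertex 0 is labelled 0, vertex 4 is labelled 1
```

Calling `predict_by_min_cut(5, EDGES, OBSERVED, query=1)` on this toy graph predicts label 0 for vertex 1: labelling it 0 allows a cut of size 1, whereas labelling it 1 forces a cut of size at least 2. For the counting example, `count_placements(3, 10)` already gives 59049 placements.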

Text
ShaonaGhosh_PhDThesis_main - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: March 2016

Identifiers

Local EPrints ID: 420649
URI: http://eprints.soton.ac.uk/id/eprint/420649
PURE UUID: 752f90e9-9602-407d-ad39-570b7c882b80

Catalogue record

Date deposited: 11 May 2018 16:30
Last modified: 15 Mar 2024 19:48

Contributors

Author: Shaona Ghosh
Thesis advisor: Adam Prugel-Bennett
