Online machine learning for combinatorial data
With the ever-increasing demand for large-scale data, processing and utilising the available information is difficult. In particular, making decisions based on sequentially acquired data, where only limited information is initially known, is an important problem. Often the input data in such problems have a complex combinatorial structure. For example, consider an internet advertising system that manages advertisement placement over a network of websites. The number of ways of placing m different advertisements on n websites with replacement is m^n, which grows exponentially and scales badly with large n. As a combinatorial problem, the data can be represented within a frequently occurring computational object called a graph, allowing the structure to be exploited for intelligent automatic processing. Traditionally, machine learning techniques require a separate initial training phase before predictions can be made on unseen data. However, the sequential nature of some problems necessitates real-time prediction, making many existing techniques unsuitable. Online learning is a field of machine learning comprising algorithms that learn from sequential streaming data, where the learner cannot control or influence the data collection procedure. Although existing online methods have theoretical guarantees on performance, they are not yet fully mature in the context of the combinatorial complexity of graphical structures. In this thesis, a series of algorithms that attempt to overcome the shortcomings of existing online algorithms is presented. The discrete graphical model called the Ising model is explored to develop online approximation algorithms for label prediction. A deterministic approximation algorithm with a sequential guarantee is developed by capturing the persistent structures of maximum flows and minimum cuts in the network, together with an efficient enumeration of all label-consistent minimum cuts. Novel mistake bounds are provided that improve on or match previous performance bounds in the literature. Additionally, a variational approximation technique using the mean field approximation is built for online prediction of multi-class labelling on the Ising model. An online sequential action selection algorithm for the limited feedback (bandit feedback) setting with side information is developed using a linear programming relaxation of the classic maximum flow problem. Finally, the multi-objective optimization problem with conflicting objectives and full feedback is studied, and an online algorithm is built that outperforms the traditional approaches under similar assumptions.
University of Southampton
Ghosh, Shaona
March 2016
Prugel-Bennett, Adam
Ghosh, Shaona (2016) Online machine learning for combinatorial data. University of Southampton, Doctoral Thesis, 138pp.
Record type: Thesis (Doctoral)
Text: ShaonaGhosh_PhDThesis_main - Version of Record
More information
Published date: March 2016
Identifiers
Local EPrints ID: 420649
URI: http://eprints.soton.ac.uk/id/eprint/420649
PURE UUID: 752f90e9-9602-407d-ad39-570b7c882b80
Catalogue record
Date deposited: 11 May 2018 16:30
Last modified: 15 Mar 2024 19:48
Contributors
Author: Shaona Ghosh
Thesis advisor: Adam Prugel-Bennett