University of Southampton Institutional Repository

Towards potential-based learning for pareto trade-offs in on-line prediction with experts

Ghosh, Shaona
b6567624-3b1f-40c2-9de7-fd44536a94a9
Gunn, Steve
306af9b3-a7fa-4381-baf9-5d6a6ec89868

Ghosh, Shaona and Gunn, Steve (2012) Towards potential-based learning for pareto trade-offs in on-line prediction with experts. Women in Machine Learning 2012, Lake Tahoe, Nevada, United States. 03 Dec 2012.

Record type: Conference or Workshop Item (Poster)

Abstract

The on-line learning with experts paradigm is a well-established machine learning framework in which the decision maker's role is to predict as well as the best expert in hindsight. In our work, we apply on-line learning to embedded mobile devices, where there is a continuous problem of learning conflicting trade-offs: the device should save as much power as possible for battery longevity while executing applications with minimum performance delay. These trade-offs are conflicting in nature; additional power can only be saved at the cost of application performance, and vice versa. An example of a power policy in such devices is one that reduces the voltage or frequency of the CPU/core to decrease its speed, saving power while making applications take longer to execute. Such trade-offs are also known as Pareto trade-offs. On-line learning theory with experts does not support multi-objective learning, in which the learner is evaluated on how well it predicts the Pareto trade-offs. Typically, to devise a general strategy for the learner, this framework uses a potential function defined on the learner's cumulative regret over time. We analyse this potential function in the context of our Pareto trade-off environment, where the gradient of the potential function is calculated with respect to each trade-off separately. At every round, if the instantaneous regret vector of the expert selected by the learner lies in the negative half-space of the gradients of the potential function with respect to each trade-off, then the cumulative regret of the learner decreases simultaneously for both trade-offs. However, this is only possible when the selected expert is far from the local Pareto optimal point and is well aligned with the gradients of the potential function with respect to each trade-off.
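The mechanics described above can be illustrated with a minimal numerical sketch (not code from the poster): a potential-based forecaster with an exponential potential tracks cumulative regret separately for two hypothetical objectives (power and delay), and the Blackwell-style half-space condition is checked per objective. The number of experts, the learning rate `eta`, and the synthetic anti-correlated losses are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 4      # number of experts (e.g. candidate power policies; illustrative)
T = 500    # rounds
eta = 0.1  # learning rate for the exponential potential (assumed value)

# Hypothetical per-round losses: power and delay are deliberately
# anti-correlated, so saving power costs performance and vice versa.
power = rng.uniform(0.0, 1.0, size=(T, K))
delay = np.clip(1.0 - power + rng.normal(0.0, 0.05, size=(T, K)), 0.0, 1.0)

cum_regret = np.zeros((K, 2))  # cumulative regret per expert, per objective

for t in range(T):
    # Exponential potential Phi(R) = sum_i exp(eta * R_i); here the two
    # per-objective regrets are scalarised with equal weights purely for
    # the sketch (the poster treats each trade-off's gradient separately).
    scalar_regret = cum_regret.sum(axis=1)
    w = np.exp(eta * (scalar_regret - scalar_regret.max()))  # stabilised
    weights = w / w.sum()

    expert = rng.choice(K, p=weights)
    loss_t = np.stack([power[t], delay[t]], axis=1)  # shape (K, 2)
    learner_loss = loss_t[expert]                    # shape (2,)

    # Instantaneous regret vector: learner's loss minus each expert's loss,
    # tracked separately for each objective.
    inst = learner_loss - loss_t                     # shape (K, 2)

    # Half-space check: if inst lies in the negative half-space of grad Phi
    # for BOTH objectives, cumulative regret falls on both trade-offs at once.
    grad = eta * np.exp(eta * cum_regret)            # (K, 2), strictly > 0
    both_down = (grad * inst).sum(axis=0) <= 0       # one flag per objective

    cum_regret += inst
```

The per-objective `both_down` flags make the abstract's point concrete: far from a local Pareto optimum both can hold at once, while near it the gradients oppose each other and at most one can.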
At the point of local Pareto optimality, the gradients point in opposite directions, and minimisation of the learner's regret is then only possible by compromising one trade-off for the other. We consider the Pareto descent direction, in which no other expert is superior in improving the learner's regret for all trade-offs simultaneously, and no Pareto descent expert is superior to another. The Pareto descent direction turns out to be a convex combination of the steepest descent directions of the potential function with respect to each trade-off. We represent this convex combination as a separate weight vector, which enables the trade-offs to be combined in a weighted manner: one trade-off is weighted more than the other, and each weight represents a vertex of a convex cone pointed at the origin. This weight vector enables selection of an expert that reduces the learner's regret with respect to only one trade-off if the weight vector is non-zero. As future work, we wish to derive generalisation bounds for our learning algorithm. We also hope to analyse Blackwell's approachability condition and Hannan consistency in the context of our modified learner's strategy.
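The convex combination of steepest descent directions can be sketched for the two-objective case with the standard minimum-norm construction (in the spirit of Fliege and Svaiter's multi-objective steepest descent; this is an illustration, not the poster's own algorithm). The gradient values below are hypothetical: they model two nearly opposed per-objective gradients near a local Pareto optimum.

```python
import numpy as np

def pareto_descent_weight(g1, g2):
    """Weight lam in [0, 1] minimising ||lam*g1 + (1-lam)*g2||: the
    minimum-norm point on the segment between the two steepest-descent
    gradients (closed form for two objectives)."""
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:
        return 0.5  # gradients coincide; any convex weight works
    lam = float(g2 @ (g2 - g1)) / denom
    return min(1.0, max(0.0, lam))

# Hypothetical, nearly opposed per-objective gradients (power vs. delay).
g_power = np.array([1.0, 0.2])
g_delay = np.array([-1.0, 0.1])

lam = pareto_descent_weight(g_power, g_delay)
d = lam * g_power + (1.0 - lam) * g_delay  # Pareto descent direction
```

When the gradients oppose each other, the combined direction `d` shrinks toward zero, which mirrors the abstract's observation that at local Pareto optimality further regret reduction requires compromising one trade-off for the other; a degenerate (near-zero) weight vector singles out progress on one objective only.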

This record has no associated files available for download.

More information

e-pub ahead of print date: 3 December 2012
Venue - Dates: Women in Machine Learning 2012, Lake Tahoe, Nevada, United States, 2012-12-03 - 2012-12-03
Organisations: Electronic & Software Systems

Identifiers

Local EPrints ID: 346943
URI: http://eprints.soton.ac.uk/id/eprint/346943
PURE UUID: 537e929a-7cf4-4f2c-a243-614dfa065438

Catalogue record

Date deposited: 08 Apr 2013 09:32
Last modified: 11 Dec 2021 01:21

Contributors

Author: Shaona Ghosh
Author: Steve Gunn


