Towards Pareto descent directions in sampling experts for multiple tasks in an on-line learning paradigm

In many real-life design problems, a system must simultaneously balance multiple tasks or objectives that conflict with one another: minimizing one objective causes another to increase in value, resulting in trade-offs between the objectives. For example, in embedded multi-core mobile devices and very large scale data centers, there is a continuous problem of balancing the interfering goals of maximal power savings and minimal performance delay, with trade-off values that vary across the application workloads executing on them. Typically, the optimal trade-offs for the executing workloads lie on an optimal Pareto front that is difficult to determine. The nature of the problem requires learning over the lifetime of the mobile device or server, with continuous evaluation and prediction of the trade-off settings on the system that balance the interfering objectives optimally. Towards this, we propose an on-line learning method in which the weights of the experts addressing the objectives are updated based on a convex combination of their relative performance in addressing all objectives simultaneously. An additional importance vector, which assigns relative importance to each objective at every round, is sampled from a convex cone pointed at the origin. Our preliminary results show that the convex combination of the importance vector and the gradient of the potential functions of the learner's regret with respect to each objective ensures that, in the next round, the drift (the instantaneous regret vector) is the Pareto descent direction that enables better convergence to the optimal Pareto front.
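The weight update described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes an exponential-weights (Hedge-style) update in place of the unspecified potential function, takes the nonnegative orthant as the convex cone, and uses hypothetical names (`sample_importance`, `update_weights`, `eta`) and made-up loss values purely for illustration.

```python
import math
import random


def sample_importance(n_objectives, rng):
    """Draw a nonnegative importance vector and scale it to sum to 1.

    Assumption: the cone is the nonnegative orthant, so the normalized
    sample gives convex-combination coefficients over the objectives.
    """
    raw = [rng.random() + 1e-12 for _ in range(n_objectives)]
    total = sum(raw)
    return [r / total for r in raw]


def update_weights(weights, losses, importance, eta=0.5):
    """One round of an exponential-weights update (assumed potential).

    losses[i][k] is expert i's loss on objective k, assumed to lie in [0, 1].
    Each expert is penalized by the importance-weighted (convex) combination
    of its per-objective losses.
    """
    scalarized = [
        sum(a * l for a, l in zip(importance, row)) for row in losses
    ]
    unnormalized = [
        w * math.exp(-eta * s) for w, s in zip(weights, scalarized)
    ]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]


# Hypothetical example: three experts, two objectives (say, power and delay).
rng = random.Random(0)
losses = [
    [0.1, 0.9],  # good on objective 0, poor on objective 1
    [0.5, 0.5],  # balanced
    [0.9, 0.8],  # worse than the balanced expert on both objectives
]
weights = [1.0 / 3] * 3
for _ in range(50):
    alpha = sample_importance(2, rng)
    weights = update_weights(weights, losses, alpha)
```

Under these assumptions, an expert that is worse than another on every objective loses weight relative to it in every round, whichever importance vector is drawn, while experts that trade one objective off against the other are ranked according to the sampled importance.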

SS-13-05

AAAI Press

Ghosh, Shaona

b6567624-3b1f-40c2-9de7-fd44536a94a9

Lovell, Christopher James

e894d207-d2e7-4bb3-b39d-ea62f204140c

Gunn, Steve R.

306af9b3-a7fa-4381-baf9-5d6a6ec89868

25 March 2013