Cultivating Desired Behaviour: Policy Teaching Via Environment-Dynamics Tweaks


Rabinovich, Zinovi, Dufton, Lachlan, Larson, Kate and Jennings, Nick (2010) Cultivating Desired Behaviour: Policy Teaching Via Environment-Dynamics Tweaks. In The 9th International Conference on Autonomous Agents and Multiagent Systems, Toronto, Canada, pp. 1097-1104.


Description/Abstract

In this paper we study, for the first time explicitly, the implications of endowing an interested party (i.e. a teacher) with the ability to modify the underlying dynamics of the environment in order to encourage an agent to learn to follow a specific policy. We introduce a cost function that the teacher can use to balance the modifications it makes to the underlying environment dynamics against the learner's performance relative to some ideal, desired policy. We formulate the teacher's problem of determining optimal environment changes as a planning and control problem, and empirically validate the effectiveness of our model.

Item Type: Conference or Workshop Item (Paper)
Keywords: Teacher-learner, control theory, Kullback-Leibler Rate
Divisions: Faculty of Physical Sciences and Engineering > Electronics and Computer Science > Agents, Interactions & Complexity
ePrint ID: 268470
Date Deposited: 05 Feb 2010 13:09
Last Modified: 27 Mar 2014 20:15
URI: http://eprints.soton.ac.uk/id/eprint/268470
