University of Southampton Institutional Repository

Social Reward Shaping in the Prisoner's Dilemma

Babes, Monica, Munoz de Cote, Enrique and Littman, Michael L. (2008) Social Reward Shaping in the Prisoner's Dilemma. In Proceedings of the International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS).

Record type: Conference or Workshop Item (Poster)


Reward shaping is a well-known technique applied to help reinforcement-learning agents converge more quickly to near-optimal behavior. In this paper, we introduce social reward shaping, which is reward shaping applied in the multiagent-learning framework. We present preliminary experiments in the iterated Prisoner's Dilemma setting which show that agents using social reward shaping appropriately can behave more effectively than other classical learning and non-learning strategies. In particular, we show that these agents can both lead (encourage adaptive opponents to stably cooperate) and follow (adopt a best-response strategy when paired with a fixed opponent), whereas better-known approaches achieve only one of these objectives.
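
For readers unfamiliar with reward shaping, the following is a minimal sketch of standard potential-based shaping inside tabular Q-learning on the iterated Prisoner's Dilemma. It is not the method from the paper, which this record does not describe. The payoff values, the potential function, the Tit-for-Tat opponent, and all names and hyperparameters are illustrative assumptions.

```python
import random

# Conventional Prisoner's Dilemma payoffs for the learner (assumed values,
# not taken from the paper): temptation 5 > reward 3 > punishment 1 > sucker 0.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ("C", "D")
GAMMA = 0.9

# A state is (learner's last action, opponent's last action).


def potential(state):
    """Hypothetical potential: prefer states where the opponent just cooperated."""
    _, opp_last = state
    return 30.0 if opp_last == "C" else 10.0


def shaped_reward(reward, state, next_state, gamma=GAMMA):
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(next_state) - potential(state)


def tit_for_tat(state):
    """Fixed opponent (assumed): repeat the learner's previous action."""
    my_last, _ = state
    return my_last


def train(steps=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning on the shaped reward signal."""
    rng = random.Random(seed)
    states = [(a, b) for a in ACTIONS for b in ACTIONS]
    q = {(s, a): 0.0 for s in states for a in ACTIONS}
    state = ("C", "C")
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        opp_action = tit_for_tat(state)
        next_state = (action, opp_action)
        r = shaped_reward(PAYOFF[(action, opp_action)], state, next_state)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (r + GAMMA * best_next - q[(state, action)])
        state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    for key in sorted(q_table):
        print(key, round(q_table[key], 2))
```

The shaping term only adds the discounted difference in potential between consecutive states, so its contributions telescope along a trajectory. This is why potential-based shaping leaves the optimal policy unchanged in the single-agent setting; whether and how that carries over to the multiagent case is the subject of the paper itself.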


More information

Published date: 2008
Venue - Dates: Proceedings of the International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS), 2008-01-01
Keywords: Reinforcement Learning, leader/follower strategies, iterated prisoner's dilemma, game theory, subgame perfect equilibrium
Organisations: Electronics & Computer Science


Local EPrints ID: 266919
PURE UUID: 5b83322b-57a2-4794-9444-59a3857930e0

Catalogue record

Date deposited: 17 Nov 2008 14:36
Last modified: 18 Jul 2017 07:10


Author: Monica Babes
Author: Enrique Munoz de Cote
Author: Michael L. Littman
