University of Southampton Institutional Repository

Reinforcement learning and the power law of practice: some analytical results

Ianni, A. (2002) Reinforcement learning and the power law of practice: some analytical results (Discussion Papers in Economics and Econometrics, 203). Southampton, GB: University of Southampton, 36pp.

Record type: Monograph (Discussion Paper)

Abstract

Erev and Roth (1998), among others, provide a comprehensive analysis of experimental evidence on learning in games, based on a stochastic model of learning that accounts for two main elements: the Law of Effect (positive reinforcement of actions that perform well) and the Power Law of Practice (learning curves tend to be steeper initially). This note complements that literature with an analytical study of the properties of such learning models. Specifically, the paper shows that:

(a) up to an error term, the stochastic process is driven by a system of discrete time difference equations of the replicator type. This carries an analogy with Börgers and Sarin (1997), where reinforcement learning accounts only for the Law of Effect.
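To make (a) concrete, here is one standard way a replicator-type difference equation emerges from a basic cumulative-reinforcement rule of the Erev-Roth kind; the notation and the expansion below are an illustrative sketch, not necessarily the paper's own derivation.

    % Illustrative sketch (assumed notation, not the paper's): each action k
    % carries a propensity q_k(t), with total Q(t) = \sum_j q_j(t); the
    % choice rule is p_k(t) = q_k(t)/Q(t), and the realized payoff is added
    % to the chosen action's propensity (Law of Effect). Expanding the
    % expected one-step change of p_k to first order in 1/Q(t) gives
    \[
      \mathbb{E}\bigl[\,p_k(t+1) - p_k(t) \mid p(t)\,\bigr]
      \;\approx\; \frac{p_k(t)}{Q(t)}\,\bigl(u_k(t) - \bar{u}(t)\bigr),
      \qquad \bar{u}(t) = \sum_j p_j(t)\,u_j(t),
    \]
    % where u_k(t) is the expected payoff to action k. This is a
    % discrete-time replicator equation whose step size 1/Q(t) vanishes
    % as payoffs accumulate -- the Power Law of Practice.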

(b) if the trajectories of the system of replicator equations converge sufficiently fast, then the probability that all realizations of the learning process over a possibly infinite spell of time lie within a given small distance of the solution path of the replicator dynamics becomes, from some time on, arbitrarily close to one. Fast convergence, in the form of exponential convergence, is shown to hold for any strict Nash equilibrium of the underlying game.
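As a numerical illustration of (b), the Python sketch below simulates two cumulative-reinforcement learners in a 2x2 coordination game, whose diagonal outcomes are strict Nash equilibria, alongside a replicator recursion run with the matching shrinking step size. The game, initial propensities, step-size bookkeeping, and seed are assumptions for illustration, not taken from the paper.

    # Minimal simulation sketch: Erev-Roth-style cumulative reinforcement
    # versus the replicator recursion with step size 1/Q(t).
    import numpy as np

    rng = np.random.default_rng(0)

    # Doubly symmetric coordination game: (0, 0) is a strict Nash
    # equilibrium paying 2 to each player; (1, 1) pays 1 to each.
    A = np.array([[2.0, 0.0],
                  [0.0, 1.0]])

    T = 5000
    q1 = np.ones(2)        # row player's propensities (assumed positive)
    q2 = np.ones(2)        # column player's propensities
    r1 = q1 / q1.sum()     # deterministic replicator state (row)
    r2 = q2 / q2.sum()     # deterministic replicator state (column)
    Qd = q1.sum()          # deterministic total propensity at t = 0

    for t in range(T):
        p1 = q1 / q1.sum()
        p2 = q2 / q2.sum()
        a1 = rng.choice(2, p=p1)
        a2 = rng.choice(2, p=p2)
        # Law of Effect: add the realized payoff to the action played.
        q1[a1] += A[a1, a2]
        q2[a2] += A[a1, a2]

        # Replicator step with the same vanishing step size 1/Q(t).
        u1 = A @ r2        # expected payoffs to the row player's actions
        u2 = A.T @ r1      # expected payoffs to the column player's actions
        ubar1 = r1 @ u1
        r1 = r1 + (r1 / Qd) * (u1 - ubar1)
        r2 = r2 + (r2 / Qd) * (u2 - r2 @ u2)
        Qd += ubar1        # totals grow by the expected payoff each round;
                           # by symmetry one total serves both players

    print("stochastic  Pr[action 0]:", q1[0] / q1.sum())
    print("replicator  Pr[action 0]:", r1[0])

Since the strict equilibria attract the replicator path exponentially fast, the stochastic choice probabilities should typically track the deterministic solution ever more closely as the step size 1/Q(t) dies out, which is the content of (b).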

Text: 0203.pdf (428kB)

More information

Published date: 2002

Identifiers

Local EPrints ID: 33094
URI: http://eprints.soton.ac.uk/id/eprint/33094
PURE UUID: 38b1dc4d-4419-44e7-b0a4-b7710cdb2212
ORCID for A. Ianni: orcid.org/0000-0002-5003-4482

Catalogue record

Date deposited: 18 May 2006
Last modified: 16 Mar 2024 02:51



