Reinforcement learning and the power law of practice: some analytical results
Ianni, A. (2002) Reinforcement learning and the power law of practice: some analytical results. Southampton, GB, University of Southampton, 36pp. (Discussion Papers in Economics and Econometrics, 0203).
Erev and Roth (1998), among others, provide a comprehensive analysis of experimental evidence on learning in games, based on a stochastic model of learning that accounts for two main elements: the Law of Effect (positive reinforcement of actions that perform well) and the Power Law of Practice (learning curves tend to be steeper initially). This note complements this literature by providing an analytical study of the properties of such learning models. Specifically, it shows that:
(a) up to an error term, the stochastic process is driven by a system of discrete time difference equations of the replicator type. This carries an analogy with Börgers and Sarin (1997), where reinforcement learning accounts only for the Law of Effect.
(b) if the trajectories of the system of replicator equations converge sufficiently fast, then the probability that all realizations of the learning process over a possibly infinite spell of time lie within a given small distance of the solution path of the replicator dynamics becomes, from some time on, arbitrarily close to one. Fast convergence, in the form of exponential convergence, is shown to hold for any strict Nash equilibrium of the underlying game.
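As a rough illustration of result (a), the sketch below compares the exact expected one-step change in choice probabilities under a basic cumulative-payoff reinforcement rule with a replicator-type term. It is a simplified, single-agent setting and not the paper's own formulation: the update rule (add the realised payoff to the chosen action's propensity), the fixed positive payoff vector, and all function names are assumptions made for illustration. The step size 1 / sum(q) shrinks as propensities accumulate, which is one way to read the Power Law of Practice.

```python
import random

# Assumed rule (not taken from the paper): choice probabilities are
# proportional to propensities, and the chosen action's propensity
# grows by the payoff it earned. Payoffs are assumed positive and fixed.

def choice_probs(q):
    """Choice probabilities proportional to propensities."""
    total = sum(q)
    return [qi / total for qi in q]

def reinforce(q, action, payoff):
    """Law of Effect: the chosen action's propensity grows by its payoff."""
    return [qi + payoff if i == action else qi for i, qi in enumerate(q)]

def exact_drift(q, payoffs):
    """Exact expected one-step change in each choice probability."""
    p = choice_probs(q)
    drift = []
    for i in range(len(q)):
        expected_next = sum(
            p[k] * choice_probs(reinforce(q, k, payoffs[k]))[i]
            for k in range(len(q))
        )
        drift.append(expected_next - p[i])
    return drift

def replicator_drift(q, payoffs):
    """Replicator-type term p_i * (u_i - mean payoff), scaled by 1 / sum(q)."""
    p = choice_probs(q)
    mean_payoff = sum(pk * uk for pk, uk in zip(p, payoffs))
    return [pi * (ui - mean_payoff) / sum(q) for pi, ui in zip(p, payoffs)]

def simulate(q0, payoffs, steps, seed=0):
    """Run the stochastic learning process and return final propensities."""
    rng = random.Random(seed)
    q = list(q0)
    for _ in range(steps):
        action = rng.choices(range(len(q)), weights=q)[0]
        q = reinforce(q, action, payoffs[action])
    return q
```

In this toy setting the gap between `exact_drift` and `replicator_drift` is of order 1 / sum(q)^2, so it becomes negligible relative to the replicator term as experience accumulates. Calling `simulate([1.0, 1.0], [2.0, 1.0], 200)` and passing the result to `choice_probs` gives one realisation of the learning path.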
|Item Type:||Monograph (Discussion Paper)|
|Subjects:||H Social Sciences > HB Economic Theory; Q Science > QA Mathematics|
|Divisions:||University Structure - Pre August 2011 > School of Social Sciences > Economics|
|Date Deposited:||18 May 2006|
|Last Modified:||27 Mar 2014 18:20|
Reinforcement Learning: Analytical Results and Methodology for Estimation
Funded by: ESRC (R000223704)
January 2002 to December 2003
|Publisher:||University of Southampton|