University of Southampton Institutional Repository

Strategyproof reinforcement learning for online resource allocation

Stein, Sebastian, Ochal, Mateusz, Moisoiu, Ioana-Adriana, Gerding, Enrico, Ganti, Raghu, He, Ting and La Porta, Tom (2020) Strategyproof reinforcement learning for online resource allocation. In AAMAS '20: Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems. University of Auckland. pp. 1296–1304. (doi:10.5555/3398761.3398911).

Record type: Conference or Workshop Item (Paper)

Abstract

We consider an online resource allocation problem where tasks with specific values, sizes and resource requirements arrive dynamically over time, and have to be either serviced or rejected immediately. Reinforcement learning is a promising approach for this, but existing work on reinforcement learning has neglected that task owners may misreport their task requirements or values strategically when this is to their benefit. To address this, we apply mechanism design and propose a novel mechanism based on reinforcement learning that aims to maximise social welfare, is strategyproof and individually rational (i.e., truthful reporting and participation are incentivised). In experiments, we show that our algorithm achieves results that are typically within 90% of the optimal social welfare, while outperforming approaches that use fixed pricing (by up to 86% in specific cases).

More information

Accepted/In Press date: 15 January 2020
Published date: 9 May 2020
Venue - Dates: 19th International Conference on Autonomous Agents and Multiagent Systems, New Zealand, 2020-05-09 - 2020-05-13

Identifiers

Local EPrints ID: 438382
URI: http://eprints.soton.ac.uk/id/eprint/438382
PURE UUID: 22f6276f-d428-47cd-8e17-7348796ae76f
ORCID for Enrico Gerding: orcid.org/0000-0001-7200-552X

Catalogue record

Date deposited: 09 Mar 2020 17:30
Last modified: 06 Aug 2020 01:36
