The University of Southampton
University of Southampton Institutional Repository

Online Learning against Strategic Adversary


Dinh, Le Cong (2022) Online Learning against Strategic Adversary. In International Conference on Autonomous Agents and Multiagent Systems. International Foundation for Autonomous Agents and Multiagent Systems. pp. 1841-1842.

Record type: Conference or Workshop Item (Paper)

Abstract

Our work considers repeated games in which one player has a different objective than the others. In particular, we investigate repeated two-player zero-sum games in which the column player aims not only to minimize her regret but also to stabilize the actions. Suppose that, while repeatedly playing this game, the row player chooses her strategy at each round using a no-regret algorithm to minimize her regret. We develop a no-dynamic regret algorithm for the column player that exhibits last-round convergence to a minimax equilibrium. We show that our algorithm is efficient against a large set of popular no-regret algorithms the row player can use, including the multiplicative weights update algorithm, general follow-the-regularized-leader, and any no-regret algorithm that satisfies a so-called “stability” property. Our algorithm can be applied to the game setting in which the column player is also a designer of the system and has full control over the payoff matrices.
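
As a rough illustration of the setting only, the sketch below runs the multiplicative weights update for the row player in a small zero-sum game. It is not the column player's algorithm from the paper, which this record does not describe. The function name `mwu_row_player`, the 2x2 loss matrix, the step size, the convention that the row player minimizes the loss, and the best-responding column opponent are all assumptions made for the example.

```python
import math


def mwu_row_player(loss_matrix, rounds, step_size):
    """Multiplicative weights update for the row player (hypothetical helper).

    Assumes the row player minimizes the loss loss_matrix[i][j], with losses
    in [0, 1], and that the column player best-responds each round (an
    assumption of this sketch, not the opponent studied in the paper).
    Returns the row player's time-averaged strategy and average loss.
    """
    n_rows = len(loss_matrix)
    n_cols = len(loss_matrix[0])
    strategy = [1.0 / n_rows] * n_rows  # start from the uniform strategy
    avg_strategy = [0.0] * n_rows
    total_loss = 0.0

    for _ in range(rounds):
        # Expected loss of the current mixed strategy against each column.
        expected = [
            sum(strategy[i] * loss_matrix[i][j] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Assumed opponent: pick the column that hurts the row player most.
        col = max(range(n_cols), key=lambda j: expected[j])
        total_loss += expected[col]

        for i in range(n_rows):
            avg_strategy[i] += strategy[i] / rounds

        # Exponential reweighting by the observed loss, then renormalize.
        weights = [
            strategy[i] * math.exp(-step_size * loss_matrix[i][col])
            for i in range(n_rows)
        ]
        norm = sum(weights)
        strategy = [w / norm for w in weights]

    return avg_strategy, total_loss / rounds


# Illustrative game (assumed): the row player loses 1 when the actions match.
# Its minimax value is 0.5, attained by the uniform strategy.
GAME = [[1.0, 0.0], [0.0, 1.0]]
average_strategy, average_loss = mwu_row_player(GAME, rounds=2000, step_size=0.05)
print(average_strategy, average_loss)
```

In this sketch the time-averaged strategy approaches the uniform equilibrium, while the per-round strategy keeps oscillating around it. That gap between average and last-round behaviour is what the last-round convergence result in the abstract addresses.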

This record has no associated files available for download.

More information

Published date: 9 May 2022
Venue - Dates: 21st International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2022, Auckland, Virtual, New Zealand, 2022-05-09 - 2022-05-13

Identifiers

Local EPrints ID: 471007
URI: http://eprints.soton.ac.uk/id/eprint/471007
PURE UUID: 5cf42b5b-4389-4460-88c1-58d3c5704125

Catalogue record

Date deposited: 24 Oct 2022 16:40
Last modified: 24 Oct 2022 17:01

Contributors

Author: Le Cong Dinh
