Playing repeated network interdiction games with semi-bandit feedback
Guo, Qingyu
An, Bo
Tran-Thanh, Long
Guo, Qingyu, An, Bo and Tran-Thanh, Long (2017) Playing repeated network interdiction games with semi-bandit feedback. In Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17). 9 pp. (In Press)
Record type: Conference or Workshop Item (Paper)
Abstract
We study repeated network interdiction games in which the defender has no prior knowledge of the adversary or the environment, a setting that models many real-world network security domains. Existing work often assumes that the defender has abundant information and neglects the frequent interactions between the two players; these assumptions are unrealistic, so such approaches are not suitable for our setting. We therefore provide the first defender strategy with both theoretical and practical performance guarantees, obtained by applying an adversarial online learning approach. In particular, we model the repeated network interdiction game with no prior knowledge as an online linear optimization problem and propose a novel and efficient online learning algorithm, SBGA, which exploits the semi-bandit feedback unique to network security domains. We prove that SBGA achieves sublinear regret against an adaptive adversary, compared with both the best fixed strategy in hindsight and a near-optimal adaptive strategy. Extensive experiments also show that SBGA significantly outperforms existing approaches and converges quickly.
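To make the terms in the abstract concrete, the following is a minimal sketch of semi-bandit feedback and of regret against the best fixed strategy in hindsight. It is not SBGA and is not taken from the paper. It assumes, purely for illustration, that the defender interdicts k edges per round, that each edge carries a numeric reward in each round, and that a strategy's reward is the sum over its edges (the linear structure of online linear optimization). The function names and the toy data are hypothetical.

```python
from itertools import combinations


def semi_bandit_feedback(chosen_edges, edge_rewards):
    """Reveal rewards only for the edges the defender interdicted.

    Under full-information feedback the whole reward vector would be seen;
    under bandit feedback only the total. Semi-bandit sits in between.
    """
    return {e: edge_rewards[e] for e in chosen_edges}


def regret_vs_best_fixed(choices, rewards, k):
    """Regret against the best fixed set of k edges chosen in hindsight.

    choices: one tuple of edge indices per round (the defender's plays).
    rewards: one per-edge reward list per round (illustrative values).
    """
    n_edges = len(rewards[0])
    # Reward the defender actually collected over all rounds.
    earned = sum(sum(r[e] for e in c) for c, r in zip(choices, rewards))
    # Cumulative reward of each single edge over all rounds.
    totals = [sum(r[e] for r in rewards) for e in range(n_edges)]
    # Best fixed k-edge strategy, found by brute force (small graphs only).
    best_fixed = max(
        sum(totals[e] for e in s) for s in combinations(range(n_edges), k)
    )
    return best_fixed - earned


# Toy instance: 3 edges, 3 rounds, defender interdicts 2 edges per round.
rewards = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 0.0, 0.5]]
choices = [(1, 2), (0, 2), (0, 2)]

observed = semi_bandit_feedback(choices[0], rewards[0])
regret = regret_vs_best_fixed(choices, rewards, k=2)
```

In this toy instance the defender never learns the round-one reward of edge 0, because that edge was not interdicted in that round. A learner has sublinear regret when the value returned by regret_vs_best_fixed grows more slowly than the number of rounds.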
Text: online learning in network flow interdiction game - Accepted Manuscript
More information
Accepted/In Press date: 23 April 2017
Organisations:
Agents, Interactions & Complexity
Identifiers
Local EPrints ID: 411950
URI: http://eprints.soton.ac.uk/id/eprint/411950
PURE UUID: b81f5918-ba9a-43ac-972e-a5310e53aa00
Catalogue record
Date deposited: 03 Jul 2017 16:31
Last modified: 15 Mar 2024 15:05
Contributors
Author: Qingyu Guo
Author: Bo An
Author: Long Tran-Thanh