University of Southampton Institutional Repository

Pareto-optimal multi-agent cooperative caching relying on multi-policy reinforcement learning

Guo, Boyang
47547be3-a6fe-4646-a3f0-bebb7271ca02
Chen, Youjia
e20460e7-386e-488c-b00d-6be16202eab6
Cheng, Peng
fb8e4c84-f337-406c-8e7d-85f101faad4b
Ding, Ming
4f284c12-a6ec-41cd-8e89-4ae573066452
Hu, Jinsong
074aef33-05a1-4526-bed7-a97526839e62
Hanzo, Lajos
66e7266f-3066-4fc0-8391-e000acce71a1

Guo, Boyang, Chen, Youjia, Cheng, Peng, Ding, Ming, Hu, Jinsong and Hanzo, Lajos (2023) Pareto-optimal multi-agent cooperative caching relying on multi-policy reinforcement learning. IEEE Internet of Things Journal, 1. (doi:10.1109/JIOT.2023.3317971).

Record type: Article

Abstract

Given the popularity of flawless telepresence and the resultant explosive growth of wireless video applications, satisfying the demanding user requirements for video quality has become another important goal of network operators, in addition to handling the traffic surge. Inspired by this, cooperative edge caching intrinsically amalgamated with scalable video coding is investigated. Explicitly, the concept of a Pareto-optimal semi-distributed multi-agent multi-policy deep reinforcement learning (SD-MAMP-DRL) algorithm is conceived for managing the cooperation of heterogeneous network nodes. To elaborate, a multi-policy reinforcement learning algorithm is proposed for finding, during the training phase, the Pareto-optimal policies that balance the tele-traffic vs. user-experience trade-off. The optimal policy/solution can then be activated during the execution phase by appropriately selecting the associated weighting coefficient according to the dynamically fluctuating network traffic load. Our experimental results show that the proposed SD-MAMP-DRL algorithm 1) achieves better performance than the benchmark algorithms; and 2) obtains a near-complete Pareto-front in various scenarios and selects the optimal solution by adaptively balancing the above-mentioned pair of objectives.
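
As a minimal illustration of the execution-phase idea described in the abstract, the Python sketch below shows how a traffic-dependent weighting coefficient could pick one policy from a pre-computed Pareto set via weighted-sum scalarisation. The policy names, objective values, and the load-to-weight mapping are illustrative assumptions, not the paper's implementation; the abstract does not specify the exact scalarisation used.

# Illustrative sketch (not the authors' code): execution-phase policy selection
# from a set of Pareto-optimal policies via weighted-sum scalarisation.
# Policy names, objective values and the load-to-weight mapping are assumed.

# Each trained policy is summarised by two objectives:
# negated traffic cost (higher = less traffic) and user QoE (higher = better).
pareto_policies = {
    "policy_a": (-0.2, 0.4),   # light traffic, modest QoE
    "policy_b": (-0.5, 0.7),
    "policy_c": (-0.9, 0.9),   # heavy traffic, high QoE
}

def select_policy(traffic_load: float) -> str:
    """Return the policy maximising the weighted sum of the two objectives.

    A heavier network load shifts the weight towards the traffic objective,
    a lighter load towards user experience (illustrative mapping).
    """
    w_traffic = min(max(traffic_load, 0.0), 1.0)   # clamp weight to [0, 1]
    w_qoe = 1.0 - w_traffic
    scores = {
        name: w_traffic * neg_traffic + w_qoe * qoe
        for name, (neg_traffic, qoe) in pareto_policies.items()
    }
    return max(scores, key=scores.get)

print(select_policy(0.8))   # heavy load -> "policy_a" (lowest traffic cost)
print(select_policy(0.1))   # light load -> "policy_c" (highest QoE)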

Text
MORL - Accepted Manuscript
Available under License Creative Commons Attribution.
Download (3MB)

More information

Accepted/In Press date: 14 September 2023
e-pub ahead of print date: 22 September 2023
Additional Information: Publisher Copyright: IEEE
Keywords: Cooperative caching, Costs, Edge caching, Pareto-front, Quality of experience, Reinforcement learning, Servers, Training, Wireless communication, multi-agent reinforcement learning, multi-objective optimization, scalable video coding

Identifiers

Local EPrints ID: 482348
URI: http://eprints.soton.ac.uk/id/eprint/482348
ISSN: 2327-4662
PURE UUID: 87262ba7-9c5b-4d99-9907-5aaead314aab
ORCID for Lajos Hanzo: orcid.org/0000-0002-2636-5214

Catalogue record

Date deposited: 27 Sep 2023 16:41
Last modified: 18 Mar 2024 02:36

Contributors

Author: Boyang Guo
Author: Youjia Chen
Author: Peng Cheng
Author: Ming Ding
Author: Jinsong Hu
Author: Lajos Hanzo


