University of Southampton Institutional Repository

Pareto-optimal multi-agent cooperative caching relying on multi-policy reinforcement learning

Guo, Boyang
47547be3-a6fe-4646-a3f0-bebb7271ca02
Chen, Youjia
e20460e7-386e-488c-b00d-6be16202eab6
Cheng, Peng
fb8e4c84-f337-406c-8e7d-85f101faad4b
Ding, Ming
4f284c12-a6ec-41cd-8e89-4ae573066452
Hu, Jinsong
074aef33-05a1-4526-bed7-a97526839e62
Hanzo, Lajos
66e7266f-3066-4fc0-8391-e000acce71a1

Guo, Boyang, Chen, Youjia, Cheng, Peng, Ding, Ming, Hu, Jinsong and Hanzo, Lajos (2023) Pareto-optimal multi-agent cooperative caching relying on multi-policy reinforcement learning. IEEE Internet of Things Journal, 1. (doi:10.1109/JIOT.2023.3317971).

Record type: Article

Abstract

Given the popularity of flawless telepresence and the resultant explosive growth of wireless video applications, satisfying the demanding user requirements for video quality has become another important goal of network operators, in addition to handling the traffic surge. Inspired by this, cooperative edge caching intrinsically amalgamated with scalable video coding is investigated. Explicitly, the concept of a Pareto-optimal semi-distributed multi-agent multi-policy deep reinforcement learning (SD-MAMP-DRL) algorithm is conceived for managing the cooperation of heterogeneous network nodes. To elaborate, a multi-policy reinforcement learning algorithm is proposed for finding, during the training phase, the Pareto-optimal policies that balance the tele-traffic vs. user-experience trade-off. The optimal policy/solution can then be activated during the execution phase by appropriately selecting the associated weighting coefficient according to the dynamically fluctuating network traffic load. Our experimental results show that the proposed SD-MAMP-DRL algorithm 1) achieves better performance than the benchmark algorithms; and 2) obtains a near-complete Pareto-front in various scenarios and selects the optimal solution by adaptively balancing the above-mentioned pair of objectives.
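
As a minimal illustration of the execution-phase idea described in the abstract, the Python sketch below shows how a traffic-dependent weighting coefficient could pick one policy from a pre-computed Pareto set via weighted-sum scalarisation. The policy names, objective values, and the load-to-weight mapping are illustrative assumptions, not the paper's implementation; the abstract does not specify the exact scalarisation used.

# Illustrative sketch (not the authors' code): execution-phase policy selection
# from a set of Pareto-optimal policies via weighted-sum scalarisation.
# Policy names, objective values and the load-to-weight mapping are assumed.

# Each trained policy is summarised by two objectives:
# negated traffic cost (higher = less traffic) and user QoE (higher = better).
pareto_policies = {
    "policy_a": (-0.2, 0.4),   # light traffic, modest QoE
    "policy_b": (-0.5, 0.7),
    "policy_c": (-0.9, 0.9),   # heavy traffic, high QoE
}

def select_policy(traffic_load: float) -> str:
    """Return the policy maximising the weighted sum of the two objectives.

    A heavier network load shifts the weight towards the traffic objective,
    a lighter load towards user experience (illustrative mapping).
    """
    w_traffic = min(max(traffic_load, 0.0), 1.0)   # clamp weight to [0, 1]
    w_qoe = 1.0 - w_traffic
    scores = {
        name: w_traffic * neg_traffic + w_qoe * qoe
        for name, (neg_traffic, qoe) in pareto_policies.items()
    }
    return max(scores, key=scores.get)

print(select_policy(0.8))   # heavy load -> "policy_a" (lowest traffic cost)
print(select_policy(0.1))   # light load -> "policy_c" (highest QoE)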

Text
MORL - Accepted Manuscript
Available under License Creative Commons Attribution.
Download (3MB)

More information

Accepted/In Press date: 14 September 2023
e-pub ahead of print date: 22 September 2023
Additional Information: Publisher Copyright: IEEE
Keywords: Cooperative caching, Costs, Edge caching, Pareto-front, Quality of experience, Reinforcement learning, Servers, Training, Wireless communication, multi-agent reinforcement learning, multi-objective optimization, scalable video coding

Identifiers

Local EPrints ID: 482348
URI: http://eprints.soton.ac.uk/id/eprint/482348
ISSN: 2327-4662
PURE UUID: 87262ba7-9c5b-4d99-9907-5aaead314aab
ORCID for Lajos Hanzo: orcid.org/0000-0002-2636-5214

Catalogue record

Date deposited: 27 Sep 2023 16:41
Last modified: 18 Mar 2024 02:36

Contributors

Author: Boyang Guo
Author: Youjia Chen
Author: Peng Cheng
Author: Ming Ding
Author: Jinsong Hu
Author: Lajos Hanzo


