Deep reinforcement learning for portfolio selection
Jiang, Yifu
Olmo, Jose
Atwi, Majed
16 July 2024
Jiang, Yifu, Olmo, Jose and Atwi, Majed (2024) Deep reinforcement learning for portfolio selection. Global Finance Journal, 62, [101016]. (doi:10.1016/j.gfj.2024.101016).
Abstract
This study proposes an advanced model-free deep reinforcement learning (DRL) framework to construct optimal portfolio strategies in dynamic, complex, and large-dimensional financial markets. Investors' risk aversion and transaction cost constraints are embedded in an extended Markowitz's mean-variance reward function by employing a twin-delayed deep deterministic policy gradient (TD3) algorithm. This study designs a DRL-TD3-based risk and transaction cost-sensitive portfolio that combines advanced exploration strategies and dynamic policy updates. The proposed portfolio method effectively addresses the challenges posed by high-dimensional state and action spaces in complex financial markets. This methodology provides two optimal portfolios by flexibly controlling transaction and risk costs with (i) the constituents of the Dow Jones Industrial Average and (ii) the constituents of the S&P100 index. Results demonstrate a strong portfolio performance of the proposed DRL portfolio compared to those of several competitors from the traditional and DRL literatures.
Text: 1-s2.0-S1044028324000887-main (Version of Record)
More information
Accepted/In Press date: 11 July 2024
e-pub ahead of print date: 14 July 2024
Published date: 16 July 2024
Keywords:
Deep reinforcement learning, Portfolio constraint, Portfolio risk awareness, Portfolio trading, Transaction cost
Identifiers
Local EPrints ID: 495610
URI: http://eprints.soton.ac.uk/id/eprint/495610
ISSN: 1044-0283
PURE UUID: 629d116f-1dbd-4213-8efe-323bfcf1c29e
Catalogue record
Date deposited: 19 Nov 2024 17:41
Last modified: 20 Nov 2024 02:45
Contributors
Author: Yifu Jiang
Author: Jose Olmo
Author: Majed Atwi