The University of Southampton
University of Southampton Institutional Repository

Deep reinforcement learning for portfolio selection

Jiang, Yifu, Olmo, Jose and Atwi, Majed (2024) Deep reinforcement learning for portfolio selection. Global Finance Journal, 62, [101016]. (doi:10.1016/j.gfj.2024.101016).

Record type: Article

Abstract

This study proposes a model-free deep reinforcement learning (DRL) framework for constructing optimal portfolio strategies in dynamic, complex, and large-dimensional financial markets. Investors' risk aversion and transaction cost constraints are embedded in an extended Markowitz mean-variance reward function, which is optimized with a twin-delayed deep deterministic policy gradient (TD3) algorithm. The resulting DRL-TD3 portfolio is sensitive to both risk and transaction costs, and combines advanced exploration strategies with dynamic policy updates. The method addresses the challenges posed by high-dimensional state and action spaces in complex financial markets. By flexibly controlling transaction and risk costs, the methodology yields two optimal portfolios, built from (i) the constituents of the Dow Jones Industrial Average and (ii) the constituents of the S&P 100 index. The proposed DRL portfolio performs strongly against several competitors from the traditional and DRL literatures.
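
As a rough illustration of the kind of reward the abstract describes, the sketch below computes a one-step mean-variance reward with a transaction cost penalty. It is not the authors' formulation, which this record does not give: the function name `mean_variance_reward`, the parameters `risk_aversion` and `cost_rate`, the factor of one half on the variance term, and the turnover-proportional cost are all assumptions made for the example.

```python
import numpy as np


def mean_variance_reward(weights, prev_weights, asset_returns, cov,
                         risk_aversion=1.0, cost_rate=0.001):
    """Hypothetical one-step reward: return minus risk and trading penalties.

    All names and the exact functional form are illustrative assumptions,
    not taken from the paper.
    """
    w = np.asarray(weights, dtype=float)
    w_prev = np.asarray(prev_weights, dtype=float)
    r = np.asarray(asset_returns, dtype=float)
    sigma = np.asarray(cov, dtype=float)

    # Realized portfolio return for the period.
    portfolio_return = float(w @ r)

    # Markowitz-style risk penalty: (risk_aversion / 2) * w' Sigma w.
    variance_penalty = 0.5 * risk_aversion * float(w @ sigma @ w)

    # Assumed proportional cost on turnover (sum of absolute weight changes).
    turnover = float(np.abs(w - w_prev).sum())
    transaction_cost = cost_rate * turnover

    return portfolio_return - variance_penalty - transaction_cost


# Two assets, rebalancing from fully invested in asset 0 to an equal split.
reward = mean_variance_reward(
    weights=[0.5, 0.5],
    prev_weights=[1.0, 0.0],
    asset_returns=[0.02, 0.00],
    cov=[[0.04, 0.0], [0.0, 0.01]],
    risk_aversion=2.0,
    cost_rate=0.001,
)
print(round(reward, 6))
```

In a DRL setting, a reward of this shape would be returned by the environment after each rebalancing action, so that raising `risk_aversion` or `cost_rate` makes the agent more conservative about volatile allocations or frequent trading, respectively.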

Text: 1-s2.0-S1044028324000887-main - Version of Record (6MB)

More information

Accepted/In Press date: 11 July 2024
e-pub ahead of print date: 14 July 2024
Published date: 16 July 2024
Keywords: Deep reinforcement learning, Portfolio constraint, Portfolio risk awareness, Portfolio trading, Transaction cost

Identifiers

Local EPrints ID: 495610
URI: http://eprints.soton.ac.uk/id/eprint/495610
ISSN: 1044-0283
PURE UUID: 629d116f-1dbd-4213-8efe-323bfcf1c29e
ORCID for Jose Olmo: orcid.org/0000-0002-0437-7812

Catalogue record

Date deposited: 19 Nov 2024 17:41
Last modified: 20 Nov 2024 02:45

Contributors

Author: Yifu Jiang
Author: Jose Olmo
Author: Majed Atwi
