Forecasting financial markets with online information
This thesis explores the relationship between what investors say on online social media and price movements in financial markets. Recent studies have applied techniques from the natural language processing literature to distil the content of blogs, micro-blogs and user-generated content sites into a ‘sentiment’ measure that indicates whether the content is good or bad for a given stock. Sentiment is then tracked as a time series and compared with stock price returns. There is general agreement in the literature that a relationship between online sentiment and returns exists, but the strength, sign and timing of this relationship vary across studies.
In this thesis I argue that this sign-strength-timing variability is an inherent part of the sentiment-price relationship. My rationale is that existing sentiment metrics miss important contextual information that can significantly alter the interpretation of a piece of text. Because sentiment measures lack this contextual awareness, the relationship between sentiment and price will vary with factors that the sentiment measure does not capture.
Based on this argument I make three key contributions in this thesis. First, I document significant evidence that sign, strength and timing variability are a characteristic feature of the online textual sentiment-price relationship. Second, I develop a novel time series analysis methodology, signal diffusion mapping (SDM), that is capable of modelling and forecasting effectively from relationships characterised by this type of variability. Third, I show that when the sentiment signal is appropriately modelled using SDM, it can be used to forecast prices. Using this methodology I document that the sentiment-price relationship is much stronger than has previously been assumed in the literature. I go on to show that it is possible to develop trading strategies based on SDM that generate excess returns once reasonable costs have been accounted for.
I conclude that there is economically meaningful financial information in online social media, and that a characteristic of this information with respect to prices is variability. Modelling this variability more accurately using SDM opens up the possibility of using online information directly in asset pricing models or trading strategies.
University of Southampton
Gaskell, Paul
9d855bff-86a2-4995-965a-3cbd968c5b8a
September 2015
Mcgroarty, Frank
693a5396-8e01-4d68-8973-d74184c03072
Gaskell, Paul (2015) Forecasting financial markets with online information. University of Southampton, Faculty of Business, Law and Art, Doctoral Thesis, 167pp.
Record type: Thesis (Doctoral)
Text: Final PhD thesis - Paul Gaskell.pdf - Version of Record
More information
Published date: September 2015
Organisations: University of Southampton, Southampton Business School
Identifiers
Local EPrints ID: 393779
URI: http://eprints.soton.ac.uk/id/eprint/393779
PURE UUID: 639bcf5f-4f57-4155-aad6-5d1b0cf01bf4
Catalogue record
Date deposited: 05 Jul 2016 15:26
Last modified: 15 Mar 2024 03:17
Contributors
Author: Paul Gaskell
Thesis advisor: Frank Mcgroarty