Online news popularity prediction before publication: effect of readability, emotion, psycholinguistics features
Pages: 539-545
Rajagopal, Suharshala, Kadan, Anoop, P. Gangan, Manjary and V. L., Lajish (2022) Online news popularity prediction before publication: effect of readability, emotion, psycholinguistics features. IAES International Journal of Artificial Intelligence (IJ-AI), 11 (2), 539-545. (doi:10.11591/ijai.v11.i2.pp539-545).
Abstract
The development of the World Wide Web, with easy access to massive information sources anywhere and at any time, has led more people to rely on online news media rather than print media. This has driven rapid growth of the online news industry and created substantial competitive pressure. In this work, we propose a set of hybrid features for predicting the popularity of online news before publication. Two categories of features are extracted from news articles: conventional features, comprising metadata, temporal, contextual, and embedding vector features, and enhanced features, comprising readability, emotion, and psycholinguistics features. Apart from analyzing the effectiveness of conventional and enhanced features, we combine them into a set of hybrid features. We curate an Indian news dataset (IND) consisting of news articles from the most highly rated Indian news websites for the study, and contribute the dataset for future research. Evaluations are performed on the IND dataset and compared with performance on the benchmark Mashable dataset using various supervised machine learning models. Our results indicate that the proposed hybrid of enhanced and conventional features is highly effective for online news popularity prediction before publication.
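As a rough illustration of how an enhanced feature might be appended to conventional ones, the sketch below computes a readability score and concatenates it with an existing feature vector. The abstract does not say which readability measures the authors use; Flesch Reading Ease is assumed here only as a common example, the syllable counter is a crude heuristic, and the function names (`count_syllables`, `flesch_reading_ease`, `hybrid_features`) are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per run of vowels, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


def hybrid_features(conventional: list[float], text: str) -> list[float]:
    """Concatenate precomputed conventional features with simple readability features.

    `conventional` stands in for metadata/temporal/contextual values computed
    elsewhere; the two readability values are illustrative choices only.
    """
    words = re.findall(r"[A-Za-z']+", text)
    avg_word_len = sum(len(w) for w in words) / len(words) if words else 0.0
    enhanced = [flesch_reading_ease(text), avg_word_len]
    return list(conventional) + enhanced


if __name__ == "__main__":
    article = "The cat sat. The dog ran."
    print(hybrid_features([0.5, 3.0], article))
```

The resulting vector could then be passed to any supervised model; emotion and psycholinguistics features would be appended in the same way once computed.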
Text
21095-41119-1-PB - Version of Record
More information
Accepted/In Press date: 20 January 2022
e-pub ahead of print date: 1 June 2022
Keywords:
Emotion features, News popularity prediction, Online news media, Psycholinguistics features, Readability features
Identifiers
Local EPrints ID: 494237
URI: http://eprints.soton.ac.uk/id/eprint/494237
ISSN: 2252-8938
PURE UUID: 0fe55e0c-b139-48f4-a9ab-9bd640f27a49
Catalogue record
Date deposited: 01 Oct 2024 16:51
Last modified: 31 Oct 2024 03:15
Contributors
Author: Suharshala Rajagopal
Author: Anoop Kadan
Author: Manjary P. Gangan
Author: Lajish V. L.