The University of Southampton
University of Southampton Institutional Repository

Online news popularity prediction before publication: effect of readability, emotion, psycholinguistics features


Rajagopal, Suharshala, Kadan, Anoop, P. Gangan, Manjary and V. L., Lajish (2022) Online news popularity prediction before publication: effect of readability, emotion, psycholinguistics features. IAES International Journal of Artificial Intelligence (IJ-AI), 11 (2), 539-545. (doi:10.11591/ijai.v11.i2.pp539-545).

Record type: Article

Abstract

The development of the World Wide Web, with easy access to massive information sources anywhere and anytime, paves the way for more people to rely on online news media rather than print media. This scenario expedites the rapid growth of online news industries and leads to substantial competitive pressure. In this work, we propose a set of hybrid features for online news popularity prediction before publication. Two categories of features are extracted from news articles: conventional features, comprising metadata, temporal, contextual, and embedding vector features, and enhanced features, comprising readability, emotion, and psycholinguistics features. Apart from analyzing the effectiveness of conventional and enhanced features, we combine them into a set of hybrid features. We curate an Indian news dataset (IND) consisting of news articles from the most rated Indian news websites for the study and also contribute the dataset for future research. Evaluations are performed over IND and compared with the performance over the benchmark Mashable dataset using various supervised machine learning models. Our results indicate that the proposed hybrid of enhanced and conventional features is highly effective for online news popularity prediction before publication.
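
As a rough illustration of the hybrid-feature idea described in the abstract, the sketch below computes one readability score and appends it to a vector of conventional features. The abstract does not say which readability measure or which conventional features the authors used; Flesch Reading Ease, the vowel-group syllable heuristic, and the helper names `flesch_reading_ease` and `hybrid_features` are all assumptions chosen for illustration, not the paper's method.

```python
import re
from typing import List, Sequence


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count runs of consecutive vowels (heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentence) - 84.6*(syllables/word)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


def hybrid_features(conventional: Sequence[float], enhanced: Sequence[float]) -> List[float]:
    """Concatenate conventional and enhanced features into one hybrid vector."""
    return list(conventional) + list(enhanced)


article = "The cat sat on the mat. It was happy."
# Hypothetical conventional features, e.g. image count and publication hour.
conventional = [2.0, 9.0]
enhanced = [flesch_reading_ease(article)]
features = hybrid_features(conventional, enhanced)
```

A vector like `features` could then be passed to any supervised learner. Simple concatenation is used here only because it is the most direct way to combine two feature groups.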

Text
21095-41119-1-PB - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 20 January 2022
e-pub ahead of print date: 1 June 2022
Keywords: Emotion features, News popularity prediction, Online news media, Psycholinguistics features, Readability features

Identifiers

Local EPrints ID: 494237
URI: http://eprints.soton.ac.uk/id/eprint/494237
ISSN: 2252-8938
PURE UUID: 0fe55e0c-b139-48f4-a9ab-9bd640f27a49
ORCID for Anoop Kadan: orcid.org/0000-0002-4335-5544

Catalogue record

Date deposited: 01 Oct 2024 16:51
Last modified: 31 Oct 2024 03:15

Contributors

Author: Suharshala Rajagopal
Author: Anoop Kadan
Author: Manjary P. Gangan
Author: Lajish V. L.
