Understanding latent affective bias in large pre-trained neural language models
Kadan, Anoop
9cc17e26-a329-49fe-b73b-2fce75084966
P, Deepak
1c9443bb-3a4f-49c7-b42f-b978ae5b2a11
Bhadra, Sahely
4d58db35-9644-4837-86ab-940577dc2dea
P. Gangan, Manjary
f1f79b4a-2662-4f0c-ad33-dbb0cbf2512b
V. L., Lajish
e7f39205-51be-4d69-8fc1-4c7b3feddef7
10 March 2024
Kadan, Anoop, P, Deepak, Bhadra, Sahely, P. Gangan, Manjary and V. L., Lajish (2024) Understanding latent affective bias in large pre-trained neural language models. Natural Language Processing Journal, 7, [100062]. (doi:10.1016/j.nlp.2024.100062).
Abstract
The development of transformer-based large Pre-trained Language Models (PLMs) has produced groundbreaking advances and substantial performance improvements in deep learning based Natural Language Processing. The wide availability of unlabeled data in the human-generated data deluge, together with self-supervised learning, has accelerated the success of large PLMs in language generation, language understanding, and related tasks. At the same time, latent historical biases towards particular genders, races, and other groups, encoded intentionally or unintentionally in these corpora, harm protected groups and call into question the utility and efficacy of large PLMs in many real-world applications. In this paper, we present an extensive investigation into “Affective Bias” in large PLMs, that is, biased associations of emotions such as anger, fear, and joy with a particular gender, race, or religion, in the context of the downstream task of textual emotion detection. We begin our exploration at the corpus level, searching for imbalanced distributions of affective words within a domain in the large-scale corpora used to pre-train and fine-tune PLMs. We then quantify affective bias in model predictions through an extensive set of class-based and intensity-based evaluations on various bias evaluation corpora. Our results show statistically significant affective bias in PLM-based emotion detection systems, indicating biased associations of certain emotions with a particular gender, race, and religion.
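The class-based evaluation described in the abstract can be illustrated with a minimal sketch: an emotion classifier is applied to sentence templates that differ only in a demographic term, and the predicted emotion distributions for each group are compared. The model checkpoint, templates, and group terms below are illustrative assumptions, not the authors' actual fine-tuned PLMs or bias evaluation corpora.

# Minimal sketch of a class-based affective bias probe (illustrative only).
from collections import Counter
from transformers import pipeline

# Any fine-tuned textual emotion classifier could be substituted here;
# this publicly available checkpoint is used purely as an example.
classifier = pipeline("text-classification",
                      model="j-hartmann/emotion-english-distilroberta-base")

# Hypothetical templates and demographic terms; the paper uses its own
# evaluation corpora covering gender, race, and religion.
TEMPLATES = [
    "The {} felt something deep inside after hearing the news.",
    "Everyone noticed how the {} reacted during the meeting.",
]
GROUPS = {"gender": ["man", "woman"]}

# Count the predicted emotion class for each group term across templates.
counts = {term: Counter() for terms in GROUPS.values() for term in terms}
for template in TEMPLATES:
    for terms in GROUPS.values():
        for term in terms:
            prediction = classifier(template.format(term))[0]["label"]
            counts[term][prediction] += 1

# A consistent, statistically significant gap between the per-group emotion
# distributions (over many templates) would indicate affective bias.
for term, distribution in counts.items():
    print(term, dict(distribution))

In the paper's framing, such class-based comparisons are complemented by intensity-based evaluations and by significance testing over much larger sets of evaluation sentences.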
Text: 1-s2.0-S2949719124000104-main (Version of Record)
More information
Accepted/In Press date: 1 March 2024
e-pub ahead of print date: 5 March 2024
Published date: 10 March 2024
Keywords:
Affective bias in NLP, Fairness in NLP, Pretrained language models, Textual emotion detection, Deep learning
Identifiers
Local EPrints ID: 493921
URI: http://eprints.soton.ac.uk/id/eprint/493921
ISSN: 2949-7191
PURE UUID: 93af72c8-9fab-4dd5-8aa3-ef33a4610474
Catalogue record
Date deposited: 17 Sep 2024 16:58
Last modified: 31 Oct 2024 03:15
Contributors
Author: Anoop Kadan
Author: Deepak P
Author: Sahely Bhadra
Author: Manjary P. Gangan
Author: Lajish V. L.