Understanding latent affective bias in large pre-trained neural language models
Kadan, Anoop
9cc17e26-a329-49fe-b73b-2fce75084966
P, Deepak
1c9443bb-3a4f-49c7-b42f-b978ae5b2a11
Bhadra, Sahely
4d58db35-9644-4837-86ab-940577dc2dea
P. Gangan, Manjary
f1f79b4a-2662-4f0c-ad33-dbb0cbf2512b
V. L., Lajish
e7f39205-51be-4d69-8fc1-4c7b3feddef7
10 March 2024
Kadan, Anoop, P, Deepak, Bhadra, Sahely, P. Gangan, Manjary and V. L., Lajish (2024) Understanding latent affective bias in large pre-trained neural language models. Natural Language Processing Journal, 7, [100062]. (doi:10.1016/j.nlp.2024.100062).
Abstract
The development of transformer-based large Pre-trained Language Models (PLMs) has produced groundbreaking advances and substantial performance improvements in deep learning based Natural Language Processing. The wide availability of unlabeled data in the human-generated data deluge, together with self-supervised learning, has accelerated the success of large PLMs in language generation, language understanding, and related tasks. At the same time, latent historical biases towards particular genders, races, and other groups, encoded intentionally or unintentionally in these corpora, harm protected groups and call into question the utility and efficacy of large PLMs in many real-world applications. In this paper, we present an extensive investigation into “Affective Bias” in large PLMs, that is, biased associations of emotions such as anger, fear, and joy with a particular gender, race, or religion, in the context of the downstream task of textual emotion detection. We begin our exploration at the corpus level, searching for imbalanced distributions of affective words within a domain in the large-scale corpora used to pre-train and fine-tune PLMs. We then quantify affective bias in model predictions through an extensive set of class-based and intensity-based evaluations on various bias evaluation corpora. Our results show statistically significant affective bias in PLM-based emotion detection systems, indicating biased associations of certain emotions with a particular gender, race, and religion.
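The class-based evaluation described in the abstract can be illustrated with a minimal sketch: an emotion classifier is applied to sentence templates that differ only in a demographic term, and the predicted emotion distributions for each group are compared. The model checkpoint, templates, and group terms below are illustrative assumptions, not the authors' actual fine-tuned PLMs or bias evaluation corpora.

# Minimal sketch of a class-based affective bias probe (illustrative only).
from collections import Counter
from transformers import pipeline

# Any fine-tuned textual emotion classifier could be substituted here;
# this publicly available checkpoint is used purely as an example.
classifier = pipeline("text-classification",
                      model="j-hartmann/emotion-english-distilroberta-base")

# Hypothetical templates and demographic terms; the paper uses its own
# evaluation corpora covering gender, race, and religion.
TEMPLATES = [
    "The {} felt something deep inside after hearing the news.",
    "Everyone noticed how the {} reacted during the meeting.",
]
GROUPS = {"gender": ["man", "woman"]}

# Count the predicted emotion class for each group term across templates.
counts = {term: Counter() for terms in GROUPS.values() for term in terms}
for template in TEMPLATES:
    for terms in GROUPS.values():
        for term in terms:
            prediction = classifier(template.format(term))[0]["label"]
            counts[term][prediction] += 1

# A consistent, statistically significant gap between the per-group emotion
# distributions (over many templates) would indicate affective bias.
for term, distribution in counts.items():
    print(term, dict(distribution))

In the paper's framing, such class-based comparisons are complemented by intensity-based evaluations and by significance testing over much larger sets of evaluation sentences.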
Text: 1-s2.0-S2949719124000104-main (Version of Record)
More information
Accepted/In Press date: 1 March 2024
e-pub ahead of print date: 5 March 2024
Published date: 10 March 2024
Keywords:
Affective bias in NLP, Fairness in NLP, Pretrained language models, Textual emotion detection, Deep learning
Identifiers
Local EPrints ID: 493921
URI: http://eprints.soton.ac.uk/id/eprint/493921
ISSN: 2949-7191
PURE UUID: 93af72c8-9fab-4dd5-8aa3-ef33a4610474
Catalogue record
Date deposited: 17 Sep 2024 16:58
Last modified: 31 Oct 2024 03:15
Contributors
Author: Anoop Kadan
Author: Deepak P
Author: Sahely Bhadra
Author: Manjary P. Gangan
Author: Lajish V. L.