Identifying social group constructions in political online discussions: ambiguity, subjectivity, and label variation
Political discourse often revolves around which social groups are deemed deserving of resources, so political discussions on social media frequently reference such groups. Here, “social group” is a broad concept referring to collections of individuals who are categorized based on diverse criteria or attributions. Social group mentions in text are therefore prime examples of complex and ambiguous social concepts, which have only recently garnered attention in natural language processing research, particularly within the emerging field of perspectivism. Following this line of research, we propose that disagreements between annotators provide important insights into what constitutes a social group. By analyzing diverging annotation judgments, we develop a taxonomy of label variation that distinguishes between ambiguities in identifying social group mentions and genuine annotation errors. We further investigate whether specific types of social group mentions are related to distinct types of label variation. Our findings reveal that linguistic and interpretative ambiguities account for a majority of label variation for subjective and ambiguous concepts, underscoring the complex and multifaceted nature of social group mentions. This observation aligns with the perspectivist view, which posits that annotators’ diverse perspectives can enrich labeled data by capturing multiple valid interpretations rather than forcing a single “correct” ground truth. To explore this, we present Reddit-Social Group Mentions (Reddit-SGM), a novel, non-aggregated dataset for studying social group mentions in texts. By embracing the perspectivist approach, our dataset lays the groundwork for the automatic identification of social group mentions, enabling a nuanced representation of this inherently ambiguous concept.
14-33
Jalali, Farane
ce51fa73-5363-47ad-8b97-cf9b73a3c2d4
Hanke, Sara
74e77d43-1b83-449f-a83d-b1f282d791aa
Heiberger, Raphael
8fd8a372-a817-46c2-9de3-49c6f1e8c50e
Staab, Steffen
bf48d51b-bd11-4d58-8e1c-4e6e03b30c49
26 March 2026
Jalali, Farane, Hanke, Sara, Heiberger, Raphael and Staab, Steffen (2026) Identifying social group constructions in political online discussions: ambiguity, subjectivity, and label variation. Journal of Social Computing, 17 (1). (doi:10.23919/JSC.2026.0006).
Text
Identifying Social Group Mentions in Online Political Discussions_ Ambiguity, Subjectivity, and Label Variation
- Version of Record
More information
Accepted/In Press date: 21 January 2026
Published date: 26 March 2026
Identifiers
Local EPrints ID: 511651
URI: http://eprints.soton.ac.uk/id/eprint/511651
ISSN: 2688-5255
PURE UUID: 481a18bc-6444-4f01-b42f-a9e6474f95b6
Catalogue record
Date deposited: 26 May 2026 16:56
Last modified: 27 May 2026 01:48
Contributors
Author: Farane Jalali
Author: Sara Hanke
Author: Raphael Heiberger
Author: Steffen Staab