University of Southampton Institutional Repository

Studying topical relevance with evidence-based crowdsourcing

Inel, Oana
11745992-ffd4-48b4-a76e-62ebe54ac1ae
Szlávik, Zoltán
bbc495e9-449e-447d-a581-b1fd3aeaaccc
Haralabopoulos, Giannis
b6c4f479-abc4-463b-a7df-6ee7e4758f53
Simperl, Elena
40261ae4-c58c-48e4-b78b-5187b10e4f67
Li, Dan
276851c4-71b0-4fa3-b215-f3f1348fd256
Kanoulas, Evangelos
a113688b-584e-4924-b6fe-d39b7c87bd7f
Van Gysel, Christophe
c780a2e2-ba2a-4f82-bcef-0cce10f71935
Aroyo, Lora
6c28464f-23db-427e-aa2c-b43d02922462

Inel, Oana, Szlávik, Zoltán, Haralabopoulos, Giannis, Simperl, Elena, Li, Dan, Kanoulas, Evangelos, Van Gysel, Christophe and Aroyo, Lora (2018) Studying topical relevance with evidence-based crowdsourcing. In CIKM 2018 - Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ACM. pp. 1253-1262. (doi:10.1145/3269206.3271779).

Record type: Conference or Workshop Item (Paper)

Abstract

Information Retrieval systems rely on large test collections to measure their effectiveness in retrieving relevant documents. While the demand is high, the task of creating such test collections is laborious, due both to the large amounts of data that need to be annotated and to the intrinsic subjectivity of the task itself. In this paper we study topical relevance from a user perspective by addressing the problems of subjectivity and ambiguity. We compare our approach and results with the established TREC annotation guidelines and results. The comparison is based on a series of crowdsourcing pilots experimenting with variables such as relevance scale, document granularity, annotation template and the number of workers. Our results show a correlation between relevance assessment accuracy and document granularity: aggregating relevance judgments at the paragraph level yields higher accuracy than assessment performed at the level of the full document. As expected, our results also show that collecting binary relevance judgments results in higher accuracy than the ternary scale used in the TREC annotation guidelines. Finally, the crowdsourced annotation tasks provided a more accurate document relevance ranking than a single assessor's relevance label. This work resulted in a reliable test collection around the TREC Common Core track.
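The aggregation scheme the abstract alludes to — collecting binary judgments per paragraph and deriving a document label from them — can be sketched roughly as below. The function names, the majority-vote rule, and the "relevant if any paragraph is relevant" document rule are illustrative assumptions, not the authors' exact method.

```python
# Hypothetical sketch of paragraph-level relevance aggregation:
# workers give binary judgments (1 = relevant) per paragraph, and the
# document label is derived from the aggregated paragraph votes.
# The majority-vote and any-paragraph rules are assumptions for illustration.

from collections import Counter

def aggregate_paragraph(votes):
    """Majority vote over binary worker judgments for one paragraph."""
    return Counter(votes).most_common(1)[0][0]

def document_relevance(paragraph_votes):
    """Label a document relevant if any paragraph aggregates to relevant."""
    return any(aggregate_paragraph(v) for v in paragraph_votes)

# Example: three paragraphs, five binary votes each.
doc = [
    [0, 0, 1, 0, 0],   # majority non-relevant
    [1, 1, 1, 0, 1],   # majority relevant
    [0, 0, 0, 0, 0],   # unanimous non-relevant
]
print(document_relevance(doc))  # -> True
```

With an odd number of workers per paragraph the majority vote is always decisive, which is one reason crowdsourcing templates often recruit an odd number of judgments per item.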

Text
Studying Topical Relevance with Evidence-based Crowdsourcing - Accepted Manuscript
Restricted to Repository staff only
Text
Studying Topical Relevance with Evidence-based Crowdsourcing - Other
Restricted to Repository staff only

More information

Published date: 17 October 2018
Venue - Dates: 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Italy, 2018-10-21 - 2018-10-25
Keywords: Crowdsourcing, IR evaluation, TREC Common Core track

Identifiers

Local EPrints ID: 426236
URI: http://eprints.soton.ac.uk/id/eprint/426236
PURE UUID: 29343890-f116-4502-a4bf-1705d00549aa
ORCID for Elena Simperl: orcid.org/0000-0003-1722-947X

Catalogue record

Date deposited: 20 Nov 2018 17:30
Last modified: 18 Feb 2021 17:20

Contributors

Author: Oana Inel
Author: Zoltán Szlávik
Author: Giannis Haralabopoulos
Author: Elena Simperl
Author: Dan Li
Author: Evangelos Kanoulas
Author: Christophe Van Gysel
Author: Lora Aroyo


Contact ePrints Soton: eprints@soton.ac.uk

ePrints Soton supports OAI 2.0 with a base URL of http://eprints.soton.ac.uk/cgi/oai2

