The University of Southampton
University of Southampton Institutional Repository

Time-sensitive Bayesian information aggregation for crowdsourcing systems


Venanzi, Matteo, Guiver, John, Kohli, Pushmeet and Jennings, Nicholas (2016) Time-sensitive Bayesian information aggregation for crowdsourcing systems. [in special issue: Special Track on Human Computation and AI] Journal of Artificial Intelligence Research, 56, 517-545. (doi:10.1613/jair.5175).

Record type: Article

Abstract

Many aspects of the design of efficient crowdsourcing processes, such as defining workers’ bonuses, fair prices and time limits for tasks, require knowledge of the likely duration of the task at hand. In this work we introduce a new time-sensitive Bayesian aggregation method that simultaneously estimates a task’s duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, uses latent variables to represent the uncertainty about the workers’ completion time, the tasks’ duration and the workers’ accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task has a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., non-spammers) are expected to submit their judgments. In contrast, workers with a lower propensity for valid labelling, such as spammers, bots or lazy labellers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) each worker’s propensity for valid labelling, (iii) the unbiased duration of each task and (iv) the true label of each task. Using two real-world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a task’s duration compared to state-of-the-art methods.
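
The time-window idea in the abstract can be illustrated with a toy sketch. This is not the BCCTime model: here the time window, the workers' propensities and their confusion matrices are fixed inputs with made-up values, whereas the paper treats them as latent and infers them by message passing. The function name `label_posterior` and all parameter names are hypothetical. A judgment submitted outside the window is treated as a uniformly random vote, so it carries no information about the true label.

```python
def label_posterior(judgments, window, confusion, propensity, n_classes):
    """Posterior over one task's true label (toy sketch, assumptions as stated above).

    judgments:  list of (worker_id, reported_label, seconds_taken)
    window:     (shortest, longest) plausible completion time, assumed known here
    confusion:  worker_id -> matrix indexed as confusion[w][true][reported]
    propensity: worker_id -> probability that the worker genuinely attempts tasks
    """
    lo, hi = window
    uniform = 1.0 / n_classes
    scores = []
    for true_label in range(n_classes):
        likelihood = 1.0
        for worker, reported, seconds in judgments:
            # Only judgments inside the time window can count as valid.
            p_valid = propensity[worker] if lo <= seconds <= hi else 0.0
            # Mixture: valid judgments follow the confusion matrix,
            # invalid ones are modelled as uniform random votes.
            likelihood *= (p_valid * confusion[worker][true_label][reported]
                           + (1.0 - p_valid) * uniform)
        scores.append(uniform * likelihood)  # uniform prior over labels
    total = sum(scores)
    return [s / total for s in scores]


# Illustrative data (invented): two equally accurate workers disagree,
# but worker "b" answered in 1 second, well outside the 10-60 second window.
accurate = [[0.9, 0.1], [0.1, 0.9]]
posterior = label_posterior(
    judgments=[("a", 1, 25.0), ("b", 0, 1.0)],
    window=(10.0, 60.0),
    confusion={"a": accurate, "b": accurate},
    propensity={"a": 1.0, "b": 1.0},
    n_classes=2,
)
print(posterior)
```

With these inputs the too-fast judgment from worker "b" is discounted and the posterior favours label 1; if both judgments fell inside the window, the two votes would cancel and the posterior would be even.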

Text
timebcc-jair15.pdf - Accepted Manuscript

More information

Submitted date: 1 April 2016
Accepted/In Press date: 30 April 2016
e-pub ahead of print date: 28 July 2016
Published date: 2016
Organisations: Agents, Interactions & Complexity

Identifiers

Local EPrints ID: 398414
URI: http://eprints.soton.ac.uk/id/eprint/398414
PURE UUID: f8d8f04e-9277-4fce-b54f-1875a0085086

Catalogue record

Date deposited: 23 Jul 2016 09:51
Last modified: 16 Dec 2019 19:48


Contributors

Author: Matteo Venanzi
Author: John Guiver
Author: Pushmeet Kohli
Author: Nicholas Jennings
