Time-sensitive Bayesian information aggregation for crowdsourcing systems
517-545
Venanzi, Matteo
ba24a77f-31a6-4c05-a647-babf8f660440
Guiver, John
8df5d373-1101-4b08-86bb-26a1cbab1c07
Kohli, Pushmeet
ede0d0ca-fe91-49c2-8c42-ee0fa4298e33
Jennings, Nicholas
ab3d94cc-247c-4545-9d1e-65873d6cdb30
2016
Venanzi, Matteo, Guiver, John, Kohli, Pushmeet and Jennings, Nicholas (2016) Time-sensitive Bayesian information aggregation for crowdsourcing systems. [in special issue: Special Track on Human Computation and AI] Journal of Artificial Intelligence Research, 56, 517-545. (doi:10.1613/jair.5175).
Abstract
Many aspects of the design of efficient crowdsourcing processes, such as defining workers’ bonuses, fair prices and time limits of the tasks, involve knowledge of the likely duration of the task at hand. In this work we introduce a new time-sensitive Bayesian aggregation method that simultaneously estimates a task’s duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, uses latent variables to represent the uncertainty about the workers’ completion time, the tasks’ duration and the workers’ accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task has a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., non-spammers) are expected to submit their judgments. In contrast, workers with a lower propensity to valid labelling, such as spammers, bots or lazy labellers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) the propensity to valid labelling of each worker, (iii) the unbiased duration of each task and (iv) the true label of each task. Using two real-world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a task’s duration compared to state-of-the-art methods.
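The time-window assumption in the abstract can be illustrated with a deliberately simplified sketch. This is not BCCTime: it replaces the latent, inferred time window with a fixed window supplied by the caller, and it replaces Bayesian message passing over confusion matrices with a plain majority vote. The function name `aggregate_with_time_window`, the tuple layout of the judgments and the example data are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def aggregate_with_time_window(judgments, windows):
    """Majority vote that ignores judgments submitted outside a time window.

    judgments: iterable of (worker, task, label, seconds) tuples (assumed layout).
    windows:   dict mapping task -> (min_seconds, max_seconds). The window is
               assumed known here; BCCTime treats it as latent and infers it.
    Tasks with no in-window judgment are absent from the result.
    """
    votes = defaultdict(Counter)
    for worker, task, label, seconds in judgments:
        lo, hi = windows[task]
        # Count a vote only if the completion time falls inside the window.
        if lo <= seconds <= hi:
            votes[task][label] += 1
    # Pick the most frequent in-window label for each task.
    return {task: counts.most_common(1)[0][0] for task, counts in votes.items()}


# Invented example: three suspiciously fast or slow workers vote "B".
judgments = [
    ("w1", "t1", "A", 30.0),
    ("w2", "t1", "A", 42.0),
    ("w3", "t1", "B", 2.0),    # much faster than the window
    ("w4", "t1", "B", 1.5),    # much faster than the window
    ("w5", "t1", "B", 900.0),  # much slower than the window
]
windows = {"t1": (10.0, 120.0)}
result = aggregate_with_time_window(judgments, windows)
```

In this invented example a plain majority vote would pick "B" (three votes to two), whereas the window-filtered vote picks "A". The method described in the abstract goes further by learning the window, each worker's propensity to valid labelling and each worker's confusion matrix jointly, rather than applying a hard filter.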
Text
timebcc-jair15.pdf
- Accepted Manuscript
More information
Submitted date: 1 April 2016
Accepted/In Press date: 30 April 2016
e-pub ahead of print date: 28 July 2016
Published date: 2016
Organisations:
Agents, Interactions & Complexity
Identifiers
Local EPrints ID: 398414
URI: http://eprints.soton.ac.uk/id/eprint/398414
PURE UUID: f8d8f04e-9277-4fce-b54f-1875a0085086
Catalogue record
Date deposited: 23 Jul 2016 09:51
Last modified: 15 Mar 2024 01:33
Contributors
Author: Matteo Venanzi
Author: John Guiver
Author: Pushmeet Kohli
Author: Nicholas Jennings