The University of Southampton
University of Southampton Institutional Repository

Community-Based Bayesian Aggregation Models for Crowdsourcing

Venanzi, Matteo, Guiver, John, Kazai, Gabriella, Kohli, Pushmeet and Shokouhi, Milad (2014) Community-Based Bayesian Aggregation Models for Crowdsourcing. At the 23rd International World Wide Web Conference (WWW 2014), pp. 155-164. (doi:10.1145/2566486.2567989).

Record type: Conference or Workshop Item (Paper)


This paper addresses the problem of extracting accurate labels from crowdsourced datasets, a key challenge in crowdsourcing. Prior work has focused on modeling the reliability of individual workers, for instance by way of confusion matrices, and using these latent traits to estimate the true labels more accurately. However, this strategy becomes ineffective when there are too few labels per worker to estimate their quality reliably. To mitigate this issue, we propose a novel community-based Bayesian label aggregation model, CommunityBCC, which assumes that crowd workers conform to a few different types, where each type represents a group of workers with similar confusion matrices. We assume that each worker belongs to a certain community, where the worker’s confusion matrix is similar to (a perturbation of) the community’s confusion matrix. Our model can then learn a set of key latent features: (i) the confusion matrix of each community, (ii) the community membership of each worker, and (iii) the aggregated label of each item. We compare the performance of our model against established aggregation methods on a number of large-scale, real-world crowdsourcing datasets. Our experimental results show that our CommunityBCC model consistently outperforms state-of-the-art label aggregation methods, gaining, on average, 8% more accuracy with the same number of labels.
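The generative assumptions described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation and not an inference procedure: the function name `simulate_crowd`, the two example communities, and the choice to perturb a community score matrix with Gaussian noise and pass it through a softmax are all hypothetical choices made here for illustration.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax over the last axis, shifted for numerical stability."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def simulate_crowd(n_items=200, n_workers=30, n_classes=3,
                   labels_per_worker=10, noise_scale=0.3, seed=0):
    """Sample crowd labels from a community-structured worker population."""
    rng = np.random.default_rng(seed)
    # Two hypothetical communities (an assumption, not taken from the paper):
    # community 0 is mostly accurate, community 1 guesses near-uniformly.
    community_scores = np.stack([
        4.0 * np.eye(n_classes),
        np.zeros((n_classes, n_classes)),
    ])
    # Each worker belongs to exactly one community.
    membership = rng.integers(0, len(community_scores), size=n_workers)
    # A worker's confusion matrix is a perturbation of the community's:
    # here, Gaussian noise on the scores followed by a row-wise softmax.
    noise = noise_scale * rng.standard_normal((n_workers, n_classes, n_classes))
    worker_confusion = softmax(community_scores[membership] + noise)
    true_labels = rng.integers(0, n_classes, size=n_items)
    records = []  # (worker, item, reported label) triples
    for w in range(n_workers):
        # Few labels per worker: the sparse regime the abstract targets.
        items = rng.choice(n_items, size=labels_per_worker, replace=False)
        for i in items:
            row = worker_confusion[w, true_labels[i]]
            records.append((w, int(i), int(rng.choice(n_classes, p=row))))
    return true_labels, membership, worker_confusion, records

true_labels, membership, worker_confusion, records = simulate_crowd()
```

With only ten labels per worker, each individual confusion matrix is hard to estimate on its own, which is the situation where sharing information across a community is expected to help.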


More information

Published date: May 2014
Venue - Dates: the 23rd International World Wide Web Conference (WWW 2014), 2014-05-01
Organisations: Agents, Interactions & Complexity


Local EPrints ID: 362614
PURE UUID: a379a2ce-ac9e-4b16-9a31-851ee73e4a27

Catalogue record

Date deposited: 27 Feb 2014 15:47
Last modified: 18 Jul 2017 02:50




Author: Matteo Venanzi
Author: John Guiver
Author: Gabriella Kazai
Author: Pushmeet Kohli
Author: Milad Shokouhi
