Parallel and streaming truth discovery in large-scale quantitative crowdsourcing
To enable reliable crowdsourcing applications, it is of great importance to develop algorithms that can automatically discover the truths from possibly noisy and conflicting claims provided by various information sources. In order to handle crowdsourcing applications involving big or streaming data, a desirable truth discovery algorithm should not only be effective, but also be scalable. However, with respect to quantitative crowdsourcing applications such as object counting and percentage annotation, existing truth discovery algorithms are not simultaneously effective and scalable. They either address truth discovery in categorical crowdsourcing or perform batch processing that does not scale. In this paper, we propose new parallel and streaming truth discovery algorithms for quantitative crowdsourcing applications. Through extensive experiments on real-world and synthetic datasets, we demonstrate that 1) both of them are quite effective, 2) the parallel algorithm can efficiently perform truth discovery on large datasets, and 3) the streaming algorithm processes data incrementally, and it can efficiently perform truth discovery both on large datasets and in data streams.
2984-2997
Ouyang, Robin Wentao
a0886331-0eed-46ee-9b72-843ef2bb192f
Kaplan, Lance M.
3812423d-c58a-4896-bd57-7b373505e457
Toniolo, Alice
e54ad578-9232-471a-a5d7-cd3a7bc70872
Srivastava, Mani
77c0ad90-6073-4dd8-908f-05087d856cd6
Norman, Timothy
663e522f-807c-4569-9201-dc141c8eb50d
October 2016
Ouyang, Robin Wentao, Kaplan, Lance M., Toniolo, Alice, Srivastava, Mani and Norman, Timothy (2016) Parallel and streaming truth discovery in large-scale quantitative crowdsourcing. IEEE Transactions on Parallel and Distributed Systems, 27 (10), 2984-2997. (doi:10.1109/TPDS.2016.2515092).
Text
TPDS2515092.pdf
- Version of Record
Restricted to Repository staff only
More information
Accepted/In Press date: 25 December 2015
e-pub ahead of print date: 6 January 2016
Published date: October 2016
Organisations:
Agents, Interactions & Complexity
Identifiers
Local EPrints ID: 403234
URI: http://eprints.soton.ac.uk/id/eprint/403234
ISSN: 1045-9219
PURE UUID: 1cb495f3-25b8-4a90-af1f-11fe5d22fc4b
Catalogue record
Date deposited: 28 Nov 2016 16:43
Last modified: 15 Mar 2024 03:53
Contributors
Author:
Robin Wentao Ouyang
Author:
Lance M. Kaplan
Author:
Alice Toniolo
Author:
Mani Srivastava