University of Southampton Institutional Repository

Network analysis on provenance graphs from a crowdsourcing application


Ebden, Mark, Huynh, Trung Dong, Moreau, Luc, Ramchurn, Sarvapali and Roberts, Stephen (2012) Network analysis on provenance graphs from a crowdsourcing application. Groth, Paul and Frew, James (eds.) 4th International Provenance and Annotation Workshop, Santa Barbara, United States. 20 - 21 Jun 2012. pp. 168-182. (doi:10.1007/978-3-642-34222-6_13).

Record type: Conference or Workshop Item (Paper)

Abstract

Crowdsourcing has become a popular means of completing large numbers of tasks quickly. CollabMap is an online mapping application in which we crowdsource the identification of evacuation routes in residential areas, to be used for planning large-scale evacuations. So far, approximately 38,000 micro-tasks have been completed by over 100 contributors. To assist with data verification, we introduced provenance tracking into the application, and approximately 5,000 provenance graphs have been generated. These graphs have given us various insights into the typical characteristics of provenance graphs in the crowdsourcing context. In particular, we have estimated probability distribution functions over three selected characteristics of these provenance graphs: the node degree, the graph diameter, and the densification exponent. We describe methods to define these three characteristics across specific combinations of node types and edge types, and present our findings in this paper. Applications of our methods include rapid comparison of one provenance graph with another, or of one style of provenance database with another. Our results also indicate that provenance graphs are a suitable target for existing network analysis tools concerned with modelling, prediction, and the inference of missing nodes and edges.
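For readers unfamiliar with the three characteristics named in the abstract, the following is a minimal sketch of how each might be computed on a generic graph. It is an illustration under stated assumptions, not the paper's method: the function names and the toy node labels are hypothetical, edges are treated as undirected, and the per-node-type and per-edge-type definitions the abstract refers to are not reproduced. The densification exponent is taken here in the usual network-analysis sense, as the slope of log(edges) against log(nodes) across a set of graphs.

```python
import math
from collections import deque


def degrees(edges):
    """Return {node: degree}, counting every edge at both endpoints (direction ignored)."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def diameter(edges):
    """Longest shortest path, in hops, within any connected component (direction ignored)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    best = 0
    for source in adj:
        # Breadth-first search gives hop distances from this source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        best = max(best, max(dist.values()))
    return best


def densification_exponent(sizes):
    """Least-squares slope of log(edges) vs log(nodes) over (nodes, edges) pairs."""
    xs = [math.log(n) for n, _ in sizes]
    ys = [math.log(e) for _, e in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical toy graph; the labels are illustrative only.
edges = [
    ("route1", "building1"),
    ("vote1", "route1"),
    ("vote2", "route1"),
    ("user1", "vote1"),
]
print(degrees(edges)["route1"])  # 3
print(diameter(edges))           # 3
```

Applying `degrees` and `diameter` to each graph in a collection, and `densification_exponent` to the collection's (node count, edge count) pairs, would yield the kind of per-graph samples over which distributions could then be estimated.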

Text
ipaw2012EbdenHuynh.pdf - Accepted Manuscript

More information

Published date: June 2012
Venue - Dates: 4th International Provenance and Annotation Workshop, Santa Barbara, United States, 2012-06-20 - 2012-06-21
Keywords: provenance, provenance graphs, network analysis, densification, graph diameters, node degree, collabmap, crowdsourcing, evacuation, maps
Organisations: Web & Internet Science, Agents, Interactions & Complexity

Identifiers

Local EPrints ID: 340068
URI: http://eprints.soton.ac.uk/id/eprint/340068
ISBN: 978-3-642-34221-9
PURE UUID: 3403ed2a-4b59-45b3-b7ef-2af4575cb010
ORCID for Trung Dong Huynh: orcid.org/0000-0003-4937-2473
ORCID for Luc Moreau: orcid.org/0000-0002-3494-120X
ORCID for Sarvapali Ramchurn: orcid.org/0000-0001-9686-4302

Catalogue record

Date deposited: 11 Jun 2012 15:04
Last modified: 15 Mar 2024 03:22

Contributors

Author: Mark Ebden
Author: Trung Dong Huynh
Author: Luc Moreau
Author: Sarvapali Ramchurn
Author: Stephen Roberts
Editor: Paul Groth
Editor: James Frew
