Network analysis on provenance graphs from a crowdsourcing application

Ebden, Mark, Huynh, Trung Dong, Moreau, Luc, Ramchurn, Sarvapali and Roberts, Stephen (2012) Network analysis on provenance graphs from a crowdsourcing application. In Groth, Paul and Frew, James (eds.) 4th International Provenance and Annotation Workshop, United States. 20 - 21 Jun 2012. 15 pp, pp. 168-182. (doi:10.1007/978-3-642-34222-6_13).


PDF: ipaw2012EbdenHuynh.pdf - Accepted Manuscript
Available under License University of Southampton Accepted Manuscript Licence.


Crowdsourcing has become a popular means of quickly completing tasks in large quantities. CollabMap is an online mapping application in which we crowdsource the identification of evacuation routes in residential areas, to be used for planning large-scale evacuations. So far, approximately 38,000 micro-tasks have been completed by over 100 contributors. To assist with data verification, we introduced provenance tracking into the application, and approximately 5,000 provenance graphs have been generated. They have provided us with various insights into the typical characteristics of provenance graphs in the crowdsourcing context. In particular, we have estimated probability distribution functions over three selected characteristics of these provenance graphs: the node degree, the graph diameter, and the densification exponent. We describe methods to define these three characteristics across specific combinations of node types and edge types, and present our findings in this paper. Applications of our methods include rapid comparison of one provenance graph with another, or of one style of provenance database with another. Our results also indicate that provenance graphs are a suitable target for existing network analysis tools concerned with modelling, prediction, and the inference of missing nodes and edges.
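
The three characteristics named in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it treats edges as undirected, does not separate node types or edge types as the paper does, assumes a connected graph, and takes the densification exponent to be the slope of a least-squares fit of log(edges) against log(nodes). The function names are hypothetical.

```python
import math
from collections import Counter, deque


def degree_histogram(edges):
    """Map each degree value to the number of nodes with that degree (direction ignored)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(Counter(degree.values()))


def diameter(edges):
    """Longest shortest-path length, ignoring direction; assumes the graph is connected."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    longest = 0
    for source in adjacency:
        # Breadth-first search gives hop distances from this source.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        longest = max(longest, max(distance.values()))
    return longest


def densification_exponent(sizes):
    """Slope of log(edge count) versus log(node count) over (nodes, edges) pairs.

    Assumption: the exponent is estimated by ordinary least squares on a
    log-log scale, with at least two distinct node counts supplied.
    """
    xs = [math.log(nodes) for nodes, _ in sizes]
    ys = [math.log(edge_count) for _, edge_count in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator
```

Each function takes plain Python data: a list of (source, target) pairs for a single graph, or a list of (node count, edge count) pairs collected across many graphs for the exponent. Restricting the edge list to one combination of node type and edge type before calling these functions would be one way to approximate the per-type analysis the abstract describes.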

Item Type: Conference or Workshop Item (Paper)
Digital Object Identifier (DOI): doi:10.1007/978-3-642-34222-6_13
Venue - Dates: 4th International Provenance and Annotation Workshop, United States, 2012-06-20 - 2012-06-21
Keywords: provenance, provenance graphs, network analysis, densification, graph diameters, node degree, collabmap, crowdsourcing, evacuation, maps
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Organisations: Web & Internet Science, Agents, Interactions & Complexity
ePrint ID: 340068
Date: June 2012 (Published)
Date Deposited: 11 Jun 2012 15:04
Last Modified: 22 Apr 2017 02:31
