The University of Southampton
University of Southampton Institutional Repository

Empirical study on the usage of graph query languages in open source Java projects


Seifer, Philipp, Härtel, Johannes, Leinberger, Martin, Lämmel, Ralf and Staab, Steffen (2019) Empirical study on the usage of graph query languages in open source Java projects. 12th ACM SIGPLAN International Conference on Software Language Engineering (SLE 2019), Athens, Greece. 20-25 Oct 2019. 15 pp. (In Press)

Record type: Conference or Workshop Item (Paper)

Abstract

Graph data models are of interest in various domains, in part because of the intuitiveness and flexibility they offer compared to relational models. Specialized query languages, such as Cypher for property graphs or SPARQL for RDF, facilitate their use. In this paper, we present an empirical study on the usage of graph-based query languages in open-source Java projects on GitHub. We investigate the usage of SPARQL, Cypher, Gremlin and GraphQL in terms of popularity and their development over time. We select repositories based on dependencies related to these technologies and employ various popularity and source-code-based filters and ranking features for a targeted selection of projects. For the concrete languages SPARQL and Cypher, we analyze the activity of repositories over time. For SPARQL, we investigate common application domains, query use, and the existence of ontological data modeling in applications that query for concrete instance data. Our results show that the usage of graph query languages in open-source projects has increased over recent years, with SPARQL and Cypher being by far the most popular. SPARQL projects are more active in terms of query-related artifact changes and unique developers involved, but Cypher is catching up. Relatively few applications use SPARQL to query for concrete instance data; a majority of those applications employ multiple different ontologies, including project- and domain-specific ones. Common application domains are management systems and data visualization tools.

More information

Accepted/In Press date: 12 September 2019
Venue - Dates: 12th ACM SIGPLAN International Conference on Software Language Engineering (SLE 2019), Athens, Greece, 2019-10-20 - 2019-10-25

Identifiers

Local EPrints ID: 434439
URI: http://eprints.soton.ac.uk/id/eprint/434439
PURE UUID: 4663e418-ae0e-4767-8286-8969f243016f
ORCID for Steffen Staab: orcid.org/0000-0002-0780-4154

Catalogue record

Date deposited: 23 Sep 2019 16:31
Last modified: 16 Mar 2024 04:22

Contributors

Author: Philipp Seifer
Author: Johannes Härtel
Author: Martin Leinberger
Author: Ralf Lämmel
Author: Steffen Staab
