The University of Southampton
University of Southampton Institutional Repository

Towards the domain agnostic generation of natural language explanations from provenance graphs for casual users


Richardson, Darren P. and Moreau, Luc (2016) Towards the domain agnostic generation of natural language explanations from provenance graphs for casual users. In Mattoso, M and Glavic, B (eds.) Provenance and Annotation of Data and Processes. IPAW 2016. vol. 9672, Springer. pp. 95-106. (doi:10.1007/978-3-319-40593-3_8).

Record type: Conference or Workshop Item (Paper)

Abstract

As more systems become PROV-enabled, there will be a corresponding increase in the need to communicate provenance data directly to users. Whilst there are a number of existing methods for doing this (formally, diagrammatically, and textually), there are currently no application-generic techniques for generating linguistic explanations of provenance. The principal reason for this is that a certain amount of linguistic information is required to transform a provenance graph, such as one expressed in PROV, into a textual explanation, and if this information is not available as an annotation, this transformation is presently not possible. In this paper, we describe how we have adapted the common ‘consensus’ architecture from the field of natural language generation to achieve this graph transformation, resulting in the novel PROVglish architecture. We then present an approach to garnering the necessary linguistic information from a PROV dataset, which involves exploiting the linguistic information informally encoded in the URIs denoting provenance resources. We finish by detailing an evaluation undertaken to assess the effectiveness of this approach to lexicalisation, demonstrating a significant improvement in terms of fluency, comprehensibility, and grammatical correctness.
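
To illustrate the idea of recovering linguistic information from URIs, the following is a minimal sketch only. It is not the PROVglish implementation, and the paper's actual lexicalisation procedure is not described in this record. The function names `local_name` and `uri_to_words`, the choice of separators, and the example URI are all assumptions made for illustration: the sketch takes a resource URI, isolates its local name, and splits it into candidate word tokens.

```python
import re
from typing import List
from urllib.parse import unquote, urlparse


def local_name(uri: str) -> str:
    """Return the fragment if present, otherwise the last non-empty path segment.

    Hypothetical helper: real datasets may need namespace-aware handling.
    """
    parsed = urlparse(uri)
    if parsed.fragment:
        return unquote(parsed.fragment)
    segments = [s for s in parsed.path.split("/") if s]
    return unquote(segments[-1]) if segments else ""


def uri_to_words(uri: str) -> List[str]:
    """Split a URI's local name into lower-case word tokens (illustrative only)."""
    name = local_name(uri)
    # Treat underscores, hyphens, and dots as word separators.
    name = re.sub(r"[_\-.]+", " ", name)
    # Insert a space at lower-to-upper camelCase boundaries.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return [word.lower() for word in name.split()]


# The URI below is invented for this example.
print(uri_to_words("http://example.org/ns#composedArticle"))  # ['composed', 'article']
```

Tokens obtained this way could then be fed to a later stage that assigns parts of speech and builds sentences, but that step is outside the scope of this sketch.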

Text
dr_ipaw16.pdf - Accepted Manuscript
Download (486kB)
Text
ipaw.pdf - Other
Available under License Creative Commons Attribution Share Alike.
Download (5MB)

More information

Submitted date: 7 March 2016
Accepted/In Press date: 10 April 2016
e-pub ahead of print date: 4 June 2016
Additional Information: Funded by International Technology Alliance in Network and Information Sciences: International Technology Alliance in Network and Information Sciences Agreement (W911NF-06-3-0001)
Venue - Dates: 6th International Provenance & Annotation Workshop (IPAW'16), McLean, VA, United States, 2016-06-06 - 2016-06-09
Organisations: Web & Internet Science

Identifiers

Local EPrints ID: 391910
URI: http://eprints.soton.ac.uk/id/eprint/391910
PURE UUID: 42b83dd1-56bd-44f0-8e6c-4e560072c037
ORCID for Luc Moreau: orcid.org/0000-0002-3494-120X

Catalogue record

Date deposited: 19 Apr 2016 15:39
Last modified: 15 Mar 2024 18:28

Contributors

Author: Darren P. Richardson
Author: Luc Moreau
Editor: M Mattoso
Editor: B Glavic
