LLMs for the post-hoc creation of provenance
562-566
Almuntashiri, Abdullah Hamed
aa118cfa-3b60-4717-9855-2816bbbb28d0
Ibáñez, Luis-Daniel
65a2e20b-74a9-427d-8c4c-2330285153ed
Chapman, Adriane
721b7321-8904-4be2-9b01-876c430743f1
12 July 2024
Almuntashiri, Abdullah Hamed, Ibáñez, Luis-Daniel and Chapman, Adriane (2024) LLMs for the post-hoc creation of provenance. In 2024 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW): 16th International Workshop on Theory and Practice of Provenance. (doi:10.1109/EuroSPW61312.2024.00068).
Record type: Conference or Workshop Item (Paper)
Abstract
Provenance information is an essential component that facilitates the reproduction of scientific experiments, the assessment of data quality, and other related tasks. However, provenance capture at observation is sometimes difficult, and post-hoc methods are needed. In this paper, we explore the ability of large language models (LLMs) to access and extract provenance information from scientific papers through a set of specially designed prompts. We then identify and suggest the most effective prompt for provenance extraction from papers. Our findings confirm the capability of ChatGPT-4 in accessing and extracting provenance information from biomedical research papers.
More information
Published date: 12 July 2024
Keywords:
LLMs, provenance
Identifiers
Local EPrints ID: 492146
URI: http://eprints.soton.ac.uk/id/eprint/492146
PURE UUID: c661437c-be30-4017-b504-f070ac474d01
Catalogue record
Date deposited: 18 Jul 2024 16:31
Last modified: 13 Sep 2024 02:02
Contributors
Author: Abdullah Hamed Almuntashiri
Author: Luis-Daniel Ibáñez