The University of Southampton
University of Southampton Institutional Repository

Toward a common standard for data and specimen provenance in life sciences


Open and practical exchange, dissemination, and reuse of specimens and data have become a fundamental requirement for life sciences research. The quality of the data obtained, and thus of the findings and knowledge derived from them, is significantly influenced by the quality of the samples, the experimental methods, and the data analysis. Therefore, comprehensive and precise documentation of the pre-analytical conditions, the analytical procedures, and the data processing is essential for assessing the validity of research results. With the increasing importance of the exchange, reuse, and sharing of data and samples, procedures are required that enable cross-organizational documentation, traceability, and non-repudiation. At present, information on the provenance of samples and data is mostly sparse, incomplete, or incoherent. Because there is no uniform framework, this information is usually provided only within an organization and not in an interoperable form. At the same time, the collection and sharing of biological and environmental specimens increasingly require the definition and documentation of benefit sharing and compliance with regulatory requirements, rather than consideration of purely scientific needs. In this publication, we present an ongoing standardization effort to provide trustworthy, machine-actionable documentation of the lineage of data and specimens. We invite experts from the biotechnology and biomedical fields to contribute further to the standard.


Wittner, Rudolf, Holub, Petr, Mascia, Cecilia, Frexia, Francesca, Müller, Heimo, Plass, Markus, Allocca, Clare, Betsou, Fay, Burdett, Tony, Cancio, Ibon, Chapman, Adriane, Chapman, Martin, Courtot, Mélanie, Curcin, Vasa, Eder, Johan, Elliot, Mark, Exter, Katrina, Goble, Carole, Golebiewski, Martin, Kisler, Bron, Kremer, Andreas, Leo, Simone, Lin-Gibson, Sheng, Marsano, Anna, Mattavelli, Marco, Moore, Josh, Nakae, Hiroki, Perseil, Isabelle, Salman, Ayat, Sluka, James, Soiland-Reyes, Stian, Strambio-De-Castillia, Caterina, Sussman, Michael, Swedlow, Jason R., Zatloukal, Kurt and Geiger, Jörg (2023) Toward a common standard for data and specimen provenance in life sciences. Learning Health Systems, [e10365]. (doi:10.1002/lrh2.10365).

Record type: Letter

Text
Learning Health Systems Wittner 2023 - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 24 March 2023
e-pub ahead of print date: 18 April 2023
Published date: 18 April 2023
Additional Information:

Funding Information: This work has been co‐funded by EOSC‐Life supported by EU Horizon 2020, grant agreement no. 824087; EJP‐RD supported by EU Horizon 2020, grant agreement no. 825575; BioExcel‐2 supported by EU Horizon 2020, grant agreement no. 823830; and the PAM and XDATA Projects, funded by the Sardinian Regional Authority. VC and MCh are supported by the National Institute for Health Research (NIHR) Biomedical Research Centre based at Guy's and St Thomas' National Health Service (NHS) Foundation Trust and King's College London (RJ112/N027) and by NIHR Applied Research Collaboration South London (ARC SL). TB and MCo acknowledge funding from EMBL‐EBI Core Funds and the FAIRplus project (H2020 No 802750). MCo was supported by Wellcome Trust GA4GH award number 201535/Z/16/Z and the CINECA project (H2020 No 825775). AC was supported by EPSRC (EP/S028366/1). JS was supported by the US National Institutes of Health (U24 EB028887, R01 GM122424, and OT2OD026671), the US National Science Foundation (NSF 2054061), and the US EPA (RD840027). ME was supported by the Alan Turing Institute (ProvAnon). KZ was supported by the Bundesministerium für Bildung, Wissenschaft und Forschung (Federal Ministry of Education, Science and Research of Austria) (BMBWF‐10.470/0010‐V/3c/2018). CS was supported by NIH grant #U01CA200059 and by grant #2019‐198155 (5022) awarded by the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation, as part of their Imaging Scientist Program. The opinions in this paper are those of the authors and do not necessarily reflect the opinions of the funders.

Representation of communities: The team of co‐authors represents a wide range of life‐sciences communities. PH, RW, CM, FF, HM, MP, and JG come from the human biobanking and biomolecular resources communities and the BBMRI‐ERIC Research Infrastructure, and are directly involved as experts in the ISO standardization process. KZ and JE come from cancer research, biobanking, and medical informatics and are long‐term contributors to data quality standardization efforts. TB, MCo is a director of Ontario Institute for Cancer Research. IC and KE come from marine biology and the EMBRC Research Infrastructure. CG and SSR have worked with bioinformatics, CWL, and RO‐Crate. JRS and JM come from the bio‐imaging communities and the EUBioImaging Research Infrastructure. VC and MCh come from health informatics. HN participates in the provenance standardization process as an expert from Japan, MS and JS as experts from the United States, and AK as an expert from Luxembourg. ME contributes to privacy protection and provenance aspects. FB is a biobanking expert and director of the microbiological resource center CRBIP, Institut Pasteur. AS is a biobanking expert and ESBB councilor. SL‐G and CA are from NIST and serve as convenor and secretary of ISO/TC 276/WG 3 “Analytical Methods.” AM belongs to the tissue engineering and biomedical research community. MM is a standards expert in the digital media, genomic sequencing, and annotation data fields, and convenor of ISO/IEC SC29/WG 8 “MPEG Genomic Coding.” AC contributes to the capture and handling of provenance within large organizations. CS is a cell biologist actively engaged in the development of quality control and reproducibility specifications and tools for light microscopy as a member of the Data Coordination and Integration Center of the NIH‐funded 4D Nucleome initiative, Chair of the Quality Control and Data Management WG of BioImaging North America, and Co‐Chair of the WG on Metadata (WG7) of the QUality Assessment and REProducibility for Instruments and Images in Light‐Microscopy (QUAREP‐LiMI) initiative. SLe is a member of the RO‐Crate community and co‐chair of a working group for the development of an RO‐Crate profile for capturing the provenance of scientific workflow executions.

Publisher Copyright: © 2023 The Authors. Learning Health Systems published by Wiley Periodicals LLC on behalf of University of Michigan.
Keywords: International Organization for Standardization, biotechnology, provenance information, standardization

Identifiers

Local EPrints ID: 477634
URI: http://eprints.soton.ac.uk/id/eprint/477634
ISSN: 2379-6146
PURE UUID: 8a06e7e4-01c8-4d29-9acf-7f0e4a42ba99
ORCID for Adriane Chapman: orcid.org/0000-0002-3814-2587

Catalogue record

Date deposited: 12 Jun 2023 16:30
Last modified: 17 Mar 2024 03:46


Contributors

Author: Rudolf Wittner
Author: Petr Holub
Author: Cecilia Mascia
Author: Francesca Frexia
Author: Heimo Müller
Author: Markus Plass
Author: Clare Allocca
Author: Fay Betsou
Author: Tony Burdett
Author: Ibon Cancio
Author: Adriane Chapman
Author: Martin Chapman
Author: Mélanie Courtot
Author: Vasa Curcin
Author: Johan Eder
Author: Mark Elliot
Author: Katrina Exter
Author: Carole Goble
Author: Martin Golebiewski
Author: Bron Kisler
Author: Andreas Kremer
Author: Simone Leo
Author: Sheng Lin-Gibson
Author: Anna Marsano
Author: Marco Mattavelli
Author: Josh Moore
Author: Hiroki Nakae
Author: Isabelle Perseil
Author: Ayat Salman
Author: James Sluka
Author: Stian Soiland-Reyes
Author: Caterina Strambio-De-Castillia
Author: Michael Sussman
Author: Jason R. Swedlow
Author: Kurt Zatloukal
Author: Jörg Geiger
