The University of Southampton
University of Southampton Institutional Repository

Museum ‘dark data’ show variable impacts on deep-time biogeographic and evolutionary history

Dean, Christopher D. and Thompson, Jeffrey R. (2025) Museum ‘dark data’ show variable impacts on deep-time biogeographic and evolutionary history. Proceedings of the Royal Society B: Biological Sciences, 292 (2041), [20242481]. (doi:10.1098/rspb.2024.2481).

Record type: Article

Abstract

The age of digitally accessible datasets has transformed palaeontology, enabling previously impossible macroevolutionary insights. However, a substantial reservoir of generally inaccessible ‘dark data’ resides within museum collections, which may alter our understanding of ancient groups and their ecological and evolutionary history. We demonstrate how the addition of data held exclusively in museums impacts our macroevolutionary understanding of an entire taxonomic group, using a dataset of Palaeozoic echinoids containing the majority of museum occurrences for the clade. We find that museum ‘dark data’ show clear differences in composition compared to data available in the published literature and strongly impact biogeographic patterns, increasing the average geographic range size of taxa by 35%. Global model results assessing drivers of diversity are also significantly affected by the addition of museum-only data. Conversely, ‘dark data’ have a more limited impact on the temporal ranges of taxa or estimates of overall diversity, and are affected by socio-geographic biases similar to those affecting the published record. These findings show that unpublished museum data are necessary to obtain a complete understanding of macroevolutionary patterns in deep time, illustrating the importance of the collection, curation, digitization and continued care of ‘dark data’ in the age of ‘Big Data’ in palaeobiology.

Text
dean-thompson-museum-dark-data-show-variable-impacts-on-deep-time-biogeographic-and-evolutionary-history - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 5 February 2025
Published date: 26 February 2025
Keywords: Paleobiology Database, big data, collections, curation, fossil record bias

Identifiers

Local EPrints ID: 502347
URI: http://eprints.soton.ac.uk/id/eprint/502347
ISSN: 0962-8452
PURE UUID: 0aab7b53-ba6f-47ef-a94b-d020140aceed
ORCID for Jeffrey R. Thompson: orcid.org/0000-0003-3485-172X

Catalogue record

Date deposited: 24 Jun 2025 16:31
Last modified: 22 Aug 2025 02:36

Contributors

Author: Christopher D. Dean
Author: Jeffrey R. Thompson
