Museum ‘dark data’ show variable impacts on deep-time biogeographic and evolutionary history
The age of digitally accessible datasets has transformed palaeontology, enabling previously impossible macroevolutionary insights. However, a substantial reservoir of generally inaccessible ‘dark data’ resides within museum collections, which may alter our understanding of ancient groups and their ecological and evolutionary history. We demonstrate how the addition of data held exclusively in museums impacts our macroevolutionary understanding of an entire taxonomic group, using a dataset of Palaeozoic echinoids containing the majority of museum occurrences for the clade. We find that museum ‘dark data’ show clear differences in composition compared to data available in the published literature and strongly impact biogeographic patterns, increasing the average geographic range size of taxa by 35%. Global model results assessing drivers of diversity are also significantly affected by the addition of museum-only data. Conversely, ‘dark data’ have a more limited impact on the temporal ranges of taxa or estimates of overall diversity and are impacted by similar socio-geographic biases as the published record. These findings show that unpublished museum data are necessary to obtain a complete understanding of macroevolutionary patterns in deep time, illustrating the importance of the collection, curation, digitization and continued care of ‘dark data’ in the age of ‘Big Data’ in palaeobiology.
Paleobiology Database, big data, collections, curation, fossil record bias
Dean, Christopher D.
10359186-2bda-4379-bb01-592483d9305d
Thompson, Jeffrey R.
d2c9b7bb-3e33-4918-97c8-0c36e7af30a4
26 February 2025
Dean, Christopher D. and Thompson, Jeffrey R. (2025) Museum ‘dark data’ show variable impacts on deep-time biogeographic and evolutionary history. Proceedings of the Royal Society B: Biological Sciences, 292 (2041), [20242481]. (doi:10.1098/rspb.2024.2481).
Text
dean-thompson-museum-dark-data-show-variable-impacts-on-deep-time-biogeographic-and-evolutionary-history
- Version of Record
More information
Accepted/In Press date: 5 February 2025
Published date: 26 February 2025
Identifiers
Local EPrints ID: 502347
URI: http://eprints.soton.ac.uk/id/eprint/502347
ISSN: 0962-8452
PURE UUID: 0aab7b53-ba6f-47ef-a94b-d020140aceed
Catalogue record
Date deposited: 24 Jun 2025 16:31
Last modified: 22 Aug 2025 02:36
Contributors
Author:
Christopher D. Dean
Author:
Jeffrey R. Thompson