Large language models to make museum archive collections more accessible

Keywords are essential to the searchability, and therefore the discoverability, of museum and archival collections in the modern world. Without them, the collection management systems (CMS) and online collections that these cultural organisations rely on to record, organise, and make their collections accessible do not operate efficiently. However, generating these keywords manually is time-consuming for these already resource-strapped organisations. Artificial intelligence (AI), particularly generative AI and Large Language Models (LLMs), could hold the key to generating, and even automating, this essential data, and as such could be considered a co-creative add-on. This study contributes to the literature by introducing the use of Meta’s open-source LLM, Llama, to generate keywords from descriptions of museum and archival collection items written by curators and archivists. Our findings suggest that these technologies add significant value compared to current manual methods for keyword generation. In particular, we find that carefully crafted prompts can produce successful keyword augmentations, making museum and archival collections much more accessible to wider and more diverse audiences. However, the results also show that generative AI has biases (e.g., hallucinations, overgeneralisations, outdated language), though these occurred less frequently than general perception might suggest. Hence, we also discuss mitigation strategies to address these, and how cultural institutions can recognise the risks and errors while getting the most from the systems. Finally, we discuss options for achieving structured results that allow easier ingestion of data back into the CMS. Ultimately, LLMs hold significant potential to enhance accessibility to museum and archival collections, yet they are not without imperfections, as we discuss extensively.
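As a rough illustration of the workflow the abstract outlines (a crafted prompt, an LLM reply, and a structured result that a CMS could ingest), the following minimal Python sketch may help. It is not the authors' implementation: the prompt wording, the JSON shape, and the names `build_prompt`, `parse_keywords` and `fake_llm` are all assumptions made for illustration, and `fake_llm` is a stand-in that returns a canned reply rather than calling Llama.

```python
import json

# Hypothetical prompt wording; the study's actual prompts are not given in this record.
PROMPT_TEMPLATE = (
    "You are assisting a museum cataloguer.\n"
    "Read the item description and return up to {max_keywords} search keywords.\n"
    'Respond with JSON only, in the form {{"keywords": ["...", "..."]}}.\n\n'
    "Description: {description}"
)


def build_prompt(description: str, max_keywords: int = 10) -> str:
    """Fill the (assumed) template with a curator-written description."""
    return PROMPT_TEMPLATE.format(
        description=description.strip(), max_keywords=max_keywords
    )


def parse_keywords(raw_response: str, max_keywords: int = 10) -> list[str]:
    """Extract a de-duplicated keyword list from a model reply.

    Returns an empty list when no valid JSON object is found, so that a
    malformed reply is never written back to the catalogue.
    """
    start = raw_response.find("{")
    end = raw_response.rfind("}")
    if start == -1 or end <= start:
        return []
    try:
        payload = json.loads(raw_response[start : end + 1])
    except json.JSONDecodeError:
        return []
    keywords = payload.get("keywords", []) if isinstance(payload, dict) else []
    if not isinstance(keywords, list):
        return []

    seen: set[str] = set()
    cleaned: list[str] = []
    for keyword in keywords:
        if not isinstance(keyword, str):
            continue
        text = " ".join(keyword.split())  # collapse stray whitespace
        key = text.lower()  # compare case-insensitively
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned[:max_keywords]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned, slightly messy reply."""
    return 'Sure! {"keywords": ["Teapot", "ceramics", "teapot ", "18th century"]}'


if __name__ == "__main__":
    prompt = build_prompt("Creamware teapot with floral decoration, c. 1770.")
    print(parse_keywords(fake_llm(prompt)))
```

Returning an empty list on a malformed reply, rather than guessing, leaves room for a human reviewer to check the output before anything reaches the CMS, in line with the risks the abstract mentions.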

Generative AI, Keyword augmentation, Keyword generation, Large Language Models, Museum and archive collections

10.1007/s00146-025-02227-8

0951-5666

4485-4497

Reusens, Manon

3dc14c4b-793a-41d6-b7bd-64303cda1c42

Adams, A.

2dd7d783-8b5b-42c0-8f85-b6f6447d519f

Baesens, Bart

f7c6496b-aa7f-4026-8616-ca61d9e216f0

27 February 2025

Reusens, Manon, Adams, A. and Baesens, Bart (2025) Large language models to make museum archive collections more accessible. AI & Society: Journal of Knowledge, Culture and Communication, 40 (6), 4485-4497. (doi:10.1007/s00146-025-02227-8).

Record type: Article

Text

BB manuscript - Accepted Manuscript

Available under License Other.

More information

Accepted/In Press date: 23 January 2025

Published date: 27 February 2025

Keywords: Generative AI, Keyword augmentation, Keyword generation, Large Language Models, Museum and archive collections

Identifiers

Local EPrints ID: 505051

URI: http://eprints.soton.ac.uk/id/eprint/505051

DOI: 10.1007/s00146-025-02227-8

ISSN: 0951-5666

PURE UUID: acf16f9e-6d40-4da2-a30a-0e87eb1c8ce5

ORCID for Bart Baesens: orcid.org/0000-0002-5831-5668

Catalogue record

Date deposited: 25 Sep 2025 16:35

Last modified: 23 Jan 2026 05:01

Contributors

Author: Manon Reusens

Author: A. Adams

Author: Bart Baesens
