The University of Southampton
University of Southampton Institutional Repository

Rethinking information retrieval in a re-decentralised web: exploring the feasibility and quality of search across personal online datastores


Bahrani, Mohammad, Ragab, Mohamed, Oliver, Helen, Tiropanis, Thanassis, Chapman, Adriane, Poulovassilis, Alexandra and Roussos, George (2025) Rethinking information retrieval in a re-decentralised web: exploring the feasibility and quality of search across personal online datastores. ACM Transactions on the Web. (doi:10.1145/3777445).

Record type: Article

Abstract

Traditional information retrieval (IR) models, such as keyword-based and vector-based techniques, have long been used in centralized systems. However, the Web’s re-decentralization, with its focus on data ownership and privacy, calls for a re-evaluation of these methods in decentralized settings. While standards for decentralized search enhance privacy to some extent, they also introduce computational overhead, black-box decision-making, and infrastructure complexity. Despite these challenges, traditional IR techniques remain largely unexplored in such environments. This paper applies traditional IR models to the decentralized Web by adapting them for Personal Online Data Stores (PODs), where search parties have varying access rights. We explore their role in source selection, document ranking, and result merging, extending them to meet decentralized search demands. Using Solid PODs and a synthetic medical dataset, we evaluate these models in a privacy-sensitive environment. Our findings demonstrate that extended IR methods provide an effective balance of performance, interpretability, and efficiency. These approaches hold strong potential as privacy-preserving alternatives for decentralized search on a re-decentralized Web. Notably, our top-performing model achieved competitive results in top-item retrieval compared to centralized search systems, maintaining high relevance scores under both limited and full data access conditions.

Text
TWEB_2025_Final - Accepted Manuscript
Available under License Creative Commons Attribution.
Text
3777445 - Accepted Manuscript
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 31 October 2025
e-pub ahead of print date: 20 November 2025

Identifiers

Local EPrints ID: 507324
URI: http://eprints.soton.ac.uk/id/eprint/507324
ISSN: 1559-114X
PURE UUID: db388871-0d1b-474d-bb36-277496df596e
ORCID for Mohammad Bahrani: orcid.org/0009-0004-1821-4856
ORCID for Thanassis Tiropanis: orcid.org/0000-0002-6195-2852
ORCID for Adriane Chapman: orcid.org/0000-0002-3814-2587

Catalogue record

Date deposited: 04 Dec 2025 17:49
Last modified: 05 Dec 2025 03:07

Contributors

Author: Mohammad Bahrani
Author: Mohamed Ragab
Author: Helen Oliver
Author: Thanassis Tiropanis
Author: Adriane Chapman
Author: Alexandra Poulovassilis
Author: George Roussos
