The University of Southampton
University of Southampton Institutional Repository

A comparison of dataset search behaviour of internal versus search engine referred sessions

Association for Computing Machinery
Ibanez Gonzalez, Luis
Simperl, Elena

Ibanez Gonzalez, Luis and Simperl, Elena (2022) A comparison of dataset search behaviour of internal versus search engine referred sessions. In ACM SIGIR Conference on Human Information Interaction and Retrieval. Association for Computing Machinery. (In Press)

Record type: Conference or Workshop Item (Paper)

Abstract

storytelling to labelling for supervised machine learning. Previous qualitative research suggests that people use two types of search affordances to find the data they need: they either go to a data portal that probably contains the data and search there, or they start on a regular web search engine, which sometimes returns results that are datasets. For the first type of search, prior work has analysed logs from different data portals to understand basic tenets of search behaviour such as query length or topics. In this paper, we advance the state of the art in dataset search behaviour with a comprehensive transaction log analysis study (n = 236,441 sessions) of an international open data portal, in which we compare sessions conducted directly on the data portal (internal searches) against sessions that land on a dataset or SERP (search engine result page) through a referral from a web search engine (external). Using dataset downloads as a proxy for successful searches, we find a statistically significant, though weak, relationship between the use of keyword search and session type, and a moderate one between the use of search facets and session type. We also discover and discuss behavioural patterns and user profiles across session types.

This record has no associated files available for download.

More information

Accepted/In Press date: 9 January 2022

Identifiers

Local EPrints ID: 454678
URI: http://eprints.soton.ac.uk/id/eprint/454678
PURE UUID: c8f28527-0cb0-46d6-814a-9eb87a104510
ORCID for Luis Ibanez Gonzalez: orcid.org/0000-0001-6993-0001
ORCID for Elena Simperl: orcid.org/0000-0003-1722-947X

Catalogue record

Date deposited: 21 Feb 2022 17:33
Last modified: 09 Dec 2023 02:47

Contributors

Author: Luis Ibanez Gonzalez
Author: Elena Simperl
