The University of Southampton
University of Southampton Institutional Repository

Everything you always wanted to know about a dataset: studies in data summarisation


Koesten, Laura, Simperl, Elena, Kacprzak, Emilia Magdalena, Blount, Thomas and Tennison, Jeni (2020) Everything you always wanted to know about a dataset: studies in data summarisation. International Journal of Human-Computer Studies, 135. (doi:10.1016/j.ijhcs.2019.10.004).

Record type: Article

Abstract

Summarising data as text helps people make sense of it. It also improves data discovery, as search algorithms can match this text against keyword queries. In this paper, we explore the characteristics of text summaries of data in order to understand what meaningful summaries look like. We present two complementary studies: a data-search diary study with 69 students, which offers insight into the information needs of people searching for data; and a summarisation study, with a lab component and a crowdsourcing component and 80 data-literate participants overall, which produced summaries for 25 datasets. In each study we carried out a qualitative analysis to identify key themes and commonly mentioned dataset attributes that people consider when searching for and making sense of data. The results helped us design a template for creating more meaningful textual representations of data, alongside guidelines for improving the data-search experience overall.

Text
Everything you always wanted to know about a dataset_studies in data summarisation - Accepted Manuscript

More information

Accepted/In Press date: 14 October 2019
e-pub ahead of print date: 14 October 2019
Published date: March 2020

Identifiers

Local EPrints ID: 436246
URI: http://eprints.soton.ac.uk/id/eprint/436246
ISSN: 1071-5819
PURE UUID: 23d95d48-3905-4f61-a79b-0248b5a98718
ORCID for Elena Simperl: orcid.org/0000-0003-1722-947X
ORCID for Thomas Blount: orcid.org/0000-0002-4879-5012

Catalogue record

Date deposited: 04 Dec 2019 17:30
Last modified: 18 Feb 2021 17:31
