IDN-Sum: Novel dataset for interactive digital narrative extractive text summarisation
University of Southampton
Revi, Ashwathy Thattamparambil
Middleton, Stuart
Millard, David
Revi, Ashwathy Thattamparambil
(2022)
IDN-Sum: Novel dataset for interactive digital narrative extractive text summarisation.
University of Southampton
doi:10.5281/zenodo.7083149
[Dataset]
Abstract
Summarizing Interactive Digital Narratives (IDN) presents unique challenges for existing text summarization models, especially in capturing interactive elements in addition to important plot points. In this paper, we describe the first IDN dataset (IDN-Sum) designed specifically for training and testing IDN text summarization algorithms. Our dataset is generated from random playthroughs of 8 IDN episodes, taken from 2 different IDN games, and consists of 10,000 documents. Playthrough documents are annotated by automatically aligning them with fan-sourced summaries using a commonly used alignment algorithm. The dataset is released as open source so that future researchers can train and test their own approaches for IDN text.
This record has no associated files available for download.
More information
Published date: 15 September 2022
Identifiers
Local EPrints ID: 469877
URI: http://eprints.soton.ac.uk/id/eprint/469877
PURE UUID: 97aa10e2-46dc-4df2-9d8b-58600cdb919d
Catalogue record
Date deposited: 27 Sep 2022 17:12
Last modified: 06 May 2023 02:00
Contributors
Creator:
Ashwathy Thattamparambil Revi
Research team head:
David Millard