Investigating sentence weighting components for automatic summarisation
The work described here initially formed part of a triangulation exercise to establish the effectiveness of the Query Term Order (QTO) algorithm. The methodology produced subsequently proved to be a reliable indicator of quality for summarising English web documents. We utilised the human summaries from the Document Understanding Conference data, and generated queries automatically for testing the QTO algorithm. Six sentence weighting schemes that made use of Query Term Frequency and QTO were constructed to produce system summaries, and this paper explains the process of combining and balancing the weighting components. We also examined the five automatically generated query terms in their different permutations to check whether the automatic generation of query terms introduced bias. The summaries produced were evaluated with the ROUGE-1 metric, and the results showed that using QTO in a weighting combination resulted in the best performance. We also found that a combination of several weighting components always performed better than any single weighting component.
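For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of ROUGE-1 recall, i.e. the fraction of reference-summary unigrams that also appear in the system summary. It is an illustration only, not the official ROUGE toolkit and not the authors' evaluation setup: the function name `rouge_1_recall` is hypothetical, and the lowercasing and whitespace tokenisation (no stemming, no stop-word removal) are simplifying assumptions.

```python
from collections import Counter


def rouge_1_recall(system_summary: str, reference_summary: str) -> float:
    """Simplified ROUGE-1 recall: clipped unigram overlap / reference length.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    sys_counts = Counter(system_summary.lower().split())
    ref_counts = Counter(reference_summary.lower().split())
    if not ref_counts:
        # An empty reference has nothing to recall.
        return 0.0
    # Each reference unigram is matched at most as often as it occurs
    # in the system summary (clipped counts).
    overlap = sum(min(count, sys_counts[token])
                  for token, count in ref_counts.items())
    return overlap / sum(ref_counts.values())


if __name__ == "__main__":
    score = rouge_1_recall("the cat sat", "the cat sat on the mat")
    print(f"ROUGE-1 recall: {score:.2f}")
```

In this toy example the reference has six unigrams and three of them are matched by the system summary, so the recall is 0.5.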
146-153
Liang, SF
Devlin, Siobhan
Tait, John
5 July 2006
Liang, SF, Devlin, Siobhan and Tait, John
(2006)
Investigating sentence weighting components for automatic summarisation.
Information Processing & Management, 146-153.
(doi:10.1016/j.ipm.2006.05.010).
More information
Published date: 5 July 2006
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 264982
URI: http://eprints.soton.ac.uk/id/eprint/264982
ISSN: 0306-4573
PURE UUID: 59e3c7d2-e305-4aa5-bdbd-9158aecb95fc
Catalogue record
Date deposited: 18 Dec 2007 14:07
Last modified: 14 Mar 2024 07:59
Contributors
Author:
SF Liang
Author:
Siobhan Devlin
Author:
John Tait