The University of Southampton
University of Southampton Institutional Repository

Using Windmill Expansion for Document Retrieval

Liang, Shao, Smart, Paul, Russell, Alistair and Shadbolt, Nigel (2009) Using Windmill Expansion for Document Retrieval. The Open Information Systems Journal, 3, 1-8.

Record type: Article

Abstract

SEMIOTIKS aims to use online information to support the crucial decision-making of the military and civilian agencies involved in the humanitarian removal of landmines in areas of conflict throughout the world. An analysis of the type of information required for such a task has given rise to four main areas of research: information retrieval, document annotation, summarisation and visualisation. The first stage of the research has focused on information retrieval, and a new algorithm, “Windmill Expansion” (WE), has been proposed for this purpose. The algorithm uses retrieval feedback techniques for automated query expansion in order to improve the effectiveness of information retrieval. WE is based on the extraction of human-generated written phrases for automated query expansion. Top and Second Level expansion terms have been generated and their usefulness evaluated. The evaluation has concentrated on measuring the degree of overlap between the retrieved URLs: the lower the overlap, the more useful the information provided. The Top Level expansion terms were found to provide 90% of useful URLs, and the Second Level 83% of useful URLs. Although there was a decline in useful URLs from the Top Level to the Second Level, the quantity of relevant information retrieved has increased. The originality of SEMIOTIKS lies in its use of the WE algorithm to help non-domain specific experts automatically explore domain words for relevant and precise information retrieval.
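
The abstract does not define how URL overlap is computed, so the following is only a minimal sketch of one plausible reading: the share of URLs returned for an expanded query that were not already returned for the original query. The function name `novel_url_fraction` and the example URLs are hypothetical and are not taken from the paper.

```python
def novel_url_fraction(baseline_urls, expanded_urls):
    """Return the fraction of expanded-query URLs absent from the baseline.

    Assumption: "overlap" means URLs shared by the two result sets, so a
    higher returned value corresponds to lower overlap. This is an
    illustrative measure, not the paper's stated formula.
    """
    baseline = set(baseline_urls)
    expanded = set(expanded_urls)
    if not expanded:
        # No results for the expanded query: nothing new was retrieved.
        return 0.0
    return len(expanded - baseline) / len(expanded)


# Hypothetical result lists for an original query and one expansion term.
baseline_results = [
    "http://example.org/a",
    "http://example.org/b",
    "http://example.org/c",
]
expanded_results = [
    "http://example.org/c",
    "http://example.org/d",
    "http://example.org/e",
    "http://example.org/f",
]

fraction = novel_url_fraction(baseline_results, expanded_results)
```

Under this reading, three of the four expanded-query URLs are new, so `fraction` is 0.75. Using sets ignores ranking and duplicates, which keeps the sketch simple but discards any position information a real evaluation might use.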

Text
TOISJ-final.pdf - Accepted Manuscript
Download (182kB)

More information

Published date: 21 April 2009
Keywords: information retrieval, query expansion, retrieval feedback, humanitarian demining
Organisations: Web & Internet Science, Agents, Interactions & Complexity

Identifiers

Local EPrints ID: 267276
URI: http://eprints.soton.ac.uk/id/eprint/267276
ISSN: 1874-1339
PURE UUID: 25a6951b-334e-4881-b733-4d459a1543ff
ORCID for Paul Smart: orcid.org/0000-0001-9989-5307

Catalogue record

Date deposited: 15 Apr 2009 13:38
Last modified: 15 Mar 2024 03:15

Contributors

Author: Shao Liang
Author: Paul Smart
Author: Alistair Russell
Author: Nigel Shadbolt

Contact ePrints Soton: eprints@soton.ac.uk

ePrints Soton supports OAI 2.0 with a base URL of http://eprints.soton.ac.uk/cgi/oai2

This repository has been built using EPrints software, developed at the University of Southampton, but available to everyone to use.
