Using Windmill Expansion for Document Retrieval
SEMIOTIKS aims to utilise online information to support the crucial decision-making of the military and civilian agencies involved in the humanitarian removal of landmines in areas of conflict throughout the world. An analysis of the type of information required for such a task has given rise to four main areas of research: information retrieval, document annotation, summarisation and visualisation. The first stage of the research has focused on information retrieval, and a new algorithm, "Windmill Expansion" (WE), has been proposed for this purpose. The algorithm uses retrieval feedback techniques for automated query expansion in order to improve the effectiveness of information retrieval. WE is based on the extraction of human-generated written phrases for automated query expansion. Top Level and Second Level expansion terms have been generated and their usefulness evaluated. The evaluation has concentrated on measuring the degree of overlap between the retrieved URLs: the lower the overlap, the more useful the information provided. The Top Level expansion terms were found to provide 90% useful URLs, and the Second Level expansion terms 83% useful URLs. Although the proportion of useful URLs declined from the Top Level to the Second Level, the quantity of relevant information retrieved increased. The originality of SEMIOTIKS lies in its use of the WE algorithm to help non-domain-specific experts automatically explore domain words for relevant and precise information retrieval.
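The abstract does not state how the overlap between retrieved URLs is computed, so the following is only a minimal sketch of one plausible reading: the share of URLs returned by an expanded query that the unexpanded query did not already return. The function name `novel_url_fraction` and the example URLs are hypothetical and are not taken from the paper.

```python
from typing import Iterable


def novel_url_fraction(baseline_urls: Iterable[str], expanded_urls: Iterable[str]) -> float:
    """Fraction of the expanded query's URLs that the baseline query did not return.

    Assumption: a URL counts as "useful" when it does not overlap with the
    baseline result set. The paper's actual criterion may differ.
    """
    baseline = set(baseline_urls)
    expanded = set(expanded_urls)
    if not expanded:
        # No results for the expanded query, so nothing new was contributed.
        return 0.0
    return len(expanded - baseline) / len(expanded)


# Hypothetical result lists, invented purely for illustration.
baseline = ["http://example.org/a", "http://example.org/b", "http://example.org/c"]
expanded = [
    "http://example.org/b",
    "http://example.org/d",
    "http://example.org/e",
    "http://example.org/f",
]

print(f"{novel_url_fraction(baseline, expanded):.2f}")  # prints 0.75
```

Under this reading, a lower overlap gives a higher fraction; the 90% and 83% figures in the abstract may have been obtained with a different definition.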
Keywords: information retrieval, query expansion, retrieval feedback, humanitarian demining
1-8
Liang, Shao
543554b4-0c4f-4520-a1e6-c084a86df406
Smart, Paul
cd8a3dbf-d963-4009-80fb-76ecc93579df
Russell, Alistair
a9de4cad-5143-4e6c-a7e0-f1667a3abc1d
Shadbolt, Nigel
5c5acdf4-ad42-49b6-81fe-e9db58c2caf7
21 April 2009
Liang, Shao, Smart, Paul, Russell, Alistair and Shadbolt, Nigel (2009) Using Windmill Expansion for Document Retrieval. The Open Information Systems Journal, 3, 1-8.
Text
TOISJ-final.pdf
- Accepted Manuscript
More information
Published date: 21 April 2009
Organisations:
Web & Internet Science, Agents, Interactions & Complexity
Identifiers
Local EPrints ID: 267276
URI: http://eprints.soton.ac.uk/id/eprint/267276
ISSN: 1874-1339
PURE UUID: 25a6951b-334e-4881-b733-4d459a1543ff
Catalogue record
Date deposited: 15 Apr 2009 13:38
Last modified: 15 Mar 2024 03:15
Contributors
Author: Shao Liang
Author: Paul Smart
Author: Alistair Russell
Author: Nigel Shadbolt