Automatic key theme extraction in natural language texts

Mo, Bon Yiu (2005) Automatic key theme extraction in natural language texts. University of Southampton, Doctoral Thesis.

Record type: Thesis (Doctoral)

Abstract

Information is the most powerful resource available to an organisation. Problems arise when the management effort needed to organise the mass of information held in data repositories grows to a level that is impossible to maintain. Instead of providing value to an organisation, the information can then serve to confuse and hamper. This research presents automatic key theme extraction as a method for information management, specifically the extraction of pertinent information from natural language texts. The motivation for the research was to improve the accuracy of automatic key theme extraction from natural language texts. Performance was evaluated in an industrial context provided by the content management system of Active Navigation Ltd. The author has produced an architecture for theme extraction that uses a pipeline of individual processing components, each adhering to a lossless information strategy. This lossless architecture has been shown to extract key themes from natural language texts with higher accuracy than that achieved by Active Navigation. The accurate extraction of key themes is essential because it provides a solid base for other Active Navigation information navigation tasks, including advanced search, categorisation, building summaries, finding related documents, and dynamic linking. Improving these navigation techniques increases the effectiveness of the content management system.

Text

1012885.pdf - Version of Record

Available under the University of Southampton Thesis Licence.

Download (1MB)