The University of Southampton
University of Southampton Institutional Repository

Automatic key theme extraction in natural language texts

Information is the most powerful resource available to an organisation. Problems arise when the effort needed to organise the mass of information held in data repositories grows to a level that is impossible to maintain. Instead of providing value to an organisation, the information can then serve to confuse and hamper. This research presents automatic key theme extraction as a method for information management, specifically the extraction of pertinent information from natural language texts. The motivation for this research was to improve the accuracy of automatic key theme extraction in natural language texts. Performance was evaluated against an industrial context, which was provided by Active Navigation Ltd, a content management system. The author has produced an architecture for theme extraction that uses a pipeline of individual processing components, each adhering to a lossless information strategy. This lossless architecture has been shown to extract key themes from natural language texts with higher accuracy than that of Active Navigation. Accurate extraction of key themes is essential because it provides a solid base for other Active Navigation information navigation tasks, including advanced search, categorisation, building summaries, finding related documents, and dynamic linking. Improving these navigation techniques increases the effectiveness of the content management system.

Mo, Bon Yiu (2005) Automatic key theme extraction in natural language texts. University of Southampton, Doctoral Thesis.

Record type: Thesis (Doctoral)

Text
1012885.pdf - Version of Record
Available under License University of Southampton Thesis Licence.
Download (1MB)

More information

Published date: 2005

Identifiers

Local EPrints ID: 465889
URI: http://eprints.soton.ac.uk/id/eprint/465889
PURE UUID: 4fd4216e-fa5c-45d3-9b84-974a4463e695

Catalogue record

Date deposited: 05 Jul 2022 03:28
Last modified: 16 Mar 2024 20:25

Contributors

Author: Bon Yiu Mo
