The University of Southampton
University of Southampton Institutional Repository

The horse before the cart: improving the accuracy of taxonomic directions when building tag hierarchies


Almoqhim, Fahad, Millard, David E. and Shadbolt, Nigel (2016) The horse before the cart: improving the accuracy of taxonomic directions when building tag hierarchies. The 8th Saudi Students Conference in the UK. 31 Jan - 01 Feb 2015. pp. 297-310. (doi:10.1142/9781783269150_0026).

Record type: Conference or Workshop Item (Paper)

Abstract

Content on the Web is huge and constantly growing. Building taxonomies for such content can help with navigation and organisation, but building them manually is costly and time-consuming. An alternative is to allow users to construct folksonomies: collective social classifications. Yet folksonomies are inconsistent, and their use for searching and browsing is limited. Approaches have been suggested for acquiring implicit hierarchical structures from folksonomies, but these approaches suffer from the ‘popularity-generality’ problem, in that popularity is assumed to be a proxy for generality, i.e. high-level taxonomic terms are assumed to occur more often than low-level ones. To tackle this problem, we propose in this paper an improved approach. It is based on the Heymann–Benz algorithm, and works by checking the taxonomic directions against a corpus of text. Our results show that popularity works as a proxy for generality in at most 90.91% of cases, but this can be improved to 95.45% using our approach, which should translate to higher-quality tag hierarchy structures.
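
As a rough illustration of the idea in the abstract, the Python sketch below contrasts the popularity heuristic with a corpus check on a single pair of tags. It is not the authors' implementation: the abstract does not say how the corpus is queried, so the "X such as Y" text pattern, the function names and the sample data are all assumptions made for this example.

```python
import re

def popularity_direction(tag_a, tag_b, counts):
    """Baseline heuristic: treat the more frequently used tag as the more general one.

    Returns a (parent, child) tuple.
    """
    return (tag_a, tag_b) if counts[tag_a] >= counts[tag_b] else (tag_b, tag_a)

def corpus_direction(tag_a, tag_b, corpus):
    """Look for '<parent>s such as <child>' evidence in both directions.

    The pattern is an assumption for illustration only; the abstract does not
    specify how the text corpus is checked. Returns (parent, child), or None
    when the corpus gives no clear evidence either way.
    """
    def evidence(parent, child):
        pattern = re.compile(
            r"\b" + re.escape(parent) + r"s?\s+such\s+as\s+" + re.escape(child) + r"s?\b",
            re.IGNORECASE,
        )
        return sum(len(pattern.findall(doc)) for doc in corpus)

    a_over_b = evidence(tag_a, tag_b)
    b_over_a = evidence(tag_b, tag_a)
    if a_over_b > b_over_a:
        return (tag_a, tag_b)
    if b_over_a > a_over_b:
        return (tag_b, tag_a)
    return None

def checked_direction(tag_a, tag_b, counts, corpus):
    """Prefer the corpus evidence; fall back to popularity when there is none."""
    return corpus_direction(tag_a, tag_b, corpus) or popularity_direction(tag_a, tag_b, counts)

# Hypothetical data: the specific tag is used more often than the general one.
counts = {"python": 500, "language": 120}
corpus = ["Popular languages such as Python are easy to learn."]

print(popularity_direction("language", "python", counts))       # popularity alone
print(checked_direction("language", "python", counts, corpus))  # corpus-checked
```

In this made-up case the popularity heuristic places "python" above "language", while the corpus check reverses the direction. With an empty corpus the sketch simply falls back to the popularity result.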

More information

e-pub ahead of print date: 15 December 2015
Published date: February 2016
Venue - Dates: The 8th Saudi Students Conference in the UK, 2015-01-31 - 2015-02-01
Organisations: Web & Internet Science

Identifiers

Local EPrints ID: 382247
URI: http://eprints.soton.ac.uk/id/eprint/382247
ISBN: 978-1-78326-914-3
PURE UUID: 9e8f7925-a771-47d8-8f0d-6d80ee18add3
ORCID for David E. Millard: orcid.org/0000-0002-7512-2710

Catalogue record

Date deposited: 02 Nov 2015 10:00
Last modified: 15 Mar 2024 02:59

Contributors

Author: Fahad Almoqhim
Author: David E. Millard
Author: Nigel Shadbolt
