The University of Southampton
University of Southampton Institutional Repository

Towards knowledge discovery from WWW log data


Tao, Feng and Murtagh, Fionn (2000) Towards knowledge discovery from WWW log data. IEEE International Conference on Information Technology: Coding and Computing, Las Vegas, United States. 26 - 28 Mar 2000. pp. 302-307.

Record type: Conference or Workshop Item (Other)

Abstract

As the result of interactions between visitors and a web site, an HTTP log file contains very rich knowledge about users' on-site behavior which, if fully exploited, can improve customer service and site performance. Unlike most existing log analysis tools, which use statistical counting summaries on pages, hosts, etc., we propose a transaction model to represent users' access history and a framework to adapt data mining techniques such as sequence and association rule mining to these transactions. In this framework, all transactions are extracted from the raw log file through a series of step-by-step data preparation phases. We discuss different methods to identify a user and to separate long, convoluted sequences into semantically meaningful sessions and transactions. A new feature called interestingness is defined to model user interests in different web sections. With all the transactions imported into an adapted cube structure that has a concept hierarchy attached to each of its dimensions, it is possible to carry out multi-dimensional data mining at multiple levels of abstraction. Using interest context rules, we demonstrate the potentially significant meaning of this system prototype.
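
To make the data preparation idea concrete, the following is a minimal sketch of one common way to split a raw log into per-user sessions. It is not taken from the paper, whose exact method is not given in this record. The assumptions are: the log is in Common Log Format, a user is identified by the requesting host alone, and a gap of more than 30 minutes between requests starts a new session. The names `LOG_PATTERN`, `SESSION_TIMEOUT`, `parse_line` and `sessionize` are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Common Log Format: host ident authuser [timestamp] "METHOD path PROTOCOL" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?:\S+) (?P<path>\S+)[^"]*" \d{3} \S+'
)

# Assumed inactivity threshold; a widely used heuristic, not a value from the paper.
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Return (host, timestamp, path) for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("host"), timestamp, match.group("path")


def sessionize(lines, timeout=SESSION_TIMEOUT):
    """Group requests by host, then cut each host's request sequence at long gaps."""
    hits_by_host = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            host, timestamp, path = parsed
            hits_by_host[host].append((timestamp, path))

    sessions = []
    for host, hits in hits_by_host.items():
        hits.sort(key=lambda hit: hit[0])  # order each host's requests by time
        current, last_seen = [], None
        for timestamp, path in hits:
            # A gap longer than the timeout closes the current session.
            if last_seen is not None and timestamp - last_seen > timeout:
                sessions.append((host, current))
                current = []
            current.append(path)
            last_seen = timestamp
        if current:
            sessions.append((host, current))
    return sessions
```

Each returned pair holds a host and the ordered list of paths it requested in one session. Sequences of this kind are the sort of input that sequence or association rule mining would then operate on. Identifying users by host alone is a simplification, since proxies and shared machines can merge several visitors into one host.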

This record has no associated files available for download.

More information

Published date: 2000
Additional Information: Event Dates: March 27-29, 2000
Venue - Dates: IEEE International Conference on Information Technology: Coding and Computing, Las Vegas, United States, 2000-03-26 - 2000-03-28
Keywords: Data Mining, Knowledge discovery, Web log mining
Organisations: Electronics & Computer Science

Identifiers

Local EPrints ID: 258153
URI: http://eprints.soton.ac.uk/id/eprint/258153
PURE UUID: e9fbeb01-e44d-452c-889c-1a2f88b91bc7

Catalogue record

Date deposited: 22 Oct 2003
Last modified: 07 Jan 2022 21:11


Contributors

Author: Feng Tao
Author: Fionn Murtagh
