Towards knowledge discovery from WWW log data
Pages: 302-307
Tao, Feng and Murtagh, Fionn (2000) Towards knowledge discovery from WWW log data. IEEE International Conference on Information Technology: Coding and Computing, Las Vegas, United States. 26 - 28 Mar 2000.
Record type: Conference or Workshop Item (Other)
Abstract
As the result of interactions between visitors and a web site, an HTTP log file contains rich knowledge about users' on-site behavior, which, if fully exploited, can improve customer service and site performance. Unlike most existing log analysis tools, which rely on statistical counting summaries of pages, hosts, etc., we propose a transaction model to represent users' access history and a framework to adapt data mining techniques such as sequence and association rule mining to these transactions. In this framework, all transactions are extracted from the raw log file through a series of step-by-step data preparation phases. We discuss different methods to identify a user and to separate long, convoluted sequences into semantically meaningful sessions and transactions. A new feature called interestingness is defined to model user interest in different web sections. With all the transactions imported into an adapted cube structure that has a concept hierarchy attached to each of its dimensions, it is possible to carry out multi-dimensional data mining at multiple levels of abstraction. Using interest context rules, we demonstrate the potential significance of this system prototype.
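The session-separation step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method. It assumes that a user is identified by host address alone and that a session ends after 30 minutes of inactivity; the abstract specifies neither choice. The function name `split_sessions`, the constant `SESSION_TIMEOUT`, and the sample requests are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a 30-minute inactivity gap ends a session. The abstract does not
# say which cutoff, if any, the authors use.
SESSION_TIMEOUT = timedelta(minutes=30)


def split_sessions(requests, timeout=SESSION_TIMEOUT):
    """Group (host, timestamp, url) requests into per-host sessions.

    Assumption: the host address stands in for the user. This is only one of
    several possible user-identification methods.
    """
    by_host = defaultdict(list)
    for host, ts, url in requests:
        by_host[host].append((ts, url))

    sessions = []
    for host, hits in by_host.items():
        hits.sort(key=lambda hit: hit[0])  # order each host's requests by time
        current = [hits[0][1]]
        last_ts = hits[0][0]
        for ts, url in hits[1:]:
            if ts - last_ts > timeout:
                # Gap is too long: close the current session and start a new one.
                sessions.append((host, current))
                current = []
            current.append(url)
            last_ts = ts
        sessions.append((host, current))
    return sessions


# Hypothetical sample data, invented for illustration only.
requests = [
    ("10.0.0.1", datetime(2000, 3, 27, 9, 0), "/index.html"),
    ("10.0.0.1", datetime(2000, 3, 27, 9, 5), "/products/a.html"),
    ("10.0.0.2", datetime(2000, 3, 27, 9, 6), "/index.html"),
    ("10.0.0.1", datetime(2000, 3, 27, 11, 0), "/support/faq.html"),
]

for host, pages in split_sessions(requests):
    print(host, pages)
```

Each resulting page list could then serve as one transaction for sequence or association rule mining. A real pipeline would first parse the raw log lines and would likely combine the host with other fields to tell users apart.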
More information
Published date: 2000
Additional Information:
Event Dates: March 27-29, 2000
Venue - Dates:
IEEE International Conference on Information Technology: Coding and Computing, Las Vegas, United States, 2000-03-26 - 2000-03-28
Keywords:
Data Mining, Knowledge discovery, Web log mining
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 258153
URI: http://eprints.soton.ac.uk/id/eprint/258153
PURE UUID: e9fbeb01-e44d-452c-889c-1a2f88b91bc7
Catalogue record
Date deposited: 22 Oct 2003
Last modified: 07 Jan 2022 21:11
Contributors
Author:
Feng Tao
Author:
Fionn Murtagh