A study on the application of data mining techniques for classification and clustering of medical data
Pages: 1-14
Koukouvinos, Christos
Massou, Efthalia
Mylona, Kalliopi
May 2010
Koukouvinos, Christos, Massou, Efthalia and Mylona, Kalliopi (2010) A study on the application of data mining techniques for classification and clustering of medical data. Journal of Applied Probability & Statistics, 5 (1), 1-14.
Abstract
Data mining is the science of extracting nontrivial, previously unsuspected, and ultimately comprehensible information from large databases and applying it to decision making. This new discipline plays an essential role in exploring and interpreting massive medical data sets. This paper applies data mining techniques to the analysis of the annual trauma data for Greece for the year 2005. The data set consists of 6334 records, with 25 variables and a binary response variable (death or not). In our study, different data mining techniques are implemented, and decision trees, classification rules and clusters are produced. The results of C&RT, CHAID, C5.0 and QUEST are evaluated both before and after feature selection methods are applied to the data set. For clustering, the EM and K-means algorithms are used to identify valuable clusters of records.
More information
Published date: May 2010
Organisations:
Statistics
Identifiers
Local EPrints ID: 336773
URI: http://eprints.soton.ac.uk/id/eprint/336773
ISSN: 1930-6792
PURE UUID: a50dbbca-8d54-4408-9b70-c959a84e9bba
Catalogue record
Date deposited: 04 Apr 2012 15:45
Last modified: 11 Dec 2021 00:04
Contributors
Author: Christos Koukouvinos
Author: Efthalia Massou
Author: Kalliopi Mylona