Timeline and episode-structured clinical data: Pre-processing for Data Mining and analytics
Pages: 64-67
Lu, Jing
51addc48-28e4-4a31-b68b-62d4d77c4c32
Hales, Alan
66a20906-7b0e-4d23-b65a-08932f23900b
Rew, David
36dcc3ad-2379-4b61-a468-5c623d796887
Keech, Malcolm
6aa13471-c162-40b8-9363-c85a8356331d
20 June 2016
Lu, Jing, Hales, Alan, Rew, David and Keech, Malcolm (2016) Timeline and episode-structured clinical data: Pre-processing for Data Mining and analytics. In 2016 IEEE 32nd International Conference on Data Engineering Workshops, ICDEW 2016. IEEE. (doi:10.1109/ICDEW.2016.7495618).
Record type: Conference or Workshop Item (Paper)
Abstract
Data mining has been used in the healthcare domain for diagnosis and treatment analysis, resource management and fraud detection. It brings a set of tools and techniques that can be applied to large-scale patient data to discover underlying patterns and provide healthcare professionals with an additional source of knowledge for making decisions. The Southampton Breast Cancer Data System (SBCDS), containing some 16,000 timeline-structured records, is a visually rich and highly intuitive system for the manual and automated transfer of demographic, pathology and treatment data into an episode-based structure. While expansion of the data mining capability in SBCDS is one of the objectives of our research, real-world patient data is generally incomplete and inconsistent, and contains errors. This case study focuses on the data pre-processing stage, in which the raw data is cleaned and the final dataset is prepared for use in data mining and analytics. Some initial results are given for sequential pattern mining and classification, which highlight the advantages of the approach.
More information
Published date: 20 June 2016
Additional Information: Publisher Copyright: © 2016 IEEE.
Venue - Dates: 32nd IEEE International Conference on Data Engineering Workshops, ICDEW 2016, Helsinki, Finland, 2016-05-16 - 2016-05-20
Keywords: breast cancer data, data mining, electronic patient records, health informatics, pre-processing
Identifiers
Local EPrints ID: 447596
URI: http://eprints.soton.ac.uk/id/eprint/447596
PURE UUID: eff00d50-1e0a-4aee-a171-2d8a3973fafc
Catalogue record
Date deposited: 16 Mar 2021 17:45
Last modified: 30 Apr 2024 01:54
Contributors
Author: Jing Lu
Author: Alan Hales
Author: Malcolm Keech