The University of Southampton
University of Southampton Institutional Repository

Timeline and episode-structured clinical data: Pre-processing for Data Mining and analytics

Lu, Jing
51addc48-28e4-4a31-b68b-62d4d77c4c32
Hales, Alan
66a20906-7b0e-4d23-b65a-08932f23900b
Rew, David
36dcc3ad-2379-4b61-a468-5c623d796887
Keech, Malcolm
6aa13471-c162-40b8-9363-c85a8356331d

Lu, Jing, Hales, Alan, Rew, David and Keech, Malcolm (2016) Timeline and episode-structured clinical data: Pre-processing for Data Mining and analytics. In 2016 IEEE 32nd International Conference on Data Engineering Workshops, ICDEW 2016. IEEE. pp. 64-67. (doi:10.1109/ICDEW.2016.7495618).

Record type: Conference or Workshop Item (Paper)

Abstract

Data mining has been used in the healthcare domain for diagnosis and treatment analysis, resource management and fraud detection. It brings a set of tools and techniques that can be applied to large-scale patient data to discover underlying patterns and provide healthcare professionals with an additional source of knowledge for making decisions. The Southampton Breast Cancer Data System (SBCDS), containing some 16,000 timeline-structured records, is a visually rich and highly intuitive system for the manual and automated transfer of demographic, pathology and treatment data into an episode-based structure. While expansion of the data mining capability in SBCDS is one of the objectives of our research, real-world patient data is generally incomplete and inconsistent, and contains errors. This case study focuses on the data pre-processing stage, which cleans the raw data and prepares the final dataset for use in data mining and analytics. Some initial results are given for sequential pattern mining and classification, which highlight the advantages of the approach.

This record has no associated files available for download.

More information

Published date: 20 June 2016
Additional Information: Publisher Copyright: © 2016 IEEE.
Venue - Dates: 32nd IEEE International Conference on Data Engineering Workshops, ICDEW 2016, Helsinki, Finland, 2016-05-16 - 2016-05-20
Keywords: breast cancer data, data mining, electronic patient records, Health informatics, pre-processing

Identifiers

Local EPrints ID: 447596
URI: http://eprints.soton.ac.uk/id/eprint/447596
PURE UUID: eff00d50-1e0a-4aee-a171-2d8a3973fafc
ORCID for David Rew: orcid.org/0000-0002-4518-2667

Catalogue record

Date deposited: 16 Mar 2021 17:45
Last modified: 30 Apr 2024 01:54

Contributors

Author: Jing Lu
Author: Alan Hales
Author: David Rew
Author: Malcolm Keech
