The use of Hellinger Distance Undersampling model to improve the classification of disease class in imbalanced medical datasets
Al-Shamaa, Zina Z R
Kurnaz, Sefer
Duru, Adil Deniz
Peppa, Nadia
Mirnezami, Alex H
Hamady, Zaed Z R
4 November 2020
Al-Shamaa, Zina Z R, Kurnaz, Sefer, Duru, Adil Deniz, Peppa, Nadia, Mirnezami, Alex H and Hamady, Zaed Z R (2020) The use of Hellinger Distance Undersampling model to improve the classification of disease class in imbalanced medical datasets. Applied Bionics and Biomechanics, 2020, [8824625]. (doi:10.1155/2020/8824625).
Abstract
Imbalanced class distribution in medical datasets is a challenging problem that hinders correct classification of disease. It arises when the number of healthy class instances is much larger than the number of disease class instances. To address this problem, we propose undersampling the healthy class instances to improve disease class classification. The proposed model is named Hellinger Distance Undersampling (HDUS). It uses the Hellinger distance to measure the resemblance between a majority class instance and its neighbouring minority class instances, in order to separate the classes effectively and boost the discrimination power for each class. An extensive experiment was conducted on four imbalanced medical datasets using three classifiers to compare HDUS with a baseline model and three state-of-the-art undersampling models. The results show that HDUS can outperform the other models in terms of sensitivity, F1 measure, and balanced accuracy.
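The idea of scoring majority class instances by their Hellinger distance to nearby minority class instances can be illustrated with a minimal sketch. This is not the authors' HDUS implementation, which the abstract does not specify. The function names (`hellinger`, `to_distribution`, `undersample_majority`), the parameters `k` and `keep`, the normalisation of each non-negative feature vector into a probability distribution, and the rule of keeping the majority instances farthest from their nearest minority neighbours are all assumptions made for illustration.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def to_distribution(x):
    """Normalise a non-negative feature vector to sum to 1 (assumption)."""
    total = sum(x)
    if total == 0:
        return [1.0 / len(x)] * len(x)
    return [v / total for v in x]


def undersample_majority(majority, minority, k=3, keep=None):
    """Hypothetical selection rule, not taken from the paper: keep the
    majority instances whose mean Hellinger distance to their k nearest
    minority instances is largest."""
    if not minority:
        return list(majority)
    if keep is None:
        keep = len(minority)  # aim for a balanced class distribution
    minority_dists = [to_distribution(m) for m in minority]
    scored = []
    for idx, x in enumerate(majority):
        px = to_distribution(x)
        nearest = sorted(hellinger(px, q) for q in minority_dists)[:k]
        scored.append((sum(nearest) / len(nearest), idx))
    # Highest mean distance first, then restore the original ordering.
    scored.sort(reverse=True)
    kept = sorted(idx for _, idx in scored[:keep])
    return [majority[i] for i in kept]


if __name__ == "__main__":
    healthy = [[9, 1], [5, 5], [1, 9]]   # toy majority class
    disease = [[1, 9], [2, 8]]           # toy minority class
    print(undersample_majority(healthy, disease, k=1, keep=2))
```

In this toy example the healthy instance that overlaps the disease instances is the one discarded. A real pipeline would also need a principled way to turn arbitrary features into distributions, which this sketch sidesteps.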
More information
Accepted/In Press date: 5 October 2020
Published date: 4 November 2020
Identifiers
Local EPrints ID: 457796
URI: http://eprints.soton.ac.uk/id/eprint/457796
ISSN: 1176-2322
PURE UUID: 5d6b4071-ad9f-43ae-bb37-4b6927292e6e
Catalogue record
Date deposited: 16 Jun 2022 17:02
Last modified: 17 Mar 2024 04:12
Contributors
Author: Zina Z R Al-Shamaa
Author: Sefer Kurnaz
Author: Adil Deniz Duru
Author: Nadia Peppa
Author: Alex H Mirnezami
Author: Zaed Z R Hamady