The University of Southampton
University of Southampton Institutional Repository

PDFOS: PDF estimation based over-sampling for imbalanced two-class problems

Gao, Ming, Hong, Xia, Chen, Sheng, Harris, C.J. and Khalaf, Emad (2014) PDFOS: PDF estimation based over-sampling for imbalanced two-class problems. Neurocomputing, 138, 248-259. (doi:10.1016/j.neucom.2014.02.006).

Record type: Article

Abstract

This contribution proposes a novel probability density function (PDF) estimation based over-sampling (PDFOS) approach for two-class imbalanced classification problems. The classical Parzen-window kernel function is adopted to estimate the PDF of the positive class. Then, according to the estimated PDF, synthetic instances are generated as additional training data. The essential concept is to re-balance the class distribution of the original imbalanced data set under the principle that the synthetic data samples follow the same statistical properties as the original positive-class samples. Based on the over-sampled training data, a radial basis function (RBF) classifier is constructed by applying the orthogonal forward selection procedure, in which the classifier's structure and the parameters of the RBF kernels are determined using a particle swarm optimisation algorithm based on the criterion of minimising the leave-one-out misclassification rate. The effectiveness of the proposed PDFOS approach is demonstrated by an empirical study on several imbalanced data sets.
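
The over-sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the kernel shape or how the kernel width is chosen, so the Gaussian kernel, the per-feature rule-of-thumb bandwidth, the toy data, and all function names below are assumptions made for illustration only. The sketch estimates the positive-class density with a Parzen window and draws synthetic points from that estimate by picking a stored positive sample at random and perturbing it with kernel-shaped noise.

```python
import math
import random
import statistics


def rule_of_thumb_bandwidths(samples):
    """Per-feature kernel widths from Silverman's rule of thumb.

    Assumption: the bandwidth selection used in the paper is not given in
    the abstract, so this common heuristic stands in for it.
    """
    n = len(samples)
    d = len(samples[0])
    factor = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    return [factor * statistics.stdev(column) for column in zip(*samples)]


def parzen_density(x, samples, bandwidths):
    """Gaussian Parzen-window estimate of the density at point x."""
    norm = 1.0
    for h in bandwidths:
        norm *= h * math.sqrt(2.0 * math.pi)
    total = 0.0
    for s in samples:
        exponent = sum(((xi - si) / h) ** 2
                       for xi, si, h in zip(x, s, bandwidths))
        total += math.exp(-0.5 * exponent)
    return total / (len(samples) * norm)


def oversample(samples, n_new, bandwidths, rng):
    """Draw n_new points from the Parzen estimate.

    The estimate is an equal-weight mixture of Gaussians centred on the
    stored samples, so sampling means: choose a centre uniformly at
    random, then add Gaussian noise with the kernel's width.
    """
    synthetic = []
    for _ in range(n_new):
        centre = rng.choice(samples)
        synthetic.append([rng.gauss(c, h) for c, h in zip(centre, bandwidths)])
    return synthetic


# Hypothetical toy data: 4 positive (minority) samples against 20 negatives.
positive = [[1.0, 2.0], [1.2, 1.9], [0.8, 2.3], [1.1, 2.1]]
n_negative = 20

rng = random.Random(0)  # fixed seed so the run is repeatable
widths = rule_of_thumb_bandwidths(positive)
synthetic = oversample(positive, n_negative - len(positive), widths, rng)
balanced_positive = positive + synthetic  # now as many positives as negatives
```

Because the synthetic points are drawn from the estimated density rather than interpolated between neighbours, they are meant to share the statistical properties of the observed positive class. The classifier stage (orthogonal forward selection with particle swarm optimisation) is outside the scope of this sketch.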

Text
Neurocom2014-Aug.pdf - Version of Record
Restricted to Repository staff only

More information

Published date: 22 August 2014
Organisations: Southampton Wireless Group

Identifiers

Local EPrints ID: 364693
URI: http://eprints.soton.ac.uk/id/eprint/364693
ISSN: 0925-2312
PURE UUID: 71eb652f-6498-4bdd-b9b6-146dc17dcb01

Catalogue record

Date deposited: 09 May 2014 10:14
Last modified: 25 Nov 2019 20:43

Contributors

Author: Ming Gao
Author: Xia Hong
Author: Sheng Chen
Author: C.J. Harris
Author: Emad Khalaf
