PDFOS: PDF estimation based over-sampling for imbalanced two-class problems
Pages: 248-259
Gao, Ming, Hong, Xia, Chen, Sheng, Harris, C.J. and Khalaf, Emad (2014) PDFOS: PDF estimation based over-sampling for imbalanced two-class problems. Neurocomputing, 138, 248-259. (doi:10.1016/j.neucom.2014.02.006).
Abstract
This contribution proposes a novel probability density function (PDF) estimation based over-sampling (PDFOS) approach for two-class imbalanced classification problems. The classical Parzen-window kernel function is adopted to estimate the PDF of the positive class. Synthetic instances are then generated according to the estimated PDF and used as additional training data. The essential concept is to re-balance the class distribution of the original imbalanced data set under the principle that the synthetic data samples follow the same statistical properties as the original positive-class data. Based on the over-sampled training data, a radial basis function (RBF) classifier is constructed by applying the orthogonal forward selection procedure, in which the classifier's structure and the parameters of the RBF kernels are determined using a particle swarm optimisation algorithm based on the criterion of minimising the leave-one-out misclassification rate. The effectiveness of the proposed PDFOS approach is demonstrated by an empirical study on several imbalanced data sets.
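The over-sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian Parzen-window kernel with a diagonal covariance (one scale per feature) and a Silverman-style rule-of-thumb bandwidth, neither of which is stated in this record. The function name `parzen_oversample` and the toy data are hypothetical. Drawing from a Parzen-window estimate amounts to picking one observed positive sample at random as the kernel centre and perturbing it with kernel-shaped noise.

```python
import random
import statistics


def parzen_oversample(pos, n_new, bandwidth=None, seed=0):
    """Draw n_new synthetic points from a Gaussian Parzen-window estimate of pos.

    pos is a list of equal-length feature lists (the positive class).
    """
    rng = random.Random(seed)
    n, d = len(pos), len(pos[0])
    if bandwidth is None:
        # Assumption: Silverman-style rule of thumb, not the paper's bandwidth selection.
        bandwidth = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    # Assumption: diagonal kernel, i.e. each feature's sample standard
    # deviation scaled by the bandwidth.
    scales = [bandwidth * statistics.stdev([row[j] for row in pos]) for j in range(d)]
    synthetic = []
    for _ in range(n_new):
        # Sampling from the kernel mixture: choose a kernel centre uniformly
        # among the observed positives, then add Gaussian noise around it.
        centre = rng.choice(pos)
        synthetic.append([rng.gauss(centre[j], scales[j]) for j in range(d)])
    return synthetic


# Hypothetical toy data: 20 positive samples in 2-D, against 200 negatives.
toy_rng = random.Random(1)
pos = [[toy_rng.gauss(2.0, 0.5), toy_rng.gauss(2.0, 0.5)] for _ in range(20)]
n_neg = 200

# Generate enough synthetic positives to match the negative class size.
synthetic = parzen_oversample(pos, n_new=n_neg - len(pos))
balanced_pos = pos + synthetic
print(len(balanced_pos))
```

The balanced positive set would then be combined with the negative samples to train a classifier. The RBF classifier construction and particle swarm optimisation mentioned in the abstract are outside the scope of this sketch.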
Text
Neurocom2014-Aug.pdf
- Version of Record
Restricted to Repository staff only
More information
Published date: 22 August 2014
Organisations:
Southampton Wireless Group
Identifiers
Local EPrints ID: 364693
URI: http://eprints.soton.ac.uk/id/eprint/364693
ISSN: 0925-2312
PURE UUID: 71eb652f-6498-4bdd-b9b6-146dc17dcb01
Catalogue record
Date deposited: 09 May 2014 10:14
Last modified: 14 Mar 2024 16:39
Contributors
Author: Ming Gao
Author: Xia Hong
Author: Sheng Chen
Author: C.J. Harris
Author: Emad Khalaf