Cost-based feature selection for Support Vector Machines: an application in credit scoring
In this work we propose two formulations based on Support Vector Machines for simultaneous classification and feature selection that explicitly incorporate attribute acquisition costs. This is a challenging task for two main reasons: the estimation of the acquisition costs is not straightforward and may depend on multivariate factors, and the inter-dependence between variables must be taken into account for the modelling process since companies usually acquire groups of related variables rather than acquiring them individually. Mixed-integer linear programming models are proposed for constructing classifiers that constrain acquisition costs while classifying adequately. Experimental results using credit scoring datasets demonstrate the effectiveness of our methods in terms of predictive performance at a low cost compared to well-known feature selection approaches.
656–665
Maldonado, Sebastián
Pérez, Juan
Bravo, Cristian
1 September 2017
Maldonado, Sebastián, Pérez, Juan and Bravo, Cristian (2017) Cost-based feature selection for Support Vector Machines: an application in credit scoring. European Journal of Operational Research, 261 (2), 656–665. (doi:10.1016/j.ejor.2017.02.037).
Text: paper Risk SVM elsarticle - Accepted Manuscript
More information
Accepted/In Press date: 22 February 2017
e-pub ahead of print date: 27 February 2017
Published date: 1 September 2017
Organisations:
Decision Analytics & Risk, Southampton Business School
Identifiers
Local EPrints ID: 408556
URI: http://eprints.soton.ac.uk/id/eprint/408556
ISSN: 0377-2217
PURE UUID: 19e1845c-7a45-457a-add9-27a47c17e4ff
Catalogue record
Date deposited: 23 May 2017 04:03
Last modified: 16 Mar 2024 05:05
Contributors
Author: Sebastián Maldonado
Author: Juan Pérez