The University of Southampton
University of Southampton Institutional Repository

Does segmentation always improve model performance in credit scoring?

Bijak, Katarzyna and Thomas, Lyn C. (2012) Does segmentation always improve model performance in credit scoring? Expert Systems with Applications, 39 (3), 2433-2442. (doi:10.1016/j.eswa.2011.08.093).

Record type: Article

Abstract

Credit scoring allows for the credit risk assessment of bank customers. A single scoring model (scorecard) can be developed for the entire customer population, e.g. using logistic regression. However, it is often expected that segmentation, i.e. dividing the population into several groups and building separate scorecards for them, will improve model performance. The most common statistical methods for segmentation are two-step approaches, in which logistic regression follows Classification and Regression Trees (CART), Chi-squared Automatic Interaction Detection (CHAID) trees, or similar methods. In this research, the two-step approaches are applied alongside a new, simultaneous method in which both segmentation and scorecards are optimised at the same time: Logistic Trees with Unbiased Selection (LOTUS). For reference purposes, a single-scorecard model is used. These methods are applied to data provided by two major UK banks and one European credit bureau. The model performance measures are then compared to examine whether the segmentation methods bring any improvement. It is found that segmentation does not always improve model performance in credit scoring: for none of the analysed real-world datasets do the multi-scorecard models perform considerably better than the single-scorecard ones. Moreover, in this application, there is no difference in performance between the two-step and simultaneous approaches.
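
The comparison described in the abstract, a single scorecard against a two-step segmented model, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes synthetic data, a one-level CART-style split chosen by Gini impurity in place of a full CART or CHAID tree, logistic regression fitted by plain gradient descent, and AUC as the performance measure. All function names, parameters and the data-generating process are hypothetical.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def score(w, xi):
    """Predicted probability of the positive class (here: a 'bad' customer)."""
    return sigmoid(w[0] + sum(wj * v for wj, v in zip(w[1:], xi)))

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = score(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def auc(scores, y):
    """Share of (bad, good) pairs in which the bad case scores higher; ties count half."""
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def gini_impurity(y):
    p = sum(y) / len(y)
    return 2.0 * p * (1.0 - p)

def best_split(X, y, min_size=30):
    """CART-style stump: (feature, threshold) minimising weighted Gini impurity."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({xi[j] for xi in X}):
            left = [yi for xi, yi in zip(X, y) if xi[j] <= t]
            right = [yi for xi, yi in zip(X, y) if xi[j] > t]
            if min(len(left), len(right)) < min_size:
                continue
            cost = (len(left) * gini_impurity(left)
                    + len(right) * gini_impurity(right)) / len(y)
            if best is None or cost < best[0]:
                best = (cost, j, t)
    return best[1], best[2]

def make_data(n, rng):
    """Synthetic population (an assumption): bad rate and slope differ by segment."""
    X, y = [], []
    for _ in range(n):
        x0, x1 = rng.random(), rng.uniform(-1, 1)
        logit = (-1 + 2 * x1) if x0 <= 0.5 else (1 - 2 * x1)
        X.append([x0, x1])
        y.append(1 if rng.random() < sigmoid(logit) else 0)
    return X, y

def run_comparison(seed=0, n_train=300, n_test=300):
    rng = random.Random(seed)
    X_tr, y_tr = make_data(n_train, rng)
    X_te, y_te = make_data(n_test, rng)

    # Reference model: one scorecard for the whole population.
    single = fit_logistic(X_tr, y_tr)
    auc_single = auc([score(single, x) for x in X_te], y_te)

    # Two-step model: segment first, then fit one scorecard per segment.
    j, t = best_split(X_tr, y_tr)
    cards = {}
    for side in (True, False):
        rows = [(x, yi) for x, yi in zip(X_tr, y_tr) if (x[j] <= t) == side]
        cards[side] = fit_logistic([x for x, _ in rows], [yi for _, yi in rows])
    auc_seg = auc([score(cards[x[j] <= t], x) for x in X_te], y_te)
    return {"split": (j, t), "auc_single": auc_single, "auc_segmented": auc_seg}

if __name__ == "__main__":
    print(run_comparison())
```

Both models are scored on the same held-out sample, so the two AUC values are directly comparable. Whether the segmented model comes out ahead depends entirely on the data, which is the question the article examines on real-world datasets.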

Text
Does_segmentation_always_improve_model_performance_in_credit_scoring.pdf - Accepted Manuscript

More information

e-pub ahead of print date: 30 August 2011
Published date: 15 February 2012
Keywords: credit scoring, segmentation, logistic regression, CART, CHAID, LOTUS
Organisations: Southampton Business School

Identifiers

Local EPrints ID: 208555
URI: http://eprints.soton.ac.uk/id/eprint/208555
ISSN: 0957-4174
PURE UUID: abc9421a-c420-42d6-b07f-565b2599b521
ORCID for Katarzyna Bijak: orcid.org/0000-0003-1416-9045

Catalogue record

Date deposited: 20 Jan 2012 14:57
Last modified: 15 Mar 2024 03:36

Contributors

Author: Katarzyna Bijak
Author: Lyn C. Thomas
