The University of Southampton
University of Southampton Institutional Repository

Sample selection bias in credit scoring models


Banasik, J., Crook, J. and Thomas, L. (2003) Sample selection bias in credit scoring models. Journal of the Operational Research Society, 54 (8), 822-832. (doi:10.1057/palgrave.jors.2601578).

Record type: Article

Abstract

One of the aims of credit scoring models is to predict the probability of repayment of any applicant and yet such models are usually parameterised using a sample of accepted applicants only. This may lead to biased estimates of the parameters. In this paper we examine two issues. First, we compare the classification accuracy of a model based only on accepted applicants, relative to one based on a sample of all applicants. We find only a minimal difference, given the cutoff scores for the old model used by the data supplier. Using a simulated model we examine the predictive performance of models estimated from bands of applicants, ranked by predicted creditworthiness. We find that the lower the risk band of the training sample, the less accurate the predictions for all applicants. We also find that the lower the risk band of the training sample, the greater the overestimate of the true performance of the model, when tested on a sample of applicants within the same risk band, as a financial institution would do. The overestimation may be very large. Second, we examine the predictive accuracy of a bivariate probit model with selection (BVP). This parameterises the accept-reject model allowing for (unknown) omitted variables to be correlated with those of the original good-bad model. The BVP model may improve accuracy if the loan officer has overridden a scoring rule. We find that a small improvement when using the BVP model is sometimes possible.
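The band effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulated model: the data-generating process, the 30% acceptance band, the single score variable `x`, and all function names are assumptions made only for illustration. It fits a one-variable logistic model by Newton's method, once on all applicants and once on the lowest-risk band only, and then compares classification accuracy on all applicants with accuracy measured inside the band.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def simulate(n, rng):
    # Hypothetical applicants: score x ~ N(0, 1). Repayment probability
    # rises with x below zero and is flat above it (an assumed shape,
    # chosen so that a low-risk band is uninformative about high risk).
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p_good = sigmoid(1.0 + 2.0 * min(x, 0.0))
        data.append((x, 1 if rng.random() < p_good else 0))
    return data

def fit_logistic(data, iters=25):
    # Newton-Raphson for logit P(good) = a + b * x.
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in data:
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def accuracy(model, data):
    # Share of cases where "predicted good" (fitted logit > 0) matches y.
    a, b = model
    hits = sum(1 for x, y in data if (a + b * x > 0.0) == (y == 1))
    return hits / len(data)

rng = random.Random(0)
train = simulate(20000, rng)
test = simulate(20000, rng)

# Lowest-risk band: the top 30% of training applicants ranked by score.
cutoff = sorted(x for x, _ in train)[int(0.7 * len(train))]
train_band = [(x, y) for x, y in train if x >= cutoff]
test_band = [(x, y) for x, y in test if x >= cutoff]

full_model = fit_logistic(train)
band_model = fit_logistic(train_band)

acc_full_all = accuracy(full_model, test)       # all-applicant model, all applicants
acc_band_all = accuracy(band_model, test)       # band model, all applicants
acc_band_band = accuracy(band_model, test_band) # band model, tested within its own band

print("all-applicant model on all applicants:", round(acc_full_all, 3))
print("band model on all applicants:         ", round(acc_band_all, 3))
print("band model within its own band:       ", round(acc_band_band, 3))
```

Under these assumptions, the band-trained model should score noticeably better when tested inside its own band than when tested on all applicants, which is the kind of overestimate the abstract refers to. The size of the gap depends entirely on the assumed data-generating process and says nothing about the magnitudes reported in the paper.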

This record has no associated files available for download.

More information

Published date: 2003
Keywords: credit scoring, reject inference, sample selection

Identifiers

Local EPrints ID: 35912
URI: http://eprints.soton.ac.uk/id/eprint/35912
ISSN: 0160-5682
PURE UUID: a3a79b91-6d8f-4c3b-8199-b716ee45f7e6

Catalogue record

Date deposited: 23 May 2006
Last modified: 15 Mar 2024 07:55

Contributors

Author: J. Banasik
Author: J. Crook
Author: L. Thomas
