Selected modelling problems in credit scoring
Bijak, Katarzyna (2013) Selected modelling problems in credit scoring. University of Southampton, School of Management, Doctoral Thesis, 179pp.
Record type: Thesis (Doctoral)
Abstract
This research addresses three selected modelling problems that occur in credit scoring: segmentation, modelling Loss Given Default (LGD) for unsecured loans, and affordability assessment.

Segmentation, i.e. dividing the population into a number of groups and building separate scorecards for them, is usually expected to improve model performance. The most common statistical methods for segmentation are two-step approaches, in which logistic regression follows Classification and Regression Trees (CART) or Chi-square Automatic Interaction Detection (CHAID) trees. In this research, these approaches and a simultaneous method, Logistic Trees with Unbiased Selection (LOTUS), in which segmentation and scorecards are optimised at the same time, are applied to data provided by two UK banks and a European credit bureau. Model performance measures are compared to assess any improvement due to segmentation.

For unsecured retail loans, LGD is often found difficult to model. In the frequentist (classical) two-step approach, a first model (logistic regression) separates positive LGD values from zeroes, and a second model (e.g. linear regression) estimates the positive values. Alternatively, one can build a Bayesian hierarchical model, which is a more coherent approach. In this research, Bayesian methods and the frequentist approach are applied to data on personal loans provided by a UK bank. The Bayesian model generates an individual predictive distribution of LGD for each loan; its potential applications include approximating downturn LGD and stress testing LGD under Basel II.

An applicant's affordability (ability to repay) is often checked using a simple, static approach. In this research, a theoretical framework for dynamic affordability assessment is proposed. Both income and consumption are allowed to vary over time, and their changes are described with random effects models for panel data. On this basis, a simulation is run for a given applicant: the ability to repay is checked over the life of the loan and for all possible instalment amounts. As a result, a probability of default is assigned to each amount, which can help find the maximum affordable instalment. This is illustrated with an example based on artificial data.
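The frequentist two-step LGD approach mentioned in the abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the models fitted in the thesis (which uses personal-loan data from a UK bank); the variable names and the simulated data-generating process are assumptions for demonstration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

# Synthetic stand-in data: a few borrower/loan characteristics and an LGD in [0, 1].
rng = np.random.default_rng(42)
n = 5000
X = rng.normal(size=(n, 3))
p_positive = 1 / (1 + np.exp(-(0.5 * X[:, 0] - 0.3 * X[:, 1])))
is_positive = rng.binomial(1, p_positive)            # 1 if LGD > 0, 0 if LGD == 0
lgd = np.where(is_positive == 1,
               np.clip(0.4 + 0.2 * X[:, 2] + rng.normal(0, 0.1, n), 0, 1),
               0.0)

# Step 1: logistic regression separates zero LGDs from positive LGDs.
stage1 = LogisticRegression().fit(X, is_positive)

# Step 2: linear regression estimates LGD on the loans with positive losses.
mask = lgd > 0
stage2 = LinearRegression().fit(X[mask], lgd[mask])

# Combined prediction: P(LGD > 0) * E[LGD | LGD > 0].
expected_lgd = stage1.predict_proba(X)[:, 1] * stage2.predict(X)
print("Mean predicted LGD:", expected_lgd.mean())
```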
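The dynamic affordability idea can likewise be sketched on artificial data, as in the thesis itself. The random-walk dynamics, the three-missed-payments default definition, and all parameter values below are illustrative assumptions standing in for the random effects panel models estimated in the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)

def default_probability(instalment, term=36, n_sims=2000,
                        income0=2000.0, consumption0=1500.0,
                        sigma_income=100.0, sigma_consumption=80.0):
    """Estimate the probability of default (here: 3 consecutive missed
    instalments) for a given instalment amount, by simulating income and
    consumption paths over the life of the loan."""
    defaults = 0
    for _ in range(n_sims):
        income, consumption = income0, consumption0
        missed = 0
        for _ in range(term):
            income += rng.normal(0, sigma_income)            # income shock
            consumption += rng.normal(0, sigma_consumption)  # consumption shock
            surplus = income - consumption
            missed = missed + 1 if surplus < instalment else 0
            if missed >= 3:                                  # default event
                defaults += 1
                break
    return defaults / n_sims

# Scan candidate instalment amounts; the maximum affordable instalment is the
# largest amount whose default probability stays below a chosen threshold.
for amount in range(100, 900, 100):
    print(amount, default_probability(amount))
```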
Text: Final PhD thesis - Kasia Bijak.pdf (Other)
More information
Published date: August 2013
Organisations: University of Southampton, Southampton Business School
Identifiers
Local EPrints ID: 359285
URI: http://eprints.soton.ac.uk/id/eprint/359285
PURE UUID: 0ded7cc3-dc24-4f2e-bfe2-2604f4da9417
Catalogue record
Date deposited: 16 Dec 2013 13:45
Last modified: 15 Mar 2024 03:36
Contributors
Thesis advisor: Lyn C. Thomas