Instance-dependent cost-sensitive learning for detecting transfer fraud
Pages: 291-300
Authors: Hoppner, Sebastiaan; Baesens, Bart; Verbeke, Wouter; Verdonck, Tim
Date: 16 February 2022
Hoppner, Sebastiaan, Baesens, Bart, Verbeke, Wouter and Verdonck, Tim (2022) Instance-dependent cost-sensitive learning for detecting transfer fraud. European Journal of Operational Research, pp. 291-300. (doi:10.1016/j.ejor.2021.05.028).
Abstract
Card transaction fraud is a growing problem affecting card holders worldwide. Financial institutions increasingly rely on data-driven methods to develop fraud detection systems that automatically detect and block fraudulent transactions. From a machine learning perspective, detecting fraudulent transactions is a binary classification problem. Classification models are commonly trained and evaluated in terms of statistical performance measures, such as likelihood and AUC, respectively. These measures, however, do not take into account the actual business objective, which is to minimize the financial losses due to fraud. Fraud detection should instead be treated as an instance-dependent cost-sensitive classification problem, in which the costs of misclassification vary between instances and which therefore requires adapted approaches for learning a classification model. In this article, an instance-dependent threshold is derived from the instance-dependent cost matrix for transfer fraud detection; it allows the optimal cost-based decision to be made for each transaction. Two novel classifiers are presented, based on lasso-regularized logistic regression and gradient tree boosting, which directly minimize the proposed instance-dependent cost measure when learning a classification model. The proposed methods are implemented in the R packages cslogit and csboost, and compared against state-of-the-art methods on a publicly available data set from the machine learning competition website Kaggle and a proprietary card transaction data set. The results of the experiments highlight the potential of reducing fraud losses by adopting the proposed methods.
Text
cslogit_csboost - Accepted Manuscript
More information
Accepted/In Press date: 29 May 2021
e-pub ahead of print date: 12 October 2021
Published date: 16 February 2022
Identifiers
Local EPrints ID: 449397
URI: http://eprints.soton.ac.uk/id/eprint/449397
ISSN: 0377-2217
PURE UUID: be2b4b90-c617-441a-abf2-f5d0c995d2bd
Catalogue record
Date deposited: 27 May 2021 16:30
Last modified: 17 Mar 2024 06:35
Contributors
Author: Sebastiaan Hoppner
Author: Bart Baesens
Author: Wouter Verbeke
Author: Tim Verdonck