University of Southampton Institutional Repository

Instance-dependent cost-sensitive learning for detecting transfer fraud

Hoppner, Sebastiaan, Baesens, Bart, Verbeke, Wouter and Verdonck, Tim (2022) Instance-dependent cost-sensitive learning for detecting transfer fraud. European Journal of Operational Research, 291-300. (doi:10.1016/j.ejor.2021.05.028).

Record type: Article

Abstract

Card transaction fraud is a growing problem affecting card holders worldwide. Financial institutions increasingly rely on data-driven methods to develop fraud detection systems that can automatically detect and block fraudulent transactions. From a machine learning perspective, detecting fraudulent transactions is a binary classification problem. Classification models are commonly trained and evaluated in terms of statistical performance measures, such as likelihood and AUC, respectively. These measures, however, do not take into account the actual business objective, which is to minimize the financial losses due to fraud. Fraud detection should instead be treated as an instance-dependent cost-sensitive classification problem, in which the costs of misclassification vary between instances and which therefore requires adapted approaches for learning a classification model. In this article, an instance-dependent threshold is derived from the instance-dependent cost matrix for transfer fraud detection, which allows the optimal cost-based decision to be made for each transaction. Two novel classifiers are presented, based on lasso-regularized logistic regression and gradient tree boosting, which directly minimize the proposed instance-dependent cost measure when learning a classification model. The proposed methods are implemented in the R packages cslogit and csboost, and compared against state-of-the-art methods on a publicly available data set from the machine learning competition website Kaggle and a proprietary card transaction data set. The results of the experiments highlight the potential of reducing fraud losses by adopting the proposed methods.
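
The idea of an instance-dependent threshold can be illustrated with a minimal Python sketch. It is not the article's code and not the cslogit or csboost API. The threshold formula is the general minimum-expected-cost rule for a 2x2 cost matrix. The specific cost values used in `transfer_costs` (a fixed handling cost for every flagged transaction, the transferred amount for every missed fraud, zero otherwise) are an assumption made for illustration and are not stated in this record. All names (`CostMatrix`, `cost_threshold`, `transfer_costs`, `decide`, `average_cost`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CostMatrix:
    """Costs of the four possible outcomes for ONE transaction (hypothetical layout)."""
    tp: float  # flagged, actually fraud
    fp: float  # flagged, actually legitimate
    fn: float  # not flagged, actually fraud
    tn: float  # not flagged, actually legitimate


def cost_threshold(c: CostMatrix) -> float:
    """Fraud probability above which flagging has the lower expected cost.

    Flagging costs p*tp + (1-p)*fp in expectation, not flagging costs
    p*fn + (1-p)*tn; solving for p gives the ratio below.
    """
    denom = (c.fp - c.tn) + (c.fn - c.tp)
    if denom <= 0:
        raise ValueError("cost matrix does not define a usable threshold")
    return (c.fp - c.tn) / denom


def transfer_costs(amount: float, fixed_cost: float) -> CostMatrix:
    """ASSUMED cost matrix: flagging always costs `fixed_cost`,
    a missed fraud costs the transferred `amount`."""
    return CostMatrix(tp=fixed_cost, fp=fixed_cost, fn=amount, tn=0.0)


def decide(p_fraud: float, c: CostMatrix) -> bool:
    """Flag the transaction when its fraud probability exceeds its own threshold."""
    return p_fraud > cost_threshold(c)


def average_cost(probs: Sequence[float], labels: Sequence[int],
                 amounts: Sequence[float], fixed_cost: float) -> float:
    """Mean realised cost of the cost-based decisions over a labelled sample."""
    total = 0.0
    for p, y, a in zip(probs, labels, amounts):
        c = transfer_costs(a, fixed_cost)
        if decide(p, c):
            total += c.tp if y else c.fp
        else:
            total += c.fn if y else c.tn
    return total / len(probs)
```

Under the assumed cost matrix the threshold reduces to `fixed_cost / amount`, so a large transfer is flagged at a much lower predicted fraud probability than a small one. A single global cut-off such as 0.5 cannot express this.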

Text
cslogit_csboost - Accepted Manuscript

More information

Accepted/In Press date: 29 May 2021
e-pub ahead of print date: 12 October 2021
Published date: 16 February 2022

Identifiers

Local EPrints ID: 449397
URI: http://eprints.soton.ac.uk/id/eprint/449397
ISSN: 0377-2217
PURE UUID: be2b4b90-c617-441a-abf2-f5d0c995d2bd
ORCID for Bart Baesens: orcid.org/0000-0002-5831-5668

Catalogue record

Date deposited: 27 May 2021 16:30
Last modified: 17 Mar 2024 06:35

Contributors

Author: Sebastiaan Hoppner
Author: Bart Baesens
Author: Wouter Verbeke
Author: Tim Verdonck
