Data engineering for fraud detection
Baesens, Bart
Höppner, Sebastiaan
Verdonck, Tim
November 2021
Baesens, Bart, Höppner, Sebastiaan and Verdonck, Tim (2021) Data engineering for fraud detection. Decision Support Systems, 150, [113492]. (doi:10.1016/j.dss.2021.113492).
Abstract
Financial institutions increasingly rely on data-driven methods to develop fraud detection systems that automatically detect and block fraudulent transactions. From a machine learning perspective, detecting suspicious transactions is a binary classification problem, so many techniques can be applied. Interpretability, however, is of utmost importance: management needs it to have confidence in the model and to design fraud prevention strategies. Moreover, models that let fraud experts understand why a case is flagged as suspicious greatly facilitate the investigation of suspicious transactions. We therefore propose several data engineering techniques that improve the performance of an analytical model while retaining its interpretability. Our data engineering process is decomposed into several feature engineering and instance engineering steps. We illustrate the performance improvement these steps yield for popular analytical models on a real payment transactions data set.
Text: Data_Engineering_for_Payment_Transactions_Fraud-12 (Accepted Manuscript)
More information
Accepted/In Press date: 7 January 2021
e-pub ahead of print date: 12 January 2021
Published date: November 2021
Additional Information:
Funding Information:
The authors gratefully acknowledge the financial support from the BNP Paribas Fortis Research Chair in Fraud Analytics at KU Leuven and the Internal Funds KU Leuven under grant C16/15/068.
Publisher Copyright:
© 2021
Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.
Keywords:
Cost-based model evaluation, Decision analysis, Feature engineering, Instance engineering, Payment transactions fraud
Identifiers
Local EPrints ID: 447430
URI: http://eprints.soton.ac.uk/id/eprint/447430
ISSN: 0167-9236
PURE UUID: 5fb42d82-7931-41b2-8d2d-862636496e57
Catalogue record
Date deposited: 11 Mar 2021 17:34
Last modified: 17 Mar 2024 06:20
Contributors
Author:
Sebastiaan Höppner
Author:
Tim Verdonck