The University of Southampton
University of Southampton Institutional Repository

Data engineering for fraud detection

Baesens, Bart, Höppner, Sebastiaan and Verdonck, Tim (2021) Data engineering for fraud detection. Decision Support Systems, 150, [113492]. (doi:10.1016/j.dss.2021.113492).

Record type: Article

Abstract

Financial institutions increasingly rely on data-driven methods to develop fraud detection systems that automatically detect and block fraudulent transactions. From a machine learning perspective, detecting suspicious transactions is a binary classification problem, so many techniques can be applied. Interpretability is, however, of utmost importance, both for management to have confidence in the model and for designing fraud prevention strategies. Moreover, models that let fraud experts understand why a case is flagged as suspicious greatly facilitate the investigation of suspicious transactions. We therefore propose several data engineering techniques that improve the performance of an analytical model while retaining its interpretability. Our data engineering process is decomposed into several feature engineering and instance engineering steps. We illustrate the performance improvement that these steps bring to popular analytical models on a real payment transactions data set.

Text
Data_Engineering_for_Payment_Transactions_Fraud-12 - Accepted Manuscript
Restricted to Repository staff only until 12 January 2023.

More information

Accepted/In Press date: 7 January 2021
e-pub ahead of print date: 12 January 2021
Additional Information: Funding Information: The authors gratefully acknowledge the financial support from the BNP Paribas Fortis Research Chair in Fraud Analytics at KU Leuven and the Internal Funds KU Leuven under grant C16/15/068. Publisher Copyright: © 2021 Elsevier B.V. All rights reserved.
Keywords: Cost-based model evaluation, Decision analysis, Feature engineering, Instance engineering, Payment transactions fraud

Identifiers

Local EPrints ID: 447430
URI: http://eprints.soton.ac.uk/id/eprint/447430
ISSN: 0167-9236
PURE UUID: 5fb42d82-7931-41b2-8d2d-862636496e57
ORCID for Bart Baesens: orcid.org/0000-0002-5831-5668

Catalogue record

Date deposited: 11 Mar 2021 17:34
Last modified: 27 Apr 2022 01:47

Contributors

Author: Bart Baesens
Author: Sebastiaan Höppner
Author: Tim Verdonck