University of Southampton Institutional Repository

Isolation-based conditional anomaly detection on mixed-attribute data to uncover workers’ compensation fraud

Stripling, Eugen, Baesens, Bart, Chizi, Barak and vanden Broucke, Seppe (2018) Isolation-based conditional anomaly detection on mixed-attribute data to uncover workers’ compensation fraud. Decision Support Systems, 111, 13-26. (doi:10.1016/j.dss.2018.04.001).

Record type: Article

Abstract

The development of new data analytical methods remains a crucial factor in the fight against insurance fraud. Methods rooted in the research field of anomaly detection are considered promising candidates for this purpose. A fraud data set commonly contains both numeric and nominal attributes, and the latter, being easy to express, often encode valuable expert knowledge. For this reason, an anomaly detection method should be able to handle a mixture of data types and return an anomaly score that is meaningful in the context of the business application. We propose the iForestCAD approach, which computes conditional anomaly scores useful for fraud detection. More specifically, anomaly detection is performed conditionally on well-defined data partitions that are created on the basis of selected numeric attributes and distinct combinations of values of selected nominal attributes. In this way, the resulting anomaly scores are computed with respect to a reference group of interest and are therefore meaningful to domain experts. Because anomaly detection is performed conditionally, the approach can detect anomalies that would remain undiscovered in unconditional anomaly detection. Moreover, we present a case study demonstrating the usefulness of the proposed approach on real-world workers’ compensation claims received from a large European insurance organization. The iForestCAD approach was well received by domain experts for its effective detection of fraudulent claims.
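
The general idea of conditional, isolation-based scoring described in the abstract can be illustrated with a minimal sketch. This is not the authors' iForestCAD implementation: it is a simplified isolation forest written from scratch, it partitions the data only by combinations of nominal values, and every name in it (`conditional_scores`, the `sector`, `amount` and `days` fields, the toy claims) is a hypothetical chosen for illustration.

```python
import math
import random
from collections import defaultdict

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Grow one isolation tree using random axis-parallel splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # simplification: stop instead of trying another attribute
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x, depth=0):
    """Depth at which x lands, adjusted for the unsplit points left in the leaf."""
    if "size" in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if x[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], x, depth + 1)


def forest_scores(points, n_trees, sample_size, rng):
    """Isolation-forest anomaly scores in (0, 1]; higher means more anomalous."""
    n = min(sample_size, len(points))
    if n < 2:  # assumption: a group this small cannot be scored
        return [None] * len(points)
    max_depth = math.ceil(math.log2(n))
    trees = [build_tree(rng.sample(points, n), 0, max_depth, rng) for _ in range(n_trees)]
    scores = []
    for x in points:
        mean_path = sum(path_length(t, x) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_path / c_factor(n)))
    return scores


def conditional_scores(rows, nominal_keys, numeric_keys, n_trees=100, sample_size=256, seed=0):
    """Score each row only against rows sharing its combination of nominal values."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[k] for k in nominal_keys)].append(i)
    scores = [None] * len(rows)
    for indices in groups.values():
        points = [tuple(float(rows[i][k]) for k in numeric_keys) for i in indices]
        for i, s in zip(indices, forest_scores(points, n_trees, sample_size, rng)):
            scores[i] = s
    return scores


# Toy claims (made up): the last one looks like a construction claim but is filed as office.
rows = [{"sector": "office", "amount": 90 + (i * 7) % 21, "days": 1 + i % 5} for i in range(50)]
rows += [{"sector": "construction", "amount": 900 + (i * 37) % 201, "days": 10 + i % 11} for i in range(50)]
rows.append({"sector": "office", "amount": 1000, "days": 15})
scores = conditional_scores(rows, ["sector"], ["amount", "days"])
```

In this toy data the last claim is scored against the other office claims only, so it receives the highest score within its reference group. Scored against all claims pooled together, it would sit among the construction claims and would probably not stand out, which is the kind of case the abstract says conditional detection is meant to catch.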

Text: Stripling2018 - Accepted Manuscript (2MB)

More information

Accepted/In Press date: 17 April 2018
e-pub ahead of print date: 22 April 2018
Published date: July 2018
Keywords: Conditional anomaly detection, Fraud detection, Isolation forest, Workers’ compensation insurance fraud

Identifiers

Local EPrints ID: 422417
URI: http://eprints.soton.ac.uk/id/eprint/422417
ISSN: 0167-9236
PURE UUID: fc4a5e43-994d-49c1-b732-bafbfc0949d9
ORCID for Bart Baesens: orcid.org/0000-0002-5831-5668

Catalogue record

Date deposited: 23 Jul 2018 16:31
Last modified: 16 Mar 2024 06:37

Contributors

Author: Eugen Stripling
Author: Bart Baesens
Author: Barak Chizi
Author: Seppe vanden Broucke
