Isolation-based conditional anomaly detection on mixed-attribute data to uncover workers’ compensation fraud
The development of new data analytical methods remains a crucial factor in the fight against insurance fraud. Methods rooted in the research field of anomaly detection are considered promising candidates for this purpose. Commonly, a fraud data set contains both numeric and nominal attributes, and because nominal attributes are easy to express, they often encode valuable expert knowledge. For this reason, an anomaly detection method should be able to handle a mixture of data types and return an anomaly score that is meaningful in the context of the business application. We propose the iForestCAD approach, which computes conditional anomaly scores that are useful for fraud detection. More specifically, anomaly detection is performed conditionally on well-defined data partitions that are created on the basis of selected numeric attributes and distinct combinations of values of selected nominal attributes. In this way, the resulting anomaly scores are computed with respect to a reference group of interest and therefore represent a meaningful score for domain experts. Because anomaly detection is performed conditionally, this approach can detect anomalies that would remain undiscovered by unconditional anomaly detection. Moreover, we present a case study in which we demonstrate the usefulness of the proposed approach on real-world workers’ compensation claims received from a large European insurance organization. As a result, the iForestCAD approach is well accepted by domain experts for its effective detection of fraudulent claims.
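The conditional scoring idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' iForestCAD implementation: it assumes that partitions are formed only from distinct combinations of nominal values, that a simplified isolation forest is then fitted on the numeric attributes within each partition, and that groups with fewer than two records are left unscored. All function names, attribute names (`sector`, `cost`) and the toy claims are hypothetical.

```python
import math
import random
from collections import defaultdict

def c_factor(n):
    """Average path length of an unsuccessful binary search tree lookup on n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    """Split rows recursively on a random attribute at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    # Only attributes that still vary within this node can be split.
    dims = [d for d in range(len(rows[0]))
            if min(r[d] for r in rows) < max(r[d] for r in rows)]
    if not dims:
        return ("leaf", len(rows))
    dim = rng.choice(dims)
    split = rng.uniform(min(r[dim] for r in rows), max(r[dim] for r in rows))
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, adjusted for the unsplit leaf size."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)

def isolation_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Isolation-forest scores in (0, 1]; values near 1 indicate anomalies."""
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = n_trees * c_factor(psi)
    return [2.0 ** (-sum(path_length(t, x) for t in trees) / norm) for x in rows]

def conditional_scores(records, nominal_keys, numeric_keys, **kwargs):
    """Score each record only against records sharing its nominal values."""
    groups = defaultdict(list)
    for i, rec in enumerate(records):
        groups[tuple(rec[k] for k in nominal_keys)].append(i)
    scores = [None] * len(records)
    for idx in groups.values():
        if len(idx) < 2:  # assumption: too small a reference group to score
            continue
        rows = [tuple(float(records[i][k]) for k in numeric_keys) for i in idx]
        for i, s in zip(idx, isolation_scores(rows, **kwargs)):
            scores[i] = s
    return scores

# Toy claims: office claims cost about 1,000, construction claims about 10,000.
claims = [{"sector": "office", "cost": 900 + 7 * i} for i in range(30)]
claims += [{"sector": "construction", "cost": 9000 + 70 * i} for i in range(30)]
# An office claim whose cost is ordinary overall but unusual for its sector.
claims.append({"sector": "office", "cost": 10000})

cond = conditional_scores(claims, ["sector"], ["cost"])
uncond = conditional_scores(claims, [], ["cost"])  # one group: unconditional
print(round(cond[-1], 2), round(uncond[-1], 2))
```

In this toy data the last claim blends in with the construction claims when all records are scored together, but it stands out once it is compared only with other office claims, which is the effect the abstract attributes to conditional detection.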
Keywords: Conditional anomaly detection, Fraud detection, Isolation forest, Workers’ compensation insurance fraud
Pages: 13-26
Stripling, Eugen
Baesens, Bart
Chizi, Barak
vanden Broucke, Seppe
July 2018
Stripling, Eugen, Baesens, Bart, Chizi, Barak and vanden Broucke, Seppe (2018) Isolation-based conditional anomaly detection on mixed-attribute data to uncover workers’ compensation fraud. Decision Support Systems, 111, 13-26. (doi:10.1016/j.dss.2018.04.001).
Text: Stripling2018 - Accepted Manuscript
More information
Accepted/In Press date: 17 April 2018
e-pub ahead of print date: 22 April 2018
Published date: July 2018
Identifiers
Local EPrints ID: 422417
URI: http://eprints.soton.ac.uk/id/eprint/422417
ISSN: 0167-9236
PURE UUID: fc4a5e43-994d-49c1-b732-bafbfc0949d9
Catalogue record
Date deposited: 23 Jul 2018 16:31
Last modified: 16 Mar 2024 06:37
Contributors
Author: Eugen Stripling
Author: Bart Baesens
Author: Barak Chizi
Author: Seppe vanden Broucke