The University of Southampton
University of Southampton Institutional Repository

Predict-then-optimize or predict-and-optimize? An empirical evaluation of cost-sensitive learning strategies

Vanderschueren, Toon, Verdonck, Tim, Baesens, Bart and Verbeke, Wouter (2022) Predict-then-optimize or predict-and-optimize? An empirical evaluation of cost-sensitive learning strategies. Information Sciences, 594, 400-415. (doi:10.1016/j.ins.2022.02.021).

Record type: Article

Abstract

Predictive models are increasingly being used to optimize decision-making and minimize costs. A conventional approach is predict-then-optimize: first, a predictive model is built; then, this model is used to optimize decision-making. A drawback of this approach, however, is that it only incorporates costs in the second stage. Conversely, the predict-and-optimize approach proposes learning a predictive model by directly minimizing the cost of the downstream decision-making task. This is achieved by using a task-specific loss function incorporating the costs of different outcomes in the first stage, with the eventual aim of obtaining more cost-effective decisions in the second stage. This work compares both approaches in the context of cost-sensitive classification. Conceptually, we use the two-stage framework to categorize existing cost-sensitive learning methodologies by differentiating between methodologies for cost-sensitive model training and decision-making. Empirically, we compare and evaluate both approaches using different cost-sensitive training and decision-making methodologies, as well as both class-dependent and instance-dependent cost-sensitive methods. This is achieved using real-world data from a range of application areas and a combination of cost-sensitive and cost-insensitive performance measures. The key finding is that the decision-making strategy is generally found to be more effective than training with a task-specific loss or their combination.
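
The two stages contrasted in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `min_expected_cost_decision` and `cost_weighted_log_loss` are hypothetical, the example costs are invented for illustration, and the sketch assumes that correct classifications incur no cost. The first function shows cost-sensitive decision-making (the second stage), the second shows one possible cost-sensitive training loss (the first stage), here with instance-dependent costs.

```python
import math


def min_expected_cost_decision(p_pos, cost_fp, cost_fn):
    """Second stage: choose the label with the lower expected cost.

    Assumes correct decisions cost nothing (illustrative simplification).
    Equivalent to thresholding p_pos at cost_fp / (cost_fp + cost_fn).
    """
    expected_cost_if_positive = (1.0 - p_pos) * cost_fp
    expected_cost_if_negative = p_pos * cost_fn
    return 1 if expected_cost_if_positive < expected_cost_if_negative else 0


def cost_weighted_log_loss(y_true, p_pos, cost_fp, cost_fn, eps=1e-12):
    """First stage: cross-entropy with each term weighted by the cost of
    the corresponding error (false negative for y=1, false positive for y=0).
    """
    total = 0.0
    for y, p, c_fp, c_fn in zip(y_true, p_pos, cost_fp, cost_fn):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total -= y * c_fn * math.log(p) + (1 - y) * c_fp * math.log(1.0 - p)
    return total / len(y_true)


# Hypothetical predicted probabilities and instance-dependent costs.
probs = [0.10, 0.30, 0.60]
fp_costs = [1.0, 1.0, 1.0]
fn_costs = [5.0, 20.0, 1.0]

decisions = [
    min_expected_cost_decision(p, c_fp, c_fn)
    for p, c_fp, c_fn in zip(probs, fp_costs, fn_costs)
]
print(decisions)  # [0, 1, 1]
```

With a fixed 0.5 threshold the same probabilities would give `[0, 0, 1]`; the second instance flips to positive because a missed positive there is assumed to be twenty times as costly as a false alarm.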

Text: A_49_S_Vanderschueren___Predict_then_optimize_or_predict_and_optimize (1) - Accepted Manuscript (3MB)

More information

Accepted/In Press date: 10 February 2022
e-pub ahead of print date: 22 February 2022
Published date: 1 March 2022
Additional Information: Funding Information: This work was supported by the BNP Paribas Fortis Chair in Fraud Analytics and FWO research project G015020N. The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation – Flanders (FWO) and the Flemish Government – department EWI.
Keywords: Classification, Cost-sensitive learning, Instance-dependent costs, Supervised learning

Identifiers

Local EPrints ID: 475356
URI: http://eprints.soton.ac.uk/id/eprint/475356
ISSN: 0020-0255
PURE UUID: e902b9eb-0130-486c-90fd-e8fd90e9a5a1
ORCID for Bart Baesens: orcid.org/0000-0002-5831-5668

Catalogue record

Date deposited: 16 Mar 2023 17:32
Last modified: 18 Mar 2024 05:29

Contributors

Author: Toon Vanderschueren
Author: Tim Verdonck
Author: Bart Baesens
Author: Wouter Verbeke
