The University of Southampton
University of Southampton Institutional Repository

Computational efficient approximations of the concordance probability in a big data setting


Van Oirbeek, Robin, Ponnet, Jolien, Baesens, Bart and Verdonck, Tim (2023) Computational efficient approximations of the concordance probability in a big data setting. Big Data. (doi:10.1089/big.2022.0107).

Record type: Article

Abstract

Performance measurement is an essential task once a statistical model is created. The area under the receiver operating characteristic curve (AUC) is the most popular measure for evaluating the quality of a binary classifier. In this case, the AUC is equal to the concordance probability, a frequently used measure of the discriminatory power of a model. Unlike the AUC, the concordance probability can also be extended to the situation with a continuous response variable. Because of the size of present-day data sets, determining this discriminatory measure requires a large number of costly computations and is hence very time consuming, particularly in the case of a continuous response variable. Therefore, we propose two estimation methods that calculate the concordance probability quickly and accurately and that can be applied to both the discrete and the continuous setting. Extensive simulation studies show the excellent performance and fast computing times of both estimators. Finally, experiments on two real-life data sets confirm the conclusions of the simulations.
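
To make the quantity in the abstract concrete, the following is a minimal Python sketch of the concordance probability, assuming the common definition: among pairs of observations with different responses, the share in which the prediction ranks the pair in the same order as the response. The function names, the half credit given to tied predictions, and the random pair sampling are illustrative assumptions. They are not the two estimators proposed in the article, which this record does not describe.

```python
import random


def _pair_score(y_i, y_j, p_i, p_j):
    """Return None for a non-comparable pair, else 1.0, 0.5 or 0.0."""
    if y_i == y_j:
        return None  # tied responses: pair is not comparable
    d = (p_i - p_j) * (y_i - y_j)
    if d > 0:
        return 1.0  # prediction and response agree on the ordering
    if d == 0:
        return 0.5  # tied predictions get half credit (assumed convention)
    return 0.0


def concordance_probability(y, pred):
    """Exact value by enumerating all pairs; cost grows as O(n^2)."""
    total, comparable = 0.0, 0
    n = len(y)
    for i in range(n):
        for j in range(i + 1, n):
            s = _pair_score(y[i], y[j], pred[i], pred[j])
            if s is not None:
                total += s
                comparable += 1
    return total / comparable if comparable else float("nan")


def sampled_concordance(y, pred, n_pairs, seed=0):
    """Hypothetical shortcut: estimate the value from randomly drawn pairs."""
    rng = random.Random(seed)
    total, comparable = 0.0, 0
    n = len(y)
    for _ in range(n_pairs):
        i, j = rng.randrange(n), rng.randrange(n)
        s = _pair_score(y[i], y[j], pred[i], pred[j])
        if s is not None:
            total += s
            comparable += 1
    return total / comparable if comparable else float("nan")


# Binary response: the result coincides with the AUC of the classifier.
auc_like = concordance_probability([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Continuous response: the same function applies without modification.
continuous = concordance_probability([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
```

The exhaustive version visits every pair, which is the quadratic cost the abstract refers to for large data sets; the sampled version only illustrates, under the stated assumptions, why an approximation can be much cheaper.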

Text
ConcProb-3 - Accepted Manuscript

More information

Accepted/In Press date: 24 April 2023
e-pub ahead of print date: 7 June 2023

Identifiers

Local EPrints ID: 477599
URI: http://eprints.soton.ac.uk/id/eprint/477599
ISSN: 2167-6461
PURE UUID: b52e384a-bdc2-42c2-b299-eb453ad84bf8
ORCID for Bart Baesens: orcid.org/0000-0002-5831-5668

Catalogue record

Date deposited: 09 Jun 2023 16:33
Last modified: 17 Mar 2024 02:59

Contributors

Author: Robin Van Oirbeek
Author: Jolien Ponnet
Author: Bart Baesens
Author: Tim Verdonck
