The University of Southampton
University of Southampton Institutional Repository

Confidence intervals for probabilistic network classifiers

Egmont-Petersen, M., Feelders, A. and Baesens, B. (2005) Confidence intervals for probabilistic network classifiers. Computational Statistics and Data Analysis, 49 (4), 998-1019. (doi:10.1016/j.csda.2004.06.018).

Record type: Article

Abstract

Probabilistic networks (Bayesian networks) are well suited as statistical pattern classifiers when the feature variables are discrete. It is argued that their white-box character makes them transparent, a requirement in applications such as credit scoring. In addition, the exact error rate of a probabilistic network classifier can be computed without a dataset. First, the exact error rate for probabilistic network classifiers is specified. Second, the exact sampling distribution for the conditional probability estimates in a probabilistic network classifier is derived. Each conditional probability is distributed according to the bivariate binomial distribution. Subsequently, an approach for computing the sampling distribution, and hence confidence intervals, for the posterior probability in a probabilistic network classifier is derived. Our approach results in parametric bootstrap confidence intervals. Experiments with general probabilistic network classifiers, the Naive Bayes classifier and tree augmented Naive Bayes classifiers (TANs) show that our approximation performs well. Simulations performed with the Alarm network also show good results for large training sets. The amount of computation required is exponential in the number of feature variables. For medium and large-scale classification problems, our approach is well suited for quick simulations. A running example from the domain of credit scoring illustrates how to compute the sampling distribution of the posterior probability.
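
The two ideas in the abstract, an error rate computed by enumeration rather than from a test set and a parametric bootstrap interval for a posterior probability, can be illustrated with a minimal Python sketch. It is not the authors' derivation: it assumes a hypothetical Naive Bayes model with a binary class and two binary features, uses add-one smoothing when re-estimating parameters, and approximates the sampling distribution by Monte Carlo simulation with percentile limits. All names and parameter values (`PRIOR`, `COND`, `bootstrap_ci`, and so on) are made up for illustration.

```python
import itertools
import random

# Hypothetical model parameters (assumed for illustration, not from the paper).
PRIOR = {0: 0.7, 1: 0.3}      # P(C = c)
COND = [                      # COND[i][c] = P(X_i = 1 | C = c)
    {0: 0.2, 1: 0.6},
    {0: 0.5, 1: 0.9},
]

def joint(prior, cond, c, x):
    """P(C = c, X = x) under the Naive Bayes factorisation."""
    p = prior[c]
    for i, xi in enumerate(x):
        p *= cond[i][c] if xi == 1 else 1.0 - cond[i][c]
    return p

def posterior(prior, cond, x, c=1):
    """P(C = c | X = x) by Bayes' rule."""
    total = sum(joint(prior, cond, k, x) for k in prior)
    return joint(prior, cond, c, x) / total

def exact_error_rate(true_prior, true_cond, est_prior, est_cond):
    """Error rate of the classifier built from the estimated parameters,
    obtained by enumerating every feature vector under the true model.
    The enumeration is exponential in the number of features."""
    err = 0.0
    for x in itertools.product((0, 1), repeat=len(true_cond)):
        predicted = max(est_prior, key=lambda k: joint(est_prior, est_cond, k, x))
        err += sum(joint(true_prior, true_cond, k, x)
                   for k in true_prior if k != predicted)
    return err

def sample(prior, cond, n, rng):
    """Draw n (class, features) pairs from the model."""
    data = []
    for _ in range(n):
        c = 1 if rng.random() < prior[1] else 0
        x = tuple(1 if rng.random() < cond[i][c] else 0 for i in range(len(cond)))
        data.append((c, x))
    return data

def fit(data, n_features):
    """Re-estimate parameters; add-one smoothing is an assumption made here
    to avoid zero probabilities in small samples."""
    counts = {0: 0, 1: 0}
    ones = [{0: 0, 1: 0} for _ in range(n_features)]
    for c, x in data:
        counts[c] += 1
        for i, xi in enumerate(x):
            ones[i][c] += xi
    n = len(data)
    prior = {c: (counts[c] + 1) / (n + 2) for c in counts}
    cond = [{c: (ones[i][c] + 1) / (counts[c] + 2) for c in counts}
            for i in range(n_features)]
    return prior, cond

def bootstrap_ci(prior, cond, x, n_train, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval for P(C = 1 | x): simulate training sets of size
    n_train from the given model, refit, and recompute the posterior."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        p, q = fit(sample(prior, cond, n_train, rng), len(cond))
        stats.append(posterior(p, q, x))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

if __name__ == "__main__":
    print(exact_error_rate(PRIOR, COND, PRIOR, COND))
    print(bootstrap_ci(PRIOR, COND, x=(1, 1), n_train=500))
```

When the classifier uses the true parameters, `exact_error_rate` returns the Bayes error of the hypothetical model, which works out to 0.208 for the values above. In practice the model passed to `bootstrap_ci` would be the one fitted to the observed training data, which is what makes the bootstrap parametric.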

This record has no associated files available for download.

More information

Published date: 2005
Organisations: Management

Identifiers

Local EPrints ID: 36735
URI: http://eprints.soton.ac.uk/id/eprint/36735
ISSN: 0167-9473
PURE UUID: c8ce3fdd-e120-4f3f-a76d-5f7537e15ed9
ORCID for B. Baesens: orcid.org/0000-0002-5831-5668

Catalogue record

Date deposited: 11 Jul 2006
Last modified: 16 Mar 2024 03:39

Contributors

Author: M. Egmont-Petersen
Author: A. Feelders
Author: B. Baesens
