New publication: Confidence intervals for probabilistic network classifiers
- From: michael@xxxxxxxx
- Date: 5 Jun 2005 23:26:30 -0700
Dear colleagues,
A paper entitled "Confidence intervals for probabilistic network
classifiers" by M. Egmont-Petersen et al. has been published in the
journal Computational Statistics and Data Analysis (Elsevier), 2005.
URL (click on 'Electronic reprint'):
http://www.egmont-petersen.nl/Abstract-CSDA2005.htm
Abstract:
Probabilistic networks (Bayesian networks) are suited as statistical
pattern classifiers when the feature variables are discrete. It is
argued that their white-box character makes them transparent, a
requirement in various applications such as credit scoring. In
addition, the exact error rate of a probabilistic network classifier
can be computed without a dataset. First, the exact error rate for
probabilistic network classifiers is specified. Secondly, the exact
sampling distribution for the conditional probability estimates in a
probabilistic network classifier is derived. Each conditional
probability is distributed according to the bivariate binomial
distribution. Subsequently, an approach for computing the sampling
distribution and hence confidence intervals for the posterior
probability in a probabilistic network classifier is derived. Our
approach results in parametric bootstrap confidence intervals.
Experiments with general probabilistic network classifiers, the Naive
Bayes classifier and tree augmented Naive Bayes classifiers (TANs) show
that our approximation performs well. Simulations performed with the
Alarm network also show good results for large training sets. The amount
of computation required is exponential in the number of feature
variables. For medium and large-scale classification problems, our
approach is well suited for quick simulations. A running example from
the domain of credit scoring illustrates how to actually compute the
sampling distribution of the posterior probability.
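For readers who want a feel for the idea, here is a minimal sketch of a
parametric bootstrap interval for the posterior probability of a small
Naive Bayes classifier. It is not the derivation from the paper, which
works with the exact sampling distribution; it uses plain Monte Carlo
resampling instead. The two-feature "credit scoring" model, all of its
numbers, the add-one smoothing and every function name are assumptions
made up for illustration.

```python
import random

# Hypothetical fitted Naive Bayes model (illustrative numbers, not from the paper).
PRIOR_GOOD = 0.7                 # P(class = good)
P_FEATURE = {                    # P(feature_j = 1 | class)
    "good": [0.8, 0.3],
    "bad": [0.4, 0.6],
}


def posterior_good(prior, p_feat, x):
    """P(class = good | x) for binary features under the Naive Bayes assumption."""
    def likelihood(probs):
        result = 1.0
        for p, xi in zip(probs, x):
            result *= p if xi == 1 else 1.0 - p
        return result

    num = prior * likelihood(p_feat["good"])
    den = num + (1.0 - prior) * likelihood(p_feat["bad"])
    return num / den


def sample_dataset(prior, p_feat, n, rng):
    """Draw n labelled cases from the fitted model (the 'parametric' step)."""
    data = []
    for _ in range(n):
        label = "good" if rng.random() < prior else "bad"
        x = [1 if rng.random() < p else 0 for p in p_feat[label]]
        data.append((label, x))
    return data


def fit(data, n_features):
    """Re-estimate the parameters; add-one smoothing avoids zero counts (an assumption)."""
    counts = {"good": 0, "bad": 0}
    ones = {"good": [0] * n_features, "bad": [0] * n_features}
    for label, x in data:
        counts[label] += 1
        for j, xj in enumerate(x):
            ones[label][j] += xj
    prior = (counts["good"] + 1) / (len(data) + 2)
    p_feat = {
        c: [(ones[c][j] + 1) / (counts[c] + 2) for j in range(n_features)]
        for c in counts
    }
    return prior, p_feat


def bootstrap_interval(x, n_train, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval for P(good | x) over models refitted to resampled data."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        data = sample_dataset(PRIOR_GOOD, P_FEATURE, n_train, rng)
        prior, p_feat = fit(data, len(x))
        draws.append(posterior_good(prior, p_feat, x))
    draws.sort()
    lower = draws[int((alpha / 2) * n_boot)]
    upper = draws[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    query = [1, 0]
    print("point estimate:", posterior_good(PRIOR_GOOD, P_FEATURE, query))
    print("95% interval  :", bootstrap_interval(query, n_train=200))
```

Calling bootstrap_interval with a larger n_train should give a narrower
interval, in line with the abstract's remark that results improve for
large training sets.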
Reference:
M. Egmont-Petersen, A. Feelders, B. Baesens. "Confidence intervals for
probabilistic network classifiers," Computational Statistics and Data
Analysis, Vol. 49, No. 4, pp. 998-1019, 2005.
Cheers!
Michael Egmont-Petersen