Re: A simple but confusing question

From: Ian Jermyn (Ian.Jermyn_at_sophia.inria.fr)
Date: 10/06/04


Date: Wed, 6 Oct 2004 10:37:33 +0200

This is not an estimation problem. No loss function need be introduced to
solve it, as my derivation in a previous post makes clear. The proportion of
white balls in the bucket is a nuisance parameter, and should be integrated
out of the expression for the probability. To describe the problem as an
estimation problem risks giving a false impression of how to proceed in
problems more complex than this one. In this particular case, because

Pr(w_{n + 1} | W_{n}, n, N)

= \int dp Pr(w_{n + 1} | p, W_{n}, n, N) Pr(p | W_{n}, n, N)

= \int dp Pr(w_{n + 1} | p) Pr(p | W_{n}, n, N)

= \int dp p Pr(p | W_{n}, n, N)

= <p>

under the posterior distribution for p, i.e. because the likelihood is
linear in the parameter p, the MMSE estimate for p (the posterior mean)
happens to agree with the integration over the nuisance parameter. Other
estimates, the posterior mode for example, do not so agree, however, and
this is the situation in general (although in certain other simple cases
similar coincidences do occur). The elimination of nuisance parameters by
integration is in general not the same as estimating the parameter and
substituting the estimate into the likelihood.
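
A minimal sketch of this distinction in Python, assuming a uniform Beta(1,1) prior on p and 100 white balls in 100 draws. The function names are illustrative only, and the "next two balls both white" case is an added example, not one from the thread. It is there because that probability is quadratic in p, so plugging in an estimate no longer reproduces the integral.

```python
from fractions import Fraction

def posterior_mean(x, n, a=1, b=1):
    """Mean of the Beta(a + x, b + n - x) posterior for p (integer a, b)."""
    return Fraction(a + x, a + b + n)

def posterior_second_moment(x, n, a=1, b=1):
    """E[p^2] under the same posterior."""
    alpha, total = a + x, a + b + n
    return Fraction(alpha * (alpha + 1), total * (total + 1))

def integrate_out_p(x, n, power=1, steps=100_000):
    """Midpoint-rule check of E[p^power] under a uniform prior.

    The posterior is proportional to p^x (1 - p)^(n - x); the
    normalising constant cancels in the ratio num / den.
    """
    h = 1.0 / steps
    num = den = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        weight = p**x * (1.0 - p) ** (n - x)
        num += p**power * weight
        den += weight
    return num / den

x = n = 100

# Next ball white: linear in p, so the integral equals the posterior mean.
next_white = posterior_mean(x, n)

# Plugging in the posterior mode (x/n under a uniform prior) gives
# a different answer from the integral.
next_white_mode_plug_in = Fraction(x, n)

# Next two balls white: quadratic in p, so even the posterior mean,
# once squared, disagrees with integrating p out.
two_white = posterior_second_moment(x, n)
two_white_mean_plug_in = next_white**2

print(next_white, next_white_mode_plug_in)
print(two_white, two_white_mean_plug_in)
print(integrate_out_p(x, n, power=1), integrate_out_p(x, n, power=2))
```

The exact fractions come from the standard Beta moments. The numerical integral is only a sanity check that "integrating out p" gives the same numbers.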

Ian.

-- 
--------------------------------------------------
Ian Jermyn
ianjermyn@wanadoo.fr
"Alex Strashny" <alex_strashny@yahoo.com> wrote in message
news:e3b3c584.0410052305.1fb14d91@posting.google.com...
> szhao@darkwing.uoregon.edu (Shanyu Zhao) wrote in message
> news:<96a39245.0409270026.63fd76c@posting.google.com>...
>
> > Here is the naive question:
> >
> > There are a large number of balls in a bucket. White balls make up
> > a fraction p and black balls the remaining (1-p). Suppose p is
> > unknown. You pick 100 balls from the bucket and find that all of
> > them are white. You then pick the 101st ball. What is the
> > probability that this ball is also white?
>
> The answer can be anything from 0 to 1. (See below.) The answer that I
> would consider reasonable is 101/102.
>
> > Is this a parameter estimation problem or a hypothesis testing
> > problem? If we use parameter estimation, clearly p=1, which means
> > the probability is 100%. But this is not true.
>
> This is not right.
>
> There are a couple of different definitions of probability. The
> frequentist definition states: if you pick n balls, and x of them are
> white, then the probability of a white ball is p = (x/n) **as n goes
> to infinity**. People sometimes forget this last part, and it often
> does not matter; it matters here. 100 is not equal to infinity, so you
> cannot apply the frequentist definition to calculate the probability.
>
> This is an estimation problem. p, the probability, is a parameter that
> you want to estimate. A model consists of two parts: a likelihood,
> which expresses the interaction between the parameters and the data;
> and a prior, which expresses your prior beliefs about the parameters.
>
> For this particular problem, the likelihood is Binomial(x | n,p). If
> the prior is Beta(a,b), then the best estimator of p (under quadratic
> loss) is p_hat = (x+a)/(n+a+b).
>
> What should a and b be? In principle, they could be anything. They
> express your beliefs about p before you drew any of the balls. That is
> why I said that your estimate of p could be anything between 0 and 1.
> If a = 1, b -> Infinity, then p_hat -> 0. But is such a prior
> reasonable? Probably not.
>
> If you have no prior information about p, then I would say, set a = b
> = 1. Beta(1,1) is the Uniform(0,1). This prior says that, a priori, p
> is equally likely to be anything between 0 and 1. Under this prior,
> p_hat = 101/102.
>
> Some people would say that if you have no prior information, set a = b
> = 0.5. (This is the Jeffreys prior.) I would disagree, but in
> practice it doesn't matter. Under the Jeffreys prior, p_hat = 100.5 /
> 101.
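
The dependence of p_hat on the prior, as described in the quoted post, can be checked with a short Python sketch. The function name `p_hat` and the choice b = 10**6 as a stand-in for "b -> Infinity" are assumptions made for illustration. Exact fractions are used so the results can be compared directly with the figures quoted above.

```python
from fractions import Fraction

def p_hat(x, n, a, b):
    """Posterior mean of p for Binomial(x | n, p) data and a Beta(a, b) prior.

    a and b may be ints or strings such as "1/2"; they are converted
    to exact fractions before the formula (x + a) / (n + a + b) is applied.
    """
    a, b = Fraction(a), Fraction(b)
    return (x + a) / (n + a + b)

# 100 white balls in 100 draws.
uniform = p_hat(100, 100, 1, 1)            # Beta(1, 1), i.e. Uniform(0, 1)
jeffreys = p_hat(100, 100, "1/2", "1/2")   # Beta(1/2, 1/2)
skeptical = p_hat(100, 100, 1, 10**6)      # a = 1, very large b

print(uniform, jeffreys, float(skeptical))
```

The uniform and Jeffreys results differ only in the third decimal place, while the very large b pulls the estimate toward 0 despite the data.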

