Re: A simple but confusing question

From: Alex Strashny (alex_strashny_at_yahoo.com)
Date: 10/06/04


Date: 6 Oct 2004 00:05:08 -0700

szhao@darkwing.uoregon.edu (Shanyu Zhao) wrote in message news:<96a39245.0409270026.63fd76c@posting.google.com>...

> Here is the naive question:
>
> There are a large number of balls in a bucket, the white color balls
> occupy p and the black balls (1-p). If p is unknown, when you pick 100
> balls from the bucket, find that all of them are white. Then you pick
> the 101st ball, what is the probability that the ball still a white
> one?

The answer can be anything from 0 to 1. (See below.) The answer that I
would consider reasonable is 101/102.

> Is this problem a parameter estimation or hypothesis testing? If we
> use parameter estimation, clearly p=1, which means the probability is
> 100%. But this is not true.

Concluding that p = 1 from 100 draws is not right.

There are a couple of different definitions of probability. The
frequentist definition states: if you pick n balls, and x of them are
white, then the probability of a white ball is the limit of x/n **as n
goes to infinity**. People sometimes forget this last part, and often
it does not matter; here it does. 100 is not infinity, so you cannot
apply the frequentist definition to calculate the probability.

This is an estimation problem. p, the probability, is a parameter that
you want to estimate. A model consists of two parts: a likelihood,
which expresses the interaction between the parameters and the data;
and a prior, which expresses your prior beliefs about the parameters.

For this particular problem, the likelihood is Binomial(x | n,p). If
the prior is Beta(a,b), then the best estimator of p (under quadratic
loss) is the posterior mean, p_hat = (x+a)/(n+a+b).
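
For anyone who wants the missing step, here is a sketch of where that formula comes from, using only the Binomial likelihood and Beta(a,b) prior named above. The notation \pi(p | x) for the posterior density is my own shorthand.

```latex
\begin{align*}
\pi(p \mid x) &\propto p^{x}(1-p)^{n-x} \cdot p^{a-1}(1-p)^{b-1}
               = p^{x+a-1}(1-p)^{n-x+b-1}, \\
p \mid x &\sim \mathrm{Beta}(x+a,\; n-x+b), \\
\hat{p} = E[p \mid x] &= \frac{x+a}{(x+a)+(n-x+b)} = \frac{x+a}{n+a+b}.
\end{align*}
```

The probability that the next ball is white, given the data, is E[p | x], so the posterior mean is also the answer to the original question.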

What should a and b be? In principle, they could be anything. They
express your beliefs about p before you drew any of the balls. That is
why I said that your estimate of p could be anything between 0 and 1.
If a = 1, b -> Infinity, then p_hat -> 0. But is such a prior
reasonable? Probably not.

If you have no prior information about p, then I would say, set a = b
= 1. Beta(1,1) is the Uniform(0,1) distribution. This prior says that,
a priori, p is equally likely to be anything between 0 and 1. Under
this prior, with x = n = 100, p_hat = (100+1)/(100+2) = 101/102.

Some people would say that if you have no prior information, set a = b
= 0.5. (This is the Jeffreys prior.) I would disagree, but in
practice it doesn't matter. Under the Jeffreys prior, p_hat =
100.5/101, which is about 0.995.
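
To check the arithmetic, here is a minimal Python sketch of the posterior-mean formula p_hat = (x+a)/(n+a+b). The function name posterior_mean and the "skeptical" prior Beta(1, 10**6) are my own illustrative choices; the latter just stands in for "b very large".

```python
from fractions import Fraction

def posterior_mean(x, n, a, b):
    """Posterior mean of p after x white balls in n draws, Beta(a, b) prior."""
    return (Fraction(x) + Fraction(a)) / (Fraction(n) + Fraction(a) + Fraction(b))

x, n = 100, 100  # 100 draws, all white

uniform = posterior_mean(x, n, 1, 1)                             # Beta(1, 1)
jeffreys = posterior_mean(x, n, Fraction(1, 2), Fraction(1, 2))  # Beta(1/2, 1/2)
skeptical = posterior_mean(x, n, 1, 10**6)                       # illustrative, b huge

for name, p_hat in [("uniform", uniform), ("jeffreys", jeffreys),
                    ("skeptical", skeptical)]:
    print(f"{name:10s} p_hat = {p_hat} ~ {float(p_hat):.6f}")
```

The two "no information" priors give 101/102 and 201/202 (the same number as 100.5/101), which differ only in the third decimal place, while the extreme prior drags the estimate close to 0.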


