Re: A simple but confusing question

From: Ian Jermyn (Ian.Jermyn_at_sophia.inria.fr)
Date: 10/01/04


Date: Fri, 1 Oct 2004 13:01:30 +0200

1) In practice, the prior that you use, provided it is relatively flat,
affects the probability of the next ball being white only for small n. Thus
the distinction between them is really irrelevant.

2) It is important to realise that the prior concerns your state of
knowledge concerning the proportion of white balls in the bucket; it has
nothing to do with an opponent or with God playing darts. There is no way to
avoid this prior knowledge. Confidence intervals, which anyway are full of
problems, are based implicitly on a uniform prior, although they do not
recognize its existence. Thus they blindly take this 'first step' without
even thinking about it.

3) It is easy to construct a parametrization invariant prior in the
continuous case, but it is more complicated, and I did not want to get into
the details. In any case, see point (1).

4) It is true that construction of ignorance priors is the subject of
ongoing research, and should not be regarded as a settled matter. This does
not obviate the need for priors however.

5) Probabilities cannot be empirically verified: they do not represent a
statement about the physical world but about one's knowledge of it. Long-run
frequencies can be predicted and tested. Deviations from these predictions
are indications that something is wrong with your model, and are therefore
very useful. They do not indicate that something was wrong with your
probabilities, which represent what you know given your model.

6) There are no such things as 'predictive probability' and 'underlying
probability'. There is the probability and there is the proportion of white
balls in the bucket. One is empirically verifiable, the other is not.
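Point (1) can be checked numerically. The sketch below is an illustration only: it assumes the prior on the proportion of white balls belongs to the Beta(a, b) family (an assumption made for convenience, not something the argument above requires), in which case the probability that the next ball is white after seeing w white and k black balls is (w + a) / (w + k + a + b). The function name `predictive_white` and the three priors chosen are hypothetical.

```python
from fractions import Fraction


def predictive_white(n_white, n_black, a, b):
    """P(next ball is white) under an assumed Beta(a, b) prior on the white proportion."""
    return Fraction(n_white + a, n_white + n_black + a + b)


# Three fairly flat priors, chosen purely for illustration.
priors = {
    "uniform Beta(1,1)": (1, 1),
    "Jeffreys Beta(1/2,1/2)": (Fraction(1, 2), Fraction(1, 2)),
    "Beta(2,2)": (2, 2),
}

# All n draws so far were white, as in the original question.
for n in (1, 10, 100, 1000):
    values = {name: predictive_white(n, 0, a, b) for name, (a, b) in priors.items()}
    spread = max(values.values()) - min(values.values())
    summary = ", ".join(f"{name}: {float(v):.4f}" for name, v in values.items())
    print(f"n={n:5d}  {summary}  spread={float(spread):.4f}")
```

With the uniform prior and 100 white draws this gives 101/102 (Laplace's rule of succession). The spread between the three priors is noticeable at n = 1 and shrinks steadily as n grows.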

Ian.

-- 
--------------------------------------------------
Ian Jermyn
ianjermyn@wanadoo.fr
"George Kahrimanis" <anakreon@hol.gr> wrote in message
news:3ce8f26b.0409300832.3eebf846@posting.google.com...
> Shanyu Zhao wrote on 2004-09-27 01:26:38 PST
>
> >Here is the naive question
>
> It is a very good question! If we draw (with replacement, say)
> 100 white balls, what is the probability that the next one will
> also be white?
>
> Is it the same question as asking what is the ratio of white
> balls in the bucket? We shall return to this distinction, at
> the end of this text.
>
> A lot of experiments in nucular oops... nuclear physics are just
> like that, trying to form a conclusion (about the ratio of black
> balls, as it were) from a plain count. It sometimes is a null
> count, as in this case.
>
> There is no consensus about the right way to analyse count data,
> especially when the yield is small or null, as you can gather
> from the three or more conferences, since 2000, on statistical
> methods in high-energy physics (at CERN and at FermiLab in 2000,
> and at Durham (UK) in 2002) and by the changing guidelines
> issued by the Particle Data Group (compare before and after 1998).
> Currently the powers that be favor the use of confidence intervals
> but there is a strong and loud Bayesian opposition, splintered
> into several (mutually incompatible) sects.
>
> Like most naive users of statistics, physicists too are misled
> about the meaning of CIs, and they often feel a measure of post-data
> confidence in each one of them. Consequently they have been
> seriously disturbed by cases like the following two examples.
>
> Example 1: if the process involves some "noise". Say that you roll
> a die before drawing a ball, and if you get a `six' then you draw
> a ball from another bucket, whose contents are, say, 5 black and 5
> white balls. In nuclear physics we would say that we have a
> known background (say, cosmic rays). Now on average that "noise"
> brings us 100 * 1/6 * 5/10, i.e. 8.3 black balls, in a sample of
> 100. In cases when we would expect to see some count from a
> background source, but we count little or nothing, the CIs become
> ridiculously tiny (say, [0, 10^{-10}], woopee!) or the method may
> fail to provide *any* CI.
>
> Example 2: if the noise is not exactly known. E.g., when the
> probability of the above mooted die showing '6' is not exactly 1/6
> but is itself a random variable with some known mean and spread.
> This knowledge increases our uncertainty about the conclusion,
> but, hard to believe, the CIs become narrower when the spread
> (in the probability of '6') increases!
>
> After these (and other) examples, we understand at last that CIs
> are not what they are naively believed to be. They continue to be used
> because the alternatives are also beset with problems.
>
> Ian Jermyn wrote on 2004-09-27 05:11:06 PST
>
> > [...] If you want a long discussion of this point, see
> >Ed Jaynes' book, 'Probability Theory: The Logic of Science',
> >which anyway is important reading.
>
> I, too, recommend this book, as a great exercise for the mind
> and for the criticism it contains, but not as being right.
>
> >(In a real problem, you would likely have some information
> >about a, and things would be easier.)
>
> Like, someone has put the balls there and has challenged you to
> bet on the exact ratio. You know that there are a finite number
> of combinations. You suppose that your opponent has tried to make
> it as hard as possible for you to *guess* the ratio, by randomising
> the way he stacks the bucket. Then you have grounds to presume prior
> probability 1/M for each of the M possible arrangements of black
> and white balls. In that case, the Bayesian method is justified.
>
> >[...] if we assume that Pr(a) = 1, [...]
>
> that is, a constant pdf (in *this* parametrization). I cannot help
> wondering what that would mean. Like, God has played darts to set
> the value of a? (Btw, I confess that I have used this method in
> my analysis of some neutrino data, but in hindsight I see that my
> main motivation was to avoid CIs, not any deep Bayesian conviction.)
>
> Imo one of the endearing traits of Bayesian treatments with
> makeshift priors is their obvious weakness in the first step.
> On the other hand, CIs and other approaches seem good at first
> look, but the user is in for surprises.
>
> To address the distinction I made in the beginning: the predictive
> probability of `white'-in-the-next-trial is of course the same as
> the underlying probability of `white', provided that the latter
> probability can be defined in a strict sense. With a real bucket, we
> can always count the black and white balls, and the probability of
> white is well defined.
>
> However, in other cases we are forever unable to check "the contents
> of the bucket". Particle-physics experiments are such a case,
> unless you believe in direct communication with supernatural beings.
> Then one may regard the underlying probability as an undefined
> concept, and try to calculate predictive probability disregarding
> the underlying probability. Need I say, this is not the current
> mainstream treatment.
>
> Just to give you an idea of nonparametric predictive inference,
> let me repeat the solution for an (incompletely discussed) problem
> in another thread this month. If we have a sample of 1000 numbers
> from a random process of totally unknown pdf, then the 1001-th
> outcome is expected to pop up in any of the 1001 formed intervals,
> with probability 1/1001. This ain't too complicated! I hope to
> find a simple solution in the case of the B/W balls.
>
> I have started hatching a solution for this problem, but I feel I
> need to retrace it a few times, watching for any mistake in the
> logic. Here is a preview: the preliminary result is that the
> probability of the 101-th ball being white is not defined precisely
> but as if somehow bounded inside the interval [100/101, 1].
>
> Thanks for the opportunity to rethink this problem!
> ~ George Kahrimanis
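
The nonparametric predictive claim quoted above (a new outcome falls in each of the n + 1 intervals formed by n earlier outcomes with probability 1/(n + 1)) can be checked by simulation. This is a rough sketch under stated assumptions: the outcomes are independent draws from one continuous distribution (so ties have probability zero), n is reduced from 1000 to 9 to keep the run short, and the exponential distribution is an arbitrary stand-in for the "totally unknown pdf". The name `interval_frequencies` is hypothetical.

```python
import bisect
import random


def interval_frequencies(n, trials, draw, seed=0):
    """Estimate how often a new draw lands in each of the n + 1 intervals
    formed by n earlier draws from the same continuous distribution."""
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        sample = sorted(draw(rng) for _ in range(n))
        new = draw(rng)
        # Index of the interval containing the new outcome (0 = below all, n = above all).
        counts[bisect.bisect_left(sample, new)] += 1
    return [c / trials for c in counts]


# Arbitrary choice of distribution; the result should not depend on it.
freqs = interval_frequencies(n=9, trials=20000, draw=lambda rng: rng.expovariate(1.0))
for i, f in enumerate(freqs):
    print(f"interval {i}: {f:.3f}")
```

Each frequency should come out close to 1/10. Swapping in another continuous distribution (for example `rng.gauss(0.0, 1.0)`) should leave the result essentially unchanged, which is the sense in which the statement does not depend on the pdf.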

