Re: A simple but confusing question
From: George Kahrimanis (anakreon_at_hol.gr)
Date: 09/30/04
- Next message: George Kahrimanis: "Re: Determining Confidence Interval"
- Previous message: Phil Sherrod: "Re: Multiple Regression w/ Polynomial-in-Y?"
- Next in thread: Ian Jermyn: "Re: A simple but confusing question"
- Reply: Ian Jermyn: "Re: A simple but confusing question"
- Maybe reply: A.G.McDowell: "Re: A simple but confusing question"
- Maybe reply: Alex Strashny: "Re: A simple but confusing question"
- Messages sorted by: [ date ] [ thread ]
Date: 30 Sep 2004 09:32:35 -0700
Shanyu Zhao wrote on 2004-09-27 01:26:38 PST
>Here is the naive question
It is a very good question! If we draw (with replacement, say)
100 white balls, what is the probability that the next one will
also be white?
Is it the same question as asking what is the ratio of white
balls in the bucket? We shall return to this distinction, at
the end of this text.
A lot of experiments in nucular oops... nuclear physics are just
like that, trying to form a conclusion (about the ratio of black
balls, as it were) from a plain count. It sometimes is a null
count, as in this case.
There is no consensus about the right way to analyse count data,
especially when the yield is small or null, as you can gather
from the three or more conferences, since 2000, on statistical
methods in high-energy physics (at CERN and at Fermilab in 2000,
and at Durham (UK) in 2002) and by the changing guidelines
issued by the Particle Data Group (compare before and after 1998).
Currently the powers that be favor the use of confidence intervals
but there is a strong and loud Bayesian opposition, splintered
into several (mutually incompatible) sects.
Like most naive users of statistics, physicists too are misled
about the meaning of CIs, and they often feel a measure of post-data
confidence in each one of them. Consequently they have been
seriously disturbed by cases like the following two examples.
Example 1: if the process involves some "noise". Say that you roll
a die before drawing a ball, and if you get a `six' then you draw
a ball from another bucket, whose contents are, say, 5 black and 5
white balls. In nuclear physics we would say that we have a
known background (say, cosmic rays). Now on average that "noise"
brings us 100 * 1/6 * 5/10, i.e. 8.3 black balls, in a sample of
100. In cases when we would expect to see some count from a
background source, but we count little or nothing, the CIs become
ridiculously tiny (say, [0, 10^{-10}], woopee!) or the method may
fail to provide *any* CI.
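
To put numbers on the background in Example 1, here is a minimal
sketch in Python. The function name and its parameters are my own
labels, not anything from an established analysis package; it only
redoes the arithmetic above and adds the chance of counting no black
ball at all when the main bucket is assumed to be entirely white.

```python
from fractions import Fraction


def background_black_stats(n_draws, p_six=Fraction(1, 6), p_black_bg=Fraction(5, 10)):
    """Background-only statistics for the die-plus-second-bucket setup.

    Assumption (for illustration): the main bucket holds no black balls,
    so every black ball comes from the background bucket.
    """
    # Chance that a single draw yields a black ball from the background.
    p_black = p_six * p_black_bg
    # Expected number of black balls in n_draws draws.
    expected = n_draws * p_black
    # Chance of seeing no black ball at all in n_draws draws.
    p_zero = (1 - p_black) ** n_draws
    return p_black, expected, p_zero


p_black, expected, p_zero = background_black_stats(100)
print(float(expected))  # about 8.33, as in the text
print(float(p_zero))    # on the order of 1e-4
```

So a null count is already improbable under the background alone,
which is the situation where the CIs start to misbehave.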
Example 2: if the noise is not exactly known. E.g., when the
probability of the above mooted die showing '6' is not exactly 1/6
but is itself a random variable with some known mean and spread.
This knowledge increases our uncertainty about the conclusion,
but, hard to believe, the CIs become narrower when the spread
(in the probability of '6') increases!
After these (and other) examples, we understand at last that CIs
are not what they are naively believed to be. They continue to be used
because the alternatives are also plagued with problems.
Ian Jermyn wrote on 2004-09-27 05:11:06 PST
> [...] If you want a long discussion of this point, see
>Ed Jaynes' book, 'Probability Theory: the Logic of Science',
>which anyway is important reading.
I, too, recommend this book, as a great exercise for the mind
and for the criticism it contains, but not as being right.
>(In a real problem, you would likely have some information
>about a, and things would be easier.)
Like, someone has put the balls there and has challenged you to
bet on the exact ratio. You know that there are a finite number
of combinations. You suppose that your opponent has tried to make
it as hard as possible for you to *guess* the ratio, by randomising
the way he stacks the bucket. Then you have grounds to presume prior
probability 1/M for each of the M possible arrangements of black
and white balls. In that case, the Bayesian method is justified.
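
As a rough illustration of that Bayesian calculation, here is a sketch
in Python. It rests on an assumption of mine: the bucket holds n_balls
balls, and each possible number of white balls k = 0..n_balls is given
the same prior probability (one reading of "1/M for each arrangement").
The function name is hypothetical.

```python
from fractions import Fraction


def predictive_white(n_balls, n_white_drawn):
    """Probability that the next draw is white, after n_white_drawn
    white draws in a row (with replacement).

    Assumed prior: each composition k = 0..n_balls white balls is
    equally likely.
    """
    # Likelihood of the all-white sample for each composition k.
    # The uniform prior is a common factor and cancels in the ratio.
    weights = [Fraction(k, n_balls) ** n_white_drawn for k in range(n_balls + 1)]
    total = sum(weights)
    # Posterior-weighted average of the white fraction k / n_balls.
    return sum(w * Fraction(k, n_balls) for k, w in enumerate(weights)) / total


print(float(predictive_white(10, 100)))
```

With only 10 balls in the bucket, 100 white draws leave almost all the
posterior weight on "all white", so the answer is very close to 1.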
>[...] if we assume that Pr(a) = 1, [...]
that is, a constant pdf (in *this* parametrization). I cannot help
wondering what that would mean. Like, God has played darts to set
the value of a? (Btw, I confess that I have used this method in
my analysis of some neutrino data, but in hindsight I see that my
main motivation was to avoid CIs, not any deep Bayesian conviction.)
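
For what it is worth, the constant-pdf prior leads to Laplace's rule
of succession. A small Python sketch, with function names of my own
choosing: the posterior after n white draws is proportional to a^n,
so the predictive probability is the ratio of the integrals of
a^(n+1) and a^n over [0, 1], i.e. (n+1)/(n+2). The second function
only cross-checks this numerically for the all-white case.

```python
from fractions import Fraction


def rule_of_succession(n_white, n_total):
    """Predictive probability of white under a flat prior on a."""
    return Fraction(n_white + 1, n_total + 2)


def numeric_predictive(n_white, steps=100_000):
    """Midpoint-rule check of int a^(n+1) da / int a^n da on [0, 1],
    for a sample consisting of n_white white draws only."""
    mids = [(i + 0.5) / steps for i in range(steps)]
    num = sum(a ** (n_white + 1) for a in mids)
    den = sum(a ** n_white for a in mids)
    return num / den


print(rule_of_succession(100, 100))  # 101/102
print(numeric_predictive(100))
```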
Imo one of the endearing traits of Bayesian treatments with
makeshift priors is their obvious weakness in the first step.
On the other hand, CIs and other approaches seem good at first
look, but the user is in for surprises.
To address the distinction I made in the beginning: the predictive
probability of `white'-in-the-next-trial is of course the same as
the underlying probability of `white', provided that the latter
probability can be defined in a strict sense. With a real bucket, we
can always count the black and white balls, and the probability of
white is well defined.
However, in other cases we are forever unable to check "the contents
of the bucket". Particle-physics experiments fall into that category,
unless you believe in direct communication with supernatural beings.
Then one may regard the underlying probability as an undefined
concept, and try to calculate predictive probability disregarding
the underlying probability. Need I say, this is not the current
mainstream treatment.
Just to give you an idea of nonparametric predictive inference,
let me repeat the solution for an (incompletely discussed) problem
in another thread this month. If we have a sample of 1000 numbers
from a random process of totally unknown pdf, then the 1001st
outcome is expected to pop up in any one of the 1001 intervals so
formed, with probability 1/1001 each. This ain't too complicated! I hope to
find a simple solution in the case of the B/W balls.
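
The equal-probability claim for the intervals can be checked by
simulation. The following Python sketch is only an illustration under
assumptions I am adding: the outcomes are independent draws from one
continuous distribution (so ties do not occur), and I use a sample of
4 rather than 1000 to keep the run short. The function name and the
choice of an exponential distribution are arbitrary.

```python
import bisect
import random


def interval_frequencies(n_sample, n_trials, draw, seed=0):
    """Estimate how often the next outcome lands in each of the
    n_sample + 1 intervals formed by a sorted sample of size n_sample.

    `draw` takes a random.Random instance and returns one outcome.
    """
    rng = random.Random(seed)
    counts = [0] * (n_sample + 1)
    for _ in range(n_trials):
        sample = sorted(draw(rng) for _ in range(n_sample))
        new = draw(rng)
        # Index of the interval (counted from the left) holding `new`.
        counts[bisect.bisect_left(sample, new)] += 1
    return [c / n_trials for c in counts]


freqs = interval_frequencies(4, 100_000, lambda rng: rng.expovariate(1.0))
print(freqs)  # each entry should be near 1/5
```

Swapping in another continuous distribution (say, rng.gauss(0, 1))
should give the same frequencies, which is the whole point: the result
does not depend on the pdf.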
I have started hatching a solution for this problem, but I feel I
need to retrace it a few times, watching for any mistake in the
logic. Here is a preview: the preliminary result is that the
probability of the 101st ball being white is not defined precisely,
but appears to be bounded within the interval [100/101, 1].
Thanks for the opportunity to rethink this problem!
~ George Kahrimanis