Re: Confidence interval on mean for a set of numbers
From: Richard Ulrich (Rich.Ulrich_at_comcast.net)
Date: 09/19/04
- Next message: Osher Doctorow: "Orbits Formed Elliptically From Hyperbolic Trajectories"
- Previous message: George Kahrimanis: "Re: Confidence interval on mean for a set of numbers"
- In reply to: Peter Michaux: "Re: Confidence interval on mean for a set of numbers"
- Next in thread: George Kahrimanis: "Re: Confidence interval on mean for a set of numbers"
- Messages sorted by: [ date ] [ thread ]
Date: Sun, 19 Sep 2004 11:46:09 -0400
On 18 Sep 2004 19:37:43 -0700, petermichaux@yahoo.com (Peter Michaux)
wrote:
> Richard,
>
> Before I started this thread I was doing something very similar to
> what you are suggesting. I took my big list of say 1000 numbers.
> Sorted the list. I used the 25th and 975th members of the list as my
> characterization of the distribution's width. I thought there would be
> a 95% chance that another random point would fall in this range.
This is a little bit subtle, so let's take it slowly. Start
by making sure we agree on terms: this is a sample of 1000,
drawn from a (much larger) underlying population.
Now, if you knew the 2.5th and 97.5th percentiles of the
population, there would be a 95% chance that a new point
falls in that range. But you have only an estimate of
the range, based on one sample of 1000. That estimate
is random and therefore inexact. Consider replicating the
procedure by drawing many samples of 1000. On the average,
half the replications would give a range that is too wide,
and half a range that is too narrow. The ends would never
match the population values exactly, if the data are
continuous real numbers.
What you have are point estimates of the upper and lower
bounds. If you computed a bunch of these, they would be too
wide in about half the samples and too narrow in the other
half; that follows just from assuming a continuous
distribution with no ties.
So, for your one randomly estimated set of bounds, there is
a chance that they are too wide, and a chance that they are
too narrow.
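
A quick simulation makes the half-and-half claim concrete. This is
only a minimal sketch, assuming the population is Uniform(0,1) (an
arbitrary choice, made so that the fraction of the population lying
between two bounds is simply their difference). The function names,
the seed, and the replication count are illustrative and not part of
the discussion above.

```python
import random

def coverage_of_sample_bounds(n=1000, lo_rank=25, hi_rank=975, rng=random):
    """Draw n points from Uniform(0,1) and return the fraction of the
    population lying between the lo_rank-th and hi_rank-th sorted values."""
    xs = sorted(rng.random() for _ in range(n))
    # Ranks are 1-based; list indices are 0-based.
    return xs[hi_rank - 1] - xs[lo_rank - 1]

def fraction_wide_enough(reps=2000, target=0.95, seed=12345):
    """Fraction of replications whose bounds cover at least `target`
    of the (assumed uniform) population."""
    rng = random.Random(seed)
    hits = sum(coverage_of_sample_bounds(rng=rng) >= target
               for _ in range(reps))
    return hits / reps

if __name__ == "__main__":
    print(fraction_wide_enough())
```

The printed fraction should come out in the neighborhood of one half:
in roughly half the replications the sample bounds cover at least 95%
of the population, and in the rest they cover less.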
Okay. Here is what you can say. There is roughly a 50% chance
that at least 95% of future random points will fall between
those bounds. *If* the failures were guaranteed to be
symmetrical, then it might extrapolate to being a more
general 95% limit.
However, it is a somewhat tougher and more rigorous statement
to say that there is a 95% chance that the interval will
contain the single next point. For that, you need an interval
wider than the one observed in one sample of 1000.
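
For the single-next-point statement there is a standard
distribution-free result that can be checked directly: if the n
observations and the new point are exchangeable and continuous (no
ties), the chance that the new point falls between the r-th and s-th
sorted values is (s - r)/(n + 1). The sketch below applies it; the
function names and the rule for stepping the ranks outward
(alternating ends, lower end first) are assumptions made for
illustration.

```python
from fractions import Fraction

def next_point_coverage(n, r, s):
    """Chance that one new point lands between the r-th and s-th order
    statistics of n exchangeable, continuous observations (1-based ranks)."""
    return Fraction(s - r, n + 1)

def widen_until(n, r, s, target=Fraction(95, 100)):
    """Step the ranks outward, alternating ends (lower first), until the
    next-point coverage reaches `target`."""
    move_lower = True
    while next_point_coverage(n, r, s) < target:
        if r == 1 and s == n:
            raise ValueError("sample too small for the requested coverage")
        if (move_lower and r > 1) or s == n:
            r -= 1
        else:
            s += 1
        move_lower = not move_lower
    return r, s

if __name__ == "__main__":
    print(float(next_point_coverage(1000, 25, 975)))
    print(widen_until(1000, 25, 975))
```

With n = 1000 the 25th and 975th values give 950/1001, just under
95%, so the ranks do have to move outward, although at this sample
size only slightly.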
>
> Unfortunately, a lot of the discussion here has gone over my head.
> Where can I read more about the method you are suggesting? Any book
> references or key words would help.
The method I showed was the Poisson estimation for small
proportions, applied separately to each extreme. That is
approximate. It is pragmatic and easy, and I hoped that it
would be familiar. It should be mentioned in any
'non-parametric' statistics book that describes rank
estimation of percentiles.
I don't know if it is in Sidney Siegel's classic cook-book
presentation from 1956 or in its newer revision. It is
probably in William Conover's book.
Conover may also show the more exact estimation of
proportions, which makes use of the inverse beta
distribution. Google gives some possibilities for
< inverse-beta confidence-limit > , but I did not look
closely at those (searching on "beta" alone yielded
biological hits).
You might browse in Conover at your library, and also look
at other books shelved next to it.
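
The rank-based idea can be sketched as follows. The earlier post is
not quoted here, so this is an assumption about the general approach
and may differ in detail from the method referred to above. The
number K of sample points falling below the population's 2.5th
percentile is Binomial(n, 0.025), which is approximately Poisson with
mean n * 0.025 = 25 for n = 1000. A central 95% range [lo, hi] for K
then gives ranks bracketing the true percentile: it lies between the
lo-th and (hi + 1)-th sorted values with at least about 95%
confidence. The binomial version is the exact counterpart of the
inverse-beta calculation. All function names are illustrative.

```python
import math

def poisson_cdf(k, mean):
    """P(K <= k) for K ~ Poisson(mean), summed term by term."""
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total

def binomial_cdf(k, n, p):
    """P(K <= k) for K ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k + 1))

def central_count_range(cdf, alpha=0.05):
    """Smallest lo with cdf(lo) >= alpha/2 and smallest hi with
    cdf(hi) >= 1 - alpha/2, so that P(lo <= K <= hi) >= 1 - alpha."""
    lo = 0
    while cdf(lo) < alpha / 2:
        lo += 1
    hi = lo
    while cdf(hi) < 1 - alpha / 2:
        hi += 1
    return lo, hi

if __name__ == "__main__":
    n, p = 1000, 0.025
    print("Poisson: ", central_count_range(lambda k: poisson_cdf(k, n * p)))
    print("Binomial:", central_count_range(lambda k: binomial_cdf(k, n, p)))
```

The same calculation applies to the upper extreme by symmetry,
counting points above the 97.5th percentile instead.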
-- Rich Ulrich, wpilib@Pitt.edu http://www.pitt.edu/~wpilib/index.html