Re: Unpredictability of a statistic
clemenr_at_wmin.ac.uk
Date: 02/11/05
- Next message: Nick Maclaren: "Re: median of combined sets?"
- Previous message: Jean: "Bayes Health Monitoring"
- In reply to: Anon.: "Re: Unpredictability of a statistic"
- Messages sorted by: [ date ] [ thread ]
Date: 11 Feb 2005 07:00:23 -0800
Anon. wrote:
> clemenr@wmin.ac.uk wrote:
[....]
> What might be problematic is if you want to make inferences about a
> larger population, from which your 6 authors are a sample. In this case,
> the std dev as calculated above is still fine as an estimator (assuming
> random sampling etc.), but now has an uncertainty. I think I've reduced
> the problem down to a standard inferential problem: estimating the std
> dev of a population from a sample.
Yes, I do wish to make inferences about a larger population. In
particular, I have a maximum of 86 authors. I would like to be able to
make statements about how good my maximum set is at making predictions
on "everybody". At present, I can't push the number 86 upwards
much, so I was going to sort of "look the other way" by first selecting
5 authors out of 6, then 5 out of 7, then 5 out of 8, and so on. If I'm
measuring the variance of my statistic (which is a variance itself but
I was confusing myself writing it), then I can look to see if I get a
simple trend as the "maximum" number of authors increases, and then try
to predict from that. I haven't done the experiments, and hence don't
know what sort of graph I would get.
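A rough sketch of the "5 out of 6, then 5 out of 7, ..." procedure, to make the idea concrete. Everything here is an assumption for illustration: the per-author scores are random placeholder numbers (not real results), the statistic is taken to be the sample standard deviation, every 5-subset is enumerated rather than sampled, and the name `subset_statistic_variance` is made up.

```python
import random
import statistics
from itertools import combinations

def subset_statistic_variance(values, pool_size, subset_size=5):
    """Variance of the statistic (here: sample std dev) taken over every
    subset of `subset_size` values drawn from the first `pool_size` values."""
    pool = values[:pool_size]
    stats = [statistics.stdev(s) for s in combinations(pool, subset_size)]
    return statistics.variance(stats)

# Placeholder data only: 12 hypothetical per-author scores.
rng = random.Random(0)
scores = [rng.gauss(0, 10) for _ in range(12)]

# Variance of the statistic as the "maximum" number of authors grows.
trend = {k: subset_statistic_variance(scores, k) for k in range(6, 13)}
for k, v in trend.items():
    print(k, round(v, 3))
```

Plotting `trend` against the pool size would give the graph described above. With 86 authors, enumerating every 5-subset is far too expensive, so random subsets would have to replace `combinations`.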
If I take the numbers:
-11 -11 -11 11 11 11
I get a standard deviation of 12.0499. If I take the numbers:
-11.119125 -6.206514 2.240835 11.833689 14.159427 18.644495
then I get a standard deviation of 11.91435, which is similar, but
smaller. The two sets look quite different to me, though: in particular,
I'd have much more confidence in predicting the next number given the
results in the first set than given the numbers in the second set.
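For anyone checking the arithmetic, a minimal sketch using Python's standard `statistics` module. It assumes the figures above are sample standard deviations (n - 1 in the denominator), which is what `statistics.stdev` computes; the variable names are just for illustration.

```python
import statistics

first = [-11, -11, -11, 11, 11, 11]
second = [-11.119125, -6.206514, 2.240835, 11.833689, 14.159427, 18.644495]

# Sample standard deviation (divides by n - 1).
sd_first = statistics.stdev(first)
sd_second = statistics.stdev(second)

print(sd_first, sd_second)
```

The two values come out nearly equal even though the first set sits on just two points and the second is spread out, which is why the std dev alone can't tell the sets apart.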
Thinking about things, I *think* the central limit theorem will help me
out. The distribution of the variances of different experiments should
be ~ normal, no? If so, then a goodness of fit to normal would help me
distinguish the first and second cases above, with the second sample
being a better fit. Any unusual patterns such as "bunching" would
reduce the fit, ..., I think. And, I wouldn't have any ad-hoc decisions
to make as I would if I used a kernel density estimation.
> My other thought is that this looks the sort of problem you'll find
> discussed in books on the bootstrap.
Aha, I have such a book, *somewhere* around here. I do remember that,
while the bootstrap is claimed to be a good estimator for many
statistics, it doesn't work well for estimating the range. I can
look at how ranges
With the variance of the statistic, high and low range, and goodness of
fit to normal, I should have a fair amount of things to look at to see
what happens when I raise the "maximum" number of authors. I hope :-)
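A minimal sketch of the bootstrap idea mentioned above, applied to the std dev of the second set of numbers. The resample count, the seed, the simple percentile interval and the function name `bootstrap_stdev` are all assumptions chosen for illustration, not anything taken from a particular book.

```python
import random
import statistics

def bootstrap_stdev(data, n_resamples=2000, seed=0):
    """Std dev of each bootstrap resample (drawn with replacement, same size)."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.stdev(rng.choices(data, k=n)) for _ in range(n_resamples)]

second = [-11.119125, -6.206514, 2.240835, 11.833689, 14.159427, 18.644495]

reps = sorted(bootstrap_stdev(second))
# Spread of the bootstrap replicates, as a rough uncertainty on the std dev.
spread = statistics.stdev(reps)
# Simple percentile interval (roughly 95%).
lo = reps[int(0.025 * len(reps))]
hi = reps[int(0.975 * len(reps)) - 1]
print(spread, lo, hi)
```

With only 6 values the resamples repeat points heavily, so the interval should be read as a rough indication at best.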
Thanks very much for your comments.
Cheers,
Ross-c