Re: Unpredictability of a statistic

clemenr_at_wmin.ac.uk
Date: 02/11/05


Date: 11 Feb 2005 07:00:23 -0800

Anon. wrote:
> clemenr@wmin.ac.uk wrote:
[....]
> What might be problematic is if you want to make inferences about a
> larger population, from which your 6 authors are a sample. In this
> case, the std dev as calculated above is still fine as an estimator
> (assuming random sampling etc.), but now has an uncertainty. I think
> I've reduced the problem down to a standard inferential problem:
> estimating the std dev of a population from a sample.

Yes, I do wish to make inferences about a larger population. In
particular, I have a maximum of 86 authors. I would like to be able to
make statements about how good my maximum set is at making predictions
on "everybody". At present, I can't push the number 86 upwards
much, so I was going to sort of "look the other way" by first selecting
5 authors out of 6, then 5 out of 7, then 5 out of 8, and so on. If I'm
measuring the variance of my statistic (which is a variance itself but
I was confusing myself writing it), then I can look to see if I get a
simple trend as the "maximum" number of authors increases, and then try
to predict from that. I haven't done the experiments, and hence don't
know what sort of graph I would get.
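A rough sketch of that experiment, under loudly stated assumptions: the per-author scores are made-up Gaussian numbers, the "statistic" is just the sample variance of the 5 selected scores (a stand-in for the real one), and the pool of size n is simply the first n authors. The helper name variance_of_statistic is hypothetical.

```python
import random
import statistics

def variance_of_statistic(scores, pool_size, subset_size=5, draws=2000, seed=0):
    """Variance of a statistic over random 5-author subsets of a pool.

    The pool is the first `pool_size` entries of `scores` (an assumption).
    The statistic is the sample variance of the subset (a stand-in).
    """
    rng = random.Random(seed)
    pool = scores[:pool_size]
    values = []
    for _ in range(draws):
        subset = rng.sample(pool, subset_size)   # 5 authors, no repeats
        values.append(statistics.variance(subset))
    return statistics.variance(values)

# Hypothetical per-author scores; real data would replace this.
data_rng = random.Random(42)
scores = [data_rng.gauss(0.0, 10.0) for _ in range(86)]

# Pool sizes 6, 16, ..., 86: look for a trend as the "maximum" grows.
trend = {n: variance_of_statistic(scores, n) for n in range(6, 87, 10)}
for n, v in trend.items():
    print(n, round(v, 2))
```

Random draws are used instead of enumerating every subset because the number of 5-subsets grows quickly with the pool size.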

If I take the numbers:

-11 -11 -11 11 11 11

I get a standard deviation of 12.0499. If I take the numbers:

-11.119125 -6.206514 2.240835 11.833689 14.159427 18.644495

then I get a standard deviation of 11.91435, which is similar but
slightly smaller. Yet the two sets seem quite different to me: in
particular, I'd have much more confidence in predicting the next number
given the results in the first set than given those in the second set.
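The two figures can be checked with a few lines of Python. This assumes the sample form of the standard deviation (n - 1 denominator), which is what reproduces the quoted values; the population form would give exactly 11 for the first set.

```python
import statistics

first = [-11, -11, -11, 11, 11, 11]
second = [-11.119125, -6.206514, 2.240835, 11.833689, 14.159427, 18.644495]

# statistics.stdev uses the n - 1 denominator (sample standard deviation).
sd_first = statistics.stdev(first)
sd_second = statistics.stdev(second)

print(round(sd_first, 4), round(sd_second, 5))
```

The near-equal results illustrate the point: the standard deviation alone does not distinguish a two-valued set from a spread-out one.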

Thinking about things, I *think* the central limit theorem will help me
out. The distribution of the variances of different experiments should
be ~ normal, no? If so, then a goodness of fit to normal would help me
distinguish the first and second cases above, with the second sample
being a better fit. Any unusual patterns such as "bunching" would
reduce the fit, ..., I think. And I wouldn't have any ad-hoc decisions
to make, as I would if I used kernel density estimation.
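One way to put a number on "fit to normal" is sketched below, assuming a Kolmogorov-Smirnov style distance between the empirical CDF and a normal fitted to the same sample. This is only one possible choice of measure, and the function name is hypothetical. Because the mean and standard deviation are estimated from the data, the usual KS critical values do not apply directly, so the result is best read as a relative score.

```python
from statistics import NormalDist, mean, stdev

def ks_distance_to_fitted_normal(sample):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    fitted = NormalDist(mean(sample), stdev(sample))
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # Compare against the ECDF just after and just before the jump at x.
        gap = max(gap, i / n - cdf, cdf - (i - 1) / n)
    return gap

first = [-11, -11, -11, 11, 11, 11]
second = [-11.119125, -6.206514, 2.240835, 11.833689, 14.159427, 18.644495]

d_first = ks_distance_to_fitted_normal(first)
d_second = ks_distance_to_fitted_normal(second)
print(round(d_first, 3), round(d_second, 3))
```

A smaller distance means a closer fit, so the "bunched" first set should score worse than the second.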

> My other thought is that this looks the sort of problem you'll find
> discussed in books on the bootstrap.

Aha, I have such a book, *somewhere* around here. I do remember that,
while the bootstrap is claimed to work well for many statistics, it
doesn't work well for estimating the range. I can
look at how ranges
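A minimal bootstrap sketch, assuming plain resampling with replacement and using the second set of numbers from above as the data; the helper name bootstrap is hypothetical. It also shows one reason the range is awkward here: a resample can never contain values outside the original sample, so no bootstrap range can exceed the observed one.

```python
import random
import statistics

def bootstrap(sample, statistic, replicates=2000, seed=0):
    """Recompute `statistic` on resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(replicates)]

def sample_range(xs):
    return max(xs) - min(xs)

data = [-11.119125, -6.206514, 2.240835, 11.833689, 14.159427, 18.644495]

# Bootstrap standard error of the standard deviation.
boot_sds = bootstrap(data, statistics.stdev)
se_of_sd = statistics.stdev(boot_sds)

# Bootstrap ranges are capped at the observed range.
boot_ranges = bootstrap(data, sample_range)
print(round(se_of_sd, 3), max(boot_ranges) <= sample_range(data))
```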

With the variance of the statistic, high and low range, and goodness of
fit to normal, I should have a fair number of things to look at to see
what happens when I raise the "maximum" number of authors. I hope :-)

Thanks very much for your comments.

Cheers,

Ross-c


