Corollary: N-P Silliness in Estimation Theory (was: Re: Unusual formulae for confidence intervals)




David Jones wrote:
Reef Fish wrote:

Why was it that NO ONE challenged or commented about my
comment on the use of (n-1), n, and (n+1) as the THREE most
commonly used denominators for S^2, for reasons of ESTIMATION
criteria? Answer: probably none of the discussants know about
that one. :-)

The use of "n+1" is certainly peculiar to those who don't use it, but
there is nothing "peculiar" about it for those who understand that
"unbiased estimate" and "maximum likelihood estimate" are just
TWO of the main THREE criteria in statistical POINT estimation!


Well, I thought it was just a misinterpretation of what was in the
original post ...
where I think (subject to my own possible misinterpretation and
without looking back) the "S" in question was meant itself to be a
sample variance (of some form)

No. There is no ambiguity in S (by the OP) being the sample
STANDARD DEVIATION (which is the square root of the sample
variance).

In retrospect, I believe the post by Koopman (initial follow-up, before
my posts of #3 and #4 in this thread) pointed out the major source of
the confusion:

RK> How did they define S^2? With n in the denominator?

That remains, and probably shall remain, forever, the AMBIGUOUS
statistical usage of the term "sample variance".

There was a recent lengthy thread about it where Afonso argued
incessantly that it's the population variance when divided by N and
sample variance when divided by (N-1) which was unambiguously
FALSE, of course. :=) But that was to be expected of Afonso
who could never tell the difference between a population parameter
and a sample estimate, resulting in such nonsense as stating a
hypothesis as Ho: 3/13 < 8/13. :-)

But the REAL culprit is that in the "definition" of a "sample variance"
both N and (N-1) are used, with the only emphasis that it's calculated
from sample data!

S^2 is the SS deviations divided by (N-1) when it is the sample
variance, which is an UNBIASED estimate of the population
variance.

S^2 is the SS deviations divided by N when it is the sample variance
which is the maximum likelihood estimate of the population
variance.

To eliminate the "sample variance" ambiguity in common usage, one
would have to say something not only clumsy but silly, like:

S^2 is the sample variance (unbiased) when the denominator is (N-1).
S^2 is the sample variance (MLE) when the denominator is N.

and both assume the population mean is unknown also.
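
A minimal Python sketch of the two definitions above. The data set is
made up for illustration (it is not from the thread), and "MLE" here
assumes a normal model. The standard library's statistics.variance
divides by N-1 and statistics.pvariance divides by N, so the same SS
gives two different numbers that both get called "sample variance":

```python
import statistics

# Illustrative data only; not taken from the thread.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n

# SS: sum of squared deviations from the sample mean (32 for this data).
ss = sum((x - mean) ** 2 for x in data)

s2_unbiased = ss / (n - 1)   # denominator N-1: unbiased for sigma^2
s2_mle = ss / n              # denominator N: MLE under a normal model

# The standard library exposes both conventions under different names.
assert abs(s2_unbiased - statistics.variance(data)) < 1e-12
assert abs(s2_mle - statistics.pvariance(data)) < 1e-12

print(s2_unbiased, s2_mle)
```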

To FURTHER complicate the ambiguous usage, we have the notion
of a "standard error" (estimated standard deviation) of the sample
mean, namely, sqrt(sigma^2/N) estimated by sqrt(S^2/N).

That is where Koopman's comment, later reiterated by me, comes in:

RF> That was a VALID point made by Koopman about
RF> the ambiguous usage of S that probably led to some of the
RF> present confusion about (n-1) and n, because S/sqrt(n-1)
RF> and S/sqrt(n) could be IDENTICAL if different estimates were
RF> used for S.
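
The identity in that remark can be checked numerically. This is only a
sketch on made-up data (the numbers and variable names are assumptions
for illustration): S computed with denominator n and then divided by
sqrt(n-1) gives the same standard error as S computed with denominator
(n-1) and then divided by sqrt(n).

```python
import math

# Illustrative data only; not taken from the thread.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)

s_div_n = math.sqrt(ss / n)                # S with denominator n
s_div_n_minus_1 = math.sqrt(ss / (n - 1))  # S with denominator n-1

se_from_mle = s_div_n / math.sqrt(n - 1)
se_from_unbiased = s_div_n_minus_1 / math.sqrt(n)

# Both expressions reduce to sqrt(SS / (n * (n - 1))).
assert math.isclose(se_from_mle, se_from_unbiased)
assert math.isclose(se_from_mle, math.sqrt(ss / (n * (n - 1))))
```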


and not an unscaled sum of squares, and
that the expression was meant to be used to decide a sample size to
estimate a mean with a given precision, in which case a factor of 1/n
would be usual (with a factor of 1/(n-1) contained in the calculation
of the sample variance). I couldn't see why 1/(n-1) would appear in
the formula given except perhaps as some form of allowance that would
be better done using percentage points of a t-distribution.

The DISTRIBUTION of S^2 is an entirely separate issue from that of
the meaning and definition of S^2, the sample variance, from which
the square root is used as the estimated STANDARD DEVIATION.

What form of S^2 to use is defined by the criterion of estimation!
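
To make the role of the criterion concrete, here is a rough sketch
comparing the three denominators. It assumes i.i.d. normal data, so that
SS/sigma^2 is chi-square with n-1 degrees of freedom, and it assumes the
third criterion (not named in this post) is minimum mean squared error,
which is the usual justification for n+1. The function name
bias_and_mse is made up for this illustration.

```python
def bias_and_mse(n, c, sigma2=1.0):
    """Bias and MSE of SS/c as an estimate of sigma^2 (normal data assumed)."""
    k = n - 1                                  # degrees of freedom of SS/sigma^2
    expected = k * sigma2 / c                  # E[SS/c]
    variance = 2 * k * sigma2 ** 2 / c ** 2    # Var[SS/c]
    bias = expected - sigma2
    return bias, variance + bias ** 2

n = 10
for c in (n - 1, n, n + 1):
    bias, mse = bias_and_mse(n, c)
    print(f"denominator {c:2d}: bias = {bias:+.4f}, MSE = {mse:.4f}")

# Among integer denominators, n+1 gives the smallest MSE.
best = min(range(1, 4 * n), key=lambda c: bias_and_mse(n, c)[1])
assert best == n + 1
```

Under these assumptions (n-1) gives zero bias, n is the MLE, and (n+1)
gives the smallest MSE, at the price of the largest bias of the three.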

But in a sense the MOST "peculiar" of all these concepts is the
N-P theorists' preoccupation with the notion of an UNBIASED estimate.
E(statistic) = population parameter to be estimated.

Here, E ( SSE/(n-1)) = sigma^2, hence an unbiased estimate.

but the SQUARE ROOT of S^2 is a BIASED estimate of sigma
whether you use n, (n-1), or (n+1) as the denominator.
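
A short sketch of that bias, assuming i.i.d. normal data. It uses the
standard result E[sqrt(chi2_k)] = sqrt(2) * Gamma((k+1)/2) / Gamma(k/2)
with k = n-1; the function name expected_s_over_sigma is made up for
this illustration.

```python
import math

def expected_s_over_sigma(n, c):
    """E[sqrt(SS/c)] / sigma for n i.i.d. normal observations."""
    k = n - 1
    # log of Gamma((k+1)/2) / Gamma(k/2), computed stably via lgamma
    log_ratio = math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
    return math.sqrt(2.0 / c) * math.exp(log_ratio)

n = 10
for c in (n - 1, n, n + 1):
    ratio = expected_s_over_sigma(n, c)
    print(f"denominator {c:2d}: E[S]/sigma = {ratio:.4f}")
    assert ratio < 1.0   # S underestimates sigma on average in every case
```

For the (n-1) denominator this ratio is the constant often written c4
in quality-control tables; dividing S by it gives an unbiased estimate
of sigma under the normal assumption.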

The only person in this newsgroup who claimed to have used
the unbiased estimate for sigma was Jack Tomsky, who used it
in an indirect way, when he estimated the quantile of a
distribution.

I thought the insistence on UNBIASED estimates for sigma-squared,
when the whole world is using S which is BIASED, for sigma, has
got to be one of the most peculiar and SILLIEST notions ever
perpetrated in the N-P theory of estimation in statistics.


Of course the use of (n-1), n, and (n+1) as possible denominators for
S^2 (where this is the sum of squared errors) should be well-known. I
don't know whether there are any simple results available from the
theory which would cover the type of two-stage sampling being
contemplated here to say how one might use an initial sample to choose
the size of a second sample in some "optimal" way.

David Jones

When you use the term "optimal", you are automatically opening
more cans of statistical worms. :-) The priests of unbiased estimates
would undoubtedly say that they are the "optimal" estimate to use
because it gives rise to all kinds of OTHER UMP optimality or
other inherent "defects" of the Neyman-Pearson-Fisher school of
Statistics.

The MLE does have the natural and commonsense advantage (as
an estimation criterion) that it is INVARIANT to nonlinear
transformations.

If S^2 is the MLE for sigma^2, then S raised to any power p different
from 1 remains the MLE for sigma^p.
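
The invariance can be checked numerically. The following is only a
sketch under a normal model with made-up data: it reparameterizes the
likelihood by theta = sigma^p, maximizes over theta directly with a
simple ternary search, and compares the result with the closed-form
sigma-hat raised to the power p. The helper names and the search
bracket [0.01, 100] are assumptions for this illustration.

```python
import math

def profile_loglik(sigma, n, ss):
    """Normal log-likelihood in sigma with the mean set to the sample mean
    (additive constants dropped)."""
    return -n * math.log(sigma) - ss / (2.0 * sigma ** 2)

def argmax_unimodal(f, lo, hi, iters=200):
    """Ternary search for the maximizer of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0

# Illustrative data only; not taken from the thread.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)

sigma_hat = math.sqrt(ss / n)   # closed-form MLE of sigma (denominator n)

for p in (1, 2, 3):
    # Maximize the likelihood as a function of theta = sigma^p.
    theta_hat = argmax_unimodal(
        lambda theta: profile_loglik(theta ** (1.0 / p), n, ss), 0.01, 100.0)
    assert math.isclose(theta_hat, sigma_hat ** p, rel_tol=1e-4)
```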

Since the N-P priests of unbiasedness can't have their cake and
eat it too, they will remain the butt of the JOKE that the whole
world is using the BIASED estimate for sigma, and every time you
change the power of p for sigma^p, an entirely different
statistic has to be used to make that form of estimate unbiased.

So, there are INFINITELY MANY different unbiased estimates
for the infinitely many unknown parameters sigma^p, when there
is exactly ONE unknown sigma.
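
As a rough illustration of that point, assuming i.i.d. normal data: the
chi-square moment formula E[(chi2_k)^(p/2)] = 2^(p/2) * Gamma((k+p)/2) /
Gamma(k/2), with k = n-1, gives the denominator c for which (SS/c)^(p/2)
is unbiased for sigma^p. The function name unbiased_denominator is made
up for this sketch.

```python
import math

def unbiased_denominator(n, p):
    """Denominator c such that (SS/c)**(p/2) is unbiased for sigma**p,
    for n i.i.d. normal observations (needs p != 0 and n - 1 + p > 0)."""
    k = n - 1
    log_ratio = math.lgamma((k + p) / 2) - math.lgamma(k / 2)
    return 2.0 * math.exp(2.0 * log_ratio / p)

n = 10
for p in (0.5, 1, 2, 3, 4):
    print(f"p = {p}: denominator = {unbiased_denominator(n, p):.4f}")

# p = 2 recovers the familiar n-1; p = 4 gives sqrt((n-1)*(n+1)).
assert math.isclose(unbiased_denominator(n, 2), n - 1)
assert math.isclose(unbiased_denominator(n, 4), math.sqrt((n - 1) * (n + 1)))
```

Every p gets its own denominator, whereas the MLE uses n throughout.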

-- Reef Fish Bob.
