Re: Hypothesis Testing: the TEST STATISTIC
- From: Jack Tomsky <jtomsky@xxxxxxxxxxxxx>
- Date: Fri, 18 Aug 2006 13:26:01 EDT
At least two of the professors' silence (Rubin and Tomsky) on Afonso's blunder was probably because they are so far up the stratospheres of academia that they have never taught that lowly topic [testing the equality of two proportions] (usually at the Freshman level introductory course). The other professors are probably not even aware of the issue (or the topic), or not sure what the fuss is about. Else they surely could have pointed Afonso to a chapter or section of a textbook that deals with the particular topic.
Just to correct the record, I have never been a professor. I have always worked in private industry, often in R&D. If I had to correct all of Afonso's errors, it would take all day and I would never get any work done.
In a classical Hypothesis Testing setup, the HIDDEN principle is that the TEST STATISTIC must assume Ho to be true in the execution of the test!
Reason? The TEST STATISTIC is used to determine both the p-value and whether Ho is accepted or rejected at a fixed alpha level.
alpha = P(Ho is rejected | Ho is true).
Since the TEST STATISTIC is used to determine if Ho is to be accepted or rejected given the rejection region determined by alpha, it MUST assume the values of the tested parameters at Ho <hence assume Ho true>.
It is perhaps even more obvious because the TEST STATISTIC determines the p-value of the test.
p-value = Pr(TEST STATISTIC is "more extreme" than the observed value of the test statistic WHEN Ho is TRUE).
Therefore, whether one uses a fixed level alpha or a p-value, in order for a TEST STATISTIC to serve its purpose relative to those measures of Type I error, the test statistic itself must incorporate the Ho values.
For MOST tests of a hypothesis, such as Ho: mu = 0, the same FORM of the test statistic is used for both hypothesis testing and confidence intervals for the parameter.
In the case of testing a mean, the T statistic is often used.
"mu = 0" MUST be incorporated in the TEST STATISTIC. That is why the T-statistic is (xbar - 0)/(s/sqrt(n)), where s is the sample standard deviation of the variable X. Whatever the value of mu is in Ho does not affect the estimate s, which is independent of xbar; both xbar and s enter the T-statistic.
So, for the execution of that test, one can in fact EITHER execute a formal test of the hypothesis Ho, OR construct a confidence interval for mu and see if it covers mu=0. The two methods are EQUIVALENT.
That is in fact the source of Afonso's blunder in the problem of testing the equality of two proportions, because Afonso doesn't know the elements of a Hypothesis Test. He ALWAYS relates a test to the corresponding Confidence Interval, and his confidence intervals are always two-sided. (One-tailed tests correspond to one-sided confidence intervals.)
The problem of the DIFFERENCE of two proportions is one exception to the usual rule of the equivalence of a test and a confidence interval. They are NOT equivalent in that case.
The estimated variance of a sample proportion p1-hat = x1/n1, denoted by p1* say, is p1*(1-p1*)/n1. Similarly for p2* = x2/n2: p2*(1-p2*)/n2.
Thus, for two independent sample proportions, the estimated variance of the DIFFERENCE is
vd = p1*(1-p1*)/n1 + p2*(1-p2*)/n2.
vd (no pun intended <g>) is what Afonso has, and the ONLY variance of the difference he knows.
For LARGE samples where Z approximates the sampling distribution of (p1* - p2*)/sqrt(vd), the two-sided CONFIDENCE INTERVAL for (p1 - p2) is given by
(p1* - p2*) +- z(alpha/2) * sqrt(vd), for confidence level 1 - alpha.
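As a rough Python sketch of that large-sample interval, using only the standard library: the function name wald_ci_diff and the counts in the example are made up for illustration.

```python
import math
from statistics import NormalDist


def wald_ci_diff(x1, n1, x2, n2, alpha=0.05):
    """Large-sample two-sided interval for p1 - p2 using the unpooled variance vd."""
    p1 = x1 / n1
    p2 = x2 / n2
    # vd: each sample contributes its own estimated variance.
    vd = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z(alpha/2), about 1.96 for alpha = 0.05
    half_width = z * math.sqrt(vd)
    diff = p1 - p2
    return diff - half_width, diff + half_width


# Hypothetical counts: 45 successes out of 100 versus 30 out of 100.
print(wald_ci_diff(45, 100, 30, 100))
```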
But the GENERAL PRINCIPLE for Hypothesis testing that Ho must be incorporated into the TEST STATISTIC comes into play in the difference of proportions TEST, when testing that the difference is ZERO.
In the case of testing H: mu = 0 from a normal distribution with unknown variance, the t statistic is
t = sqrt(n)*Xbar/Sqrt(Sumsq(Xi-Xbar)/(n-1)).
On the surface it might appear that this conflicts with Bob's principle that when you estimate the variance, you must assume that mu = 0, resulting in
W = sqrt(n)*Xbar/Sqrt(Sumsq(Xi)/n).
However, since |t| is a monotonic increasing function of |W|, both tests are equivalent.
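The monotonic relation can be made explicit. Since Sumsq(Xi) = Sumsq(Xi-Xbar) + n*Xbar^2, a little algebra gives W^2 = n*t^2/(n - 1 + t^2), or equivalently t^2 = (n - 1)*W^2/(n - W^2), which is increasing in W^2 on its range 0 <= W^2 < n. A small Python check of that identity follows; the function names and the data are illustrative only.

```python
import math


def t_stat(xs):
    """t = sqrt(n) * Xbar / sqrt(Sumsq(Xi - Xbar) / (n - 1))."""
    n = len(xs)
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    return math.sqrt(n) * xbar / math.sqrt(ss / (n - 1))


def w_stat(xs):
    """W = sqrt(n) * Xbar / sqrt(Sumsq(Xi) / n), variance estimated with mu = 0."""
    n = len(xs)
    xbar = sum(xs) / n
    return math.sqrt(n) * xbar / math.sqrt(sum(x * x for x in xs) / n)


def t_from_w(w, n):
    """Recover t from W via t = W * sqrt((n - 1) / (n - W^2)); the sign is preserved."""
    return w * math.sqrt((n - 1) / (n - w * w))


# Made-up data; the two printed values should agree up to rounding error.
sample = [0.8, -0.3, 1.2, 0.5, 0.9, -0.1, 1.4, 0.7, 0.2, 1.1]
print(t_stat(sample), t_from_w(w_stat(sample), len(sample)))
```

Because the map from |W| to |t| is strictly increasing, a rejection region of the form |t| > c corresponds to one of the form |W| > c', so the two statistics lead to the same test once the critical values are matched.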
It is no longer appropriate to use the same vd for the confidence interval of the difference as the variance in the denominator of the test statistic, because when Ho is true, p1 = p2, and vd uses two different estimates for the variance of a SINGLE proportion p.
Therefore, for testing p1 = p2, one must use the variance of their common p in the TEST STATISTIC.
In pooling the two samples to get ONE sample proportion, the common proportion is p** = (x1+x2)/(n1+n2), the variance of which is given by v** = p**(1-p**)/(n1+n2), and the TEST STATISTIC = Z = (p1* - p2*)/sqrt(v**).
v** is the estimated variance of p**. What you need in your Z statistic is the estimated variance of p1* - p2*, which is
p**(1-p**)[1/n1 +1/n2].
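A short Python sketch of the pooled test with that variance, standard library only. The function name pooled_z_test and the example counts are hypothetical, and the two-sided p-value relies on the usual large-sample normal approximation.

```python
import math
from statistics import NormalDist


def pooled_z_test(x1, n1, x2, n2):
    """Large-sample test of Ho: p1 = p2 using the pooled proportion p**."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # p**
    # Estimated variance of p1* - p2* under Ho: p**(1-p**)[1/n1 + 1/n2].
    var_diff = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    z = (p1 - p2) / math.sqrt(var_diff)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return z, p_value


# Hypothetical counts: 45 successes out of 100 versus 30 out of 100.
print(pooled_z_test(45, 100, 30, 100))
```

With equal sample sizes, p**(1-p**)/(n1+n2) is one quarter of p**(1-p**)[1/n1 + 1/n2], so dividing by sqrt(v**) would double Z.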
Jack
-- Reef Fish Bob..