Re: Two nit-picks re definition of p-value (Was: goodness of fit ?)




Bruce Weaver wrote:
Reef Fish wrote:
Bruce Weaver wrote:
Reef Fish wrote:
Richard Ulrich wrote:
---- snip ----
I can't picture myself writing it that way.
I think I would have to say something like, "exceeding
the 20% cutoff, which we previously justified for this case."
Or some such.
Given the technical language of Acceptance and Rejection,
together with the underlying meanings of alpha and p-values,
no special language is required to re-explain what is understood.

In the particular case, the 0.150 is the PROBABILITY that the
K-S Test Statistic is "more extreme" than the observed D, when
the null hypothesis (that the data is from a normal distribution) is
TRUE.
But I stumbled, slightly, when reading this sentence, because of the
present tense "is" -- It scans better for me like this,
"... probability that the K-S Statistic will be [or 'would be']
more extreme than the observed D, ..."
In that context, "is" and "will be" mean exactly the same thing. :-) It
comes from the definition of p-value, which is the
Probability ( Test Statistic > observed value of the Test Statistic
when Ho is true).

Bob, I have two nit-picks.

Good! nit-picks often contribute to educational follow-ups.
1. I would say that definition works for one of the two directional
alternative hypotheses. For the other directional alternative, it would
be "less than", not "greater than". And for the non-directional
alternative, you would either have to invoke absolute values, or
describe it as "more extreme than the observed test statistic".

Where is your nit?

My nit is that you omitted cases 2 and 3 below when giving the
definition of a p-value. Anyone who already understands hypothesis
testing would not stumble over that omission; but someone who is
learning it for the first time might stumble.


Ha: >     p-value = P(TEST STAT. > observed TS)
Ha: <     p-value = P(TEST STAT. < observed TS)
Ha: .NE.  p-value = P(TEST STAT. < - abs(observed TS))
                  + P(TEST STAT. > abs(observed TS)).

Completely unambiguous. 100% standard. I believe you missed
that part in your education about how to state "more extreme" in
p-values in hypothesis testing.
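
As a rough illustration of the three definitions above, here is a minimal Python sketch. It assumes, purely for illustration, a test statistic that is standard normal under Ho and a hypothetical observed value of -1.2; the function name `p_values` is made up for this sketch and is not from any package.

```python
from statistics import NormalDist


def p_values(observed, dist=NormalDist()):
    """Return the p-value under each of the three alternatives.

    Assumption of this sketch: the test statistic is continuous and
    symmetric about 0 under Ho (standard normal by default).
    """
    upper = 1.0 - dist.cdf(observed)                 # Ha: >
    lower = dist.cdf(observed)                       # Ha: <
    a = abs(observed)
    two_sided = dist.cdf(-a) + (1.0 - dist.cdf(a))   # Ha: .NE.
    return {"greater": upper, "less": lower, "not_equal": two_sided}


# Hypothetical observed value of the test statistic.
result = p_values(-1.2)
for alternative, p in result.items():
    print(f"{alternative:10s} {p:.4f}")
```

Because the statistic here is continuous, replacing ">" with ">=" in any of the three lines would not change the numbers.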

No, I didn't miss that. I describe the p-value as the probability of an
outcome (or value of the test statistic) as extreme as or more extreme
than the observed outcome (or observed test statistic), where more
extreme means more favorable to the alternative hypothesis.

See the difference below. It depends on how the Ha is stated!

Actually, that's NOT correct, for the two tailed alternative, nor the
one tailed alternative, for that matter. It means "more extreme" as I
stated above. Otherwise, how would you state "more extreme"
when the observed value is -1.2, say?

With the
exception of the "as extreme", I think that is in agreement with you.



2. Isn't it really "EQUAL TO or greater than" rather than "greater
than"?

No! That's a nit pick on YOUR nit!

It depends on the STATEMENT of the Alternative Hypothesis.
If Ha is "greater than", the more extreme in p-value is "greater than";
if Ha is "greater than or equal", the more extreme in p-value is
"greater than or equal".

It makes no difference whether the test statistic is continuous OR
discrete. It never makes any difference when the test statistic
is continuous. But the STATEMENT and DEFINITION of p-value
follows directly the STATEMENT (and intent) of the Alternative
Hypothesis.


I know that for test statistics that have continuous sampling
distributions, the difference is trivial.

There would be NO difference. Probability at a point for a
continuous random variable (the Test Statistic) is always ZERO.

But not so for those with
discrete sampling distributions. Here's a binomial problem,

See the explanation above. No example is necessary.


for example, with a non-directional alternative hypothesis:

X = number of successes in N = 13 trials
p = p(success)
q = 1-p = p(failure)

H0: p EQ 0.5
H1: p NE 0.5

Observed X = 2

Whoa! What is your TEST STATISTIC?

I caught that later but forgot to cancel the line above.

My test statistic is X, the number of successes in N = 13 trials. Its
sampling distribution under a true null hypothesis is a binomial
distribution with N=13 and p = 0.5.

That was what I took as your test statistic, on 13 trials.


X p(X|H0)
---------------
0 .0001
1 .0016
2 .0095 <-- observed X
3 .0349
4 .0873
5 .1571
6 .2095
7 .2095
8 .1571
9 .0873
10 .0349
11 .0095
12 .0016
13 .0001
---------------
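
The table above can be reproduced with a few lines of Python. This is only a sketch for checking the numbers; the names `binom_pmf` and `pmf_table` are chosen here for illustration.

```python
from math import comb

N = 13     # number of trials, from the example
P0 = 0.5   # probability of success under H0


def binom_pmf(x, n=N, p=P0):
    """P(X = x) for a binomial(n, p) random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


# One probability per possible number of successes, 0 through 13.
pmf_table = [binom_pmf(x) for x in range(N + 1)]

print(" X  p(X|H0)")
for x, prob in enumerate(pmf_table):
    print(f"{x:2d}  {prob:.4f}")
```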

I was taught to include the 0.0095 when computing the p-value:

But did your professor explain WHY? If he had bothered to, using
the definition of a p-value (as I'll show below), he would have
realized his own error. That's only one of the troubles of ROTE
teaching and ROTE learning, without understanding the reason(s) WHY.

The Alternative Hypothesis is Ha: p .NE. 0.5

So, the definition of p-value for that problem would be
P(# Successes "more extreme" than the observed "2" successes)

= P(# Successes < 2) + P(# Failures < 2)

P(# Successes < 2) = .0016 + .0001.
P(# Failures < 2) is also .0016 + .0001,

which is why you multiplied by 2.


p = (0.0095 + 0.0016 + 0.0001)*2 = 0.0224

This is NOT correct. .0095 does NOT belong to the "more extreme"
definition of a p-value.
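
For anyone who wants to check the arithmetic, the two doubled sums under dispute can be computed side by side. The sketch below takes no position on which one is the p-value; the labels `inclusive` and `strict` are names invented for this example.

```python
from math import comb

N, P0, OBSERVED = 13, 0.5, 2


def pmf(x):
    """P(X = x) under H0: binomial with N = 13 and p = 0.5."""
    return comb(N, x) * P0**x * (1 - P0)**(N - x)


# The lower tail is doubled because binomial(13, 0.5) is symmetric.
inclusive = 2 * sum(pmf(x) for x in range(0, OBSERVED + 1))  # X <= 2 or X >= 11
strict = 2 * sum(pmf(x) for x in range(0, OBSERVED))         # X < 2 or X > 11

print(f"observed value included: {inclusive:.6f}")
print(f"observed value excluded: {strict:.6f}")
```

With unrounded probabilities the first sum is about 0.0225 and the second about 0.0034; the hand computation from the four-decimal table gives 0.0224 for the first.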

Do you agree? Thanks for clarifying.

You are welcome. You have to come back with better nits than those!!
:-)

-- Reef Fish Bob.


Interesting. FWIW, SPSS and Stata both give p = 0.022461.

So they are both WRONG! Did SPSS have an explanation on why
the p value is .022461, rather than 0.0034?


Binomial Test (from SPSS)
|-------|-------|--------|--|--------------|----------|---------------|
| | |Category|N |Observed Prop.|Test Prop.|Exact Sig. |
| | | | | | |(2-tailed) |
|-------|-------|--------|--|--------------|----------|---------------|
|success|Group 1|1 |2 |.15 |.50 |.022461 |
| |-------|--------|--|--------------|----------|---------------|
| |Group 2|0 |11|.85 | | |
| |-------|--------|--|--------------|----------|---------------|
| |Total | |13|1.00 | | |
|-------|-------|--------|--|--------------|----------|---------------|


bitesti 13 2 .5 (Stata)

N Observed k Expected k Assumed p Observed p
------------------------------------------------------------
13 2 6.5 0.50000 0.15385

Pr(k >= 2) = 0.998291 (one-sided test)
Pr(k <= 2) = 0.011230 (one-sided test)
Pr(k <= 2 or k >= 11) = 0.022461 (two-sided test)
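
For readers without Stata at hand, the three probabilities in the output above can be recomputed directly from the binomial distribution. This is a minimal sketch, not a description of how Stata computes them: the function name `binomial_tails` is hypothetical, and mirroring the tail at n - k relies on the symmetry of the p = 0.5 case.

```python
from math import comb


def binomial_tails(n, k, p=0.5):
    """Return (Pr(X >= k), Pr(X <= k), two-sided) for X ~ binomial(n, p).

    Assumption of this sketch: the two-sided value adds the tail through
    the smaller of k and n - k to the tail from the larger one, which is
    only a sensible mirror image when p = 0.5.
    """
    pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    upper = sum(pmf[k:])        # Pr(X >= k)
    lower = sum(pmf[:k + 1])    # Pr(X <= k)
    lo, hi = min(k, n - k), max(k, n - k)
    # Cap at 1 in case the two tails overlap (k at the centre, n even).
    two_sided = min(1.0, sum(pmf[:lo + 1]) + sum(pmf[hi:]))
    return upper, lower, two_sided


upper, lower, two_sided = binomial_tails(13, 2)
print(f"Pr(k >= 2)            = {upper:.6f}")
print(f"Pr(k <= 2)            = {lower:.6f}")
print(f"Pr(k <= 2 or k >= 11) = {two_sided:.6f}")
```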


Any SAS or R users care to jump in?

Does it REALLY matter? SPSS or any other Statistical PACKAGE is
the last place anyone wants to learn Statistics!

You learn it from the correct definition or you don't.

If I am not mistaken, all of the Regression packages always give the
TWO-TAILED value as the p-value. Why?

If anything, they should give all THREE p-values, as a function of the
three possible alternatives.

-- Reef Fish Bob.

--
Bruce Weaver
bweaver@xxxxxxxxxxxx
www.angelfire.com/wv/bwhomedir
