Re: Two nit-picks re definition of p-value (Was: goodness of fit ?)



Reef Fish wrote:
Bruce Weaver wrote:
Reef Fish wrote:
Richard Ulrich wrote:
---- snip ----
I can't picture myself writing it that way.
I think I would have to say something like, "exceeding
the 20% cutoff, which we previously justified for this case."
Or some such.
Given the technical language of Acceptance and Rejection,
together with the underlying meanings of alpha and p-values,
no special language is required to re-explain what is understood.

In the particular case, the 0.150 is the PROBABILITY that the
K-S Test Statistic is "more extreme" than the observed D, when
the null hypothesis (that the data is from a normal distribution) is
TRUE.
But I stumbled, slightly, when reading this sentence, because of the
present tense "is" -- It scans better for me like this,
"... probability that the K-S Statistic will be [or 'would be']
more extreme than the observed D, ..."
In that context, "is" and "will be" mean exactly the same. :-) It
comes from the definition of p-value, which is
Probability( Test Statistic > observed value of the Test Statistic,
when Ho is true ).

Bob, I have two nit-picks.

Good! nit-picks often contribute to educational follow-ups.
1. I would say that definition works for one of the two directional
alternative hypotheses. For the other directional alternative, it would
be "less than", not "greater than". And for the non-directional
alternative, you would either have to invoke absolute values, or
describe it as "more extreme than the observed test statistic".

Where is your nit?

My nit is that you omitted cases 2 and 3 below when giving the definition of a p-value. Anyone who already understands hypothesis testing would not stumble over that omission; but someone who is learning it for the first time might stumble.


Ha: >     p-value = P(TEST STAT. > observed TS)
Ha: <     p-value = P(TEST STAT. < observed TS)
Ha: .NE.  p-value = P(TEST STAT. < -abs(observed TS))
                  + P(TEST STAT. > abs(observed TS)).

Completely unambiguous. 100% standard. I believe you missed
that part in your education about how to state "more extreme" in
p-values in hypothesis testing.

No, I didn't miss that. I describe the p-value as the probability of an outcome (or value of the test statistic) as extreme as or more extreme than the observed outcome (or observed test statistic), where more extreme means more favorable to the alternative hypothesis. With the exception of the "as extreme", I think that is in agreement with you.



2. Isn't it really "EQUAL TO or greater than" rather than "greater
than"?

No! That's a nit pick on YOUR nit!

It depends on the STATEMENT of the Alternative Hypothesis.
If Ha is "greater than", the more extreme in p-value is
"greater than".
If Ha is "greater than or equal", the more extreme in p-value is
"greater than or equal".

It makes no difference whether the test statistic is continuous OR
discrete. It never makes any difference when the test statistic
is continuous. But the STATEMENT and DEFINITION of p-value
follows directly the STATEMENT (and intent) of the Alternative
Hypothesis.


I know that for test statistics that have continuous sampling
distributions, the difference is trivial.

There would be NO difference. Probability at a point for a
continuous random variable (the Test Statistic) is always ZERO.

But not so for those with
discrete sampling distributions. Here's a binomial problem,

See the explanation above. No example is necessary.


for example, with a non-directional alternative hypothesis:

X = number of successes in N = 13 trials
p = p(success)
q = 1-p = p(failure)

H0: p EQ 0.5
H1: p NE 0.5

Observed X = 2

Whoa! What is your TEST STATISTIC?

My test statistic is X, the number of successes in N = 13 trials. Its sampling distribution under a true null hypothesis is a binomial distribution with N = 13 and p = 0.5.


X p(X|H0)
---------------
0 .0001
1 .0016
2 .0095 <-- observed X
3 .0349
4 .0873
5 .1571
6 .2095
7 .2095
8 .1571
9 .0873
10 .0349
11 .0095
12 .0016
13 .0001
---------------
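
The null probabilities in the table above follow directly from the binomial formula C(13, X) * 0.5^13. A minimal Python sketch that regenerates them, assuming only the standard library (the helper name binom_pmf is an illustrative choice, not something from the thread):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

N, P0 = 13, 0.5  # number of trials and null value of p, from the example

# Null distribution of the test statistic X
table = {k: binom_pmf(k, N, P0) for k in range(N + 1)}

for k, prob in table.items():
    print(f"{k:2d}  {prob:.4f}")
```

Rounded to four decimals, the printed column should match the one listed above.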

I was taught to include the 0.0095 when computing the p-value:

But did your professor explain WHY? If he had bothered to, using
the definition of a p-value (as I'll show below), he would have
realized his own error. That's only one of the troubles of ROTE
teaching and ROTE learning, without understanding the reason(s) WHY.

The Alternative Hypothesis is Ha: p .NE. 0.5

So, the definition of p-value for that problem would be
P(# Successes "more extreme" than the observed "2" successes)

= P(# Successes < 2) + P(# Failures < 2)

P(# Successes < 2) = .0016 + .0001.
P(# Failures < 2) is also .0016 + .0001,

which is why you multiplied by 2.


p = (0.0095 + 0.0016 + 0.0001)*2 = 0.0224

This is NOT correct. .0095 does NOT belong to the "more extreme"
definition of a p-value.
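
The two readings in dispute differ only in whether the observed value X = 2 (and its mirror image X = 11) is counted in the tails. A short Python sketch that computes both sums for this example, assuming the symmetric null p = 0.5 (the variable names are illustrative only):

```python
from math import comb

N = 13          # trials, from the example
OBSERVED = 2    # observed number of successes

# Exact null probabilities under p = 0.5: C(13, k) / 2**13
pmf = [comb(N, k) / 2**N for k in range(N + 1)]

# "As extreme or more extreme": counts X <= 2 and X >= 11
p_inclusive = sum(pmf[: OBSERVED + 1]) + sum(pmf[N - OBSERVED:])

# "Strictly more extreme": counts X <= 1 and X >= 12
p_strict = sum(pmf[:OBSERVED]) + sum(pmf[N - OBSERVED + 1:])

print(f"inclusive: {p_inclusive:.6f}")
print(f"strict:    {p_strict:.6f}")
```

The inclusive sum is 184/8192, which is the 0.0224 figure above before the table entries were rounded; the strict sum is 28/8192.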

Do you agree? Thanks for clarifying.

You are welcome. You have to come back with better nits than those!!
:-)

-- Reef Fish Bob.


Interesting. FWIW, SPSS and Stata both give p = 0.022461.

Binomial Test (from SPSS)
|-------|-------|--------|--|--------------|----------|---------------|
| | |Category|N |Observed Prop.|Test Prop.|Exact Sig. |
| | | | | | |(2-tailed) |
|-------|-------|--------|--|--------------|----------|---------------|
|success|Group 1|1 |2 |.15 |.50 |.022461 |
| |-------|--------|--|--------------|----------|---------------|
| |Group 2|0 |11|.85 | | |
| |-------|--------|--|--------------|----------|---------------|
| |Total | |13|1.00 | | |
|-------|-------|--------|--|--------------|----------|---------------|


bitesti 13 2 .5 (Stata)

N Observed k Expected k Assumed p Observed p
------------------------------------------------------------
13 2 6.5 0.50000 0.15385

Pr(k >= 2) = 0.998291 (one-sided test)
Pr(k <= 2) = 0.011230 (one-sided test)
Pr(k <= 2 or k >= 11) = 0.022461 (two-sided test)
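
For anyone without Stata or SPSS at hand, the three probabilities above can be checked with exact arithmetic. This is only a sketch: it assumes the two-sided value is the sum of the two tails named in the label, and the helper name tail is made up for illustration.

```python
from fractions import Fraction
from math import comb

N, K = 13, 2  # trials and observed successes, as in the bitesti call

def tail(lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Binomial(13, 1/2)."""
    return Fraction(sum(comb(N, k) for k in range(lo, hi + 1)), 2**N)

upper = tail(K, N)                        # Pr(k >= 2)
lower = tail(0, K)                        # Pr(k <= 2)
two_sided = tail(0, K) + tail(N - K, N)   # Pr(k <= 2 or k >= 11)

for label, value in [("Pr(k >= 2)", upper),
                     ("Pr(k <= 2)", lower),
                     ("Pr(k <= 2 or k >= 11)", two_sided)]:
    print(f"{label:24s} = {float(value):.6f}")
```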


Any SAS or R users care to jump in?

--
Bruce Weaver
bweaver@xxxxxxxxxxxx
www.angelfire.com/wv/bwhomedir