Re: Assumptions for another Fisher's exact test??

From: Richard Ulrich (Rich.Ulrich_at_comcast.net)
Date: 12/29/04


Date: Wed, 29 Dec 2004 12:38:44 -0500

On Wed, 29 Dec 2004 02:54:28 +0000 (UTC), bevin.keen@afit.edu (Bevin
Keen) wrote:

> Hi there! I am looking to use Fisher's exact test to test a
> hypothesis for my Masters thesis. I saw that other contingency table
> approaches require certain assumptions to be met in terms of a
> 'multinomial experiment' and the expected cell counts need to be
> greater than 5 in order to use the chi squared distribution.
>
> Since the Fisher's exact test is based on a hypergeometric
> distribution, are there certain assumptions that must be met?? What
> are they? I have a few Nonparametric texts, but they don't really
> address the assumptions for this test.

If you really want to read up on all the stuff about
analyzing 2x2 tables, search on < Yates-correction > .
What you find with groups.google in sci.stat.* might
be more focused than what's on the web generally.

The Yates-corrected chi-squared test gives a very good
approximation to Fisher's Exact Test. For small tables
(especially), these two tests will reject less often than
either the uncorrected X^2 or randomization tests
that do *not* assume "fixed margins."
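
To make that comparison concrete, here is a rough sketch in plain
Python (standard library only). The counts are made up for
illustration, and the function names fisher_exact and chi2_p are
my own, not from any package. It computes Fisher's test straight
from the hypergeometric distribution, plus the chi-squared p-value
with and without the Yates correction.

```python
from math import comb, erfc, sqrt


def hypergeom_pmf(k, row1, col1, n):
    """P(top-left cell == k) when both margins of the 2x2 table are fixed."""
    return comb(row1, k) * comb(n - row1, col1 - k) / comb(n, col1)


def fisher_exact(a, b, c, d):
    """Return (lower-tail p, two-sided p) for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    probs = {k: hypergeom_pmf(k, row1, col1, n) for k in range(lo, hi + 1)}
    lower = sum(p for k, p in probs.items() if k <= a)
    # Two-sided: add up every table no more probable than the observed one.
    cutoff = probs[a] * (1 + 1e-9)
    two_sided = sum(p for p in probs.values() if p <= cutoff)
    return lower, two_sided


def chi2_p(a, b, c, d, yates=True):
    """Two-sided p-value from the 1-df chi-squared statistic."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)  # continuity correction
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return erfc(sqrt(stat / 2))


# Made-up counts: rows = series B, series C; columns = Yes, No.
a, b, c, d = 3, 7, 8, 2
lower, two_sided = fisher_exact(a, b, c, d)
print("Fisher, lower tail (HA: p1 < p2):", lower)
print("Fisher, two-sided:               ", two_sided)
print("chi-squared, Yates-corrected:    ", chi2_p(a, b, c, d, yates=True))
print("chi-squared, uncorrected:        ", chi2_p(a, b, c, d, yates=False))
```

With a table this small, the Yates-corrected p-value should land
close to the two-sided Fisher value, and the uncorrected one
should come out noticeably smaller.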

If you have a median split, you have a fixed-margin on
the outcome variable. (A lot of margins aren't fixed; a lot
of people think you should always prefer Fisher's test, anyways.)

>
> I am including the hypothesis that I am trying to test:
>
> For purposes of this research question, the hypotheses being tested
> are:
> H0 : p1 = p2, the response rate (Yes) for series B is equal to the
> response rate (Yes) for series C
> HA : p1 < p2, the response rate (Yes) for series B is less than the
> response rate (Yes) for series C.
> Where:
> p1 = the probability that a question will result in a yes response for
> a B series evaluation
> p2 = the probability that a question will result in a yes response for
> a C series evaluation
>
> [I have a 2 x 2 contingency table; the rows are time of the inspection
> (B series or C series) and the columns are Response (Yes or No) to
> inspection questions.

"Time of inspection", huh?
I can imagine two easy reasons why your design *might* not
properly give a 2x2 contingency table. If the same "objects"
are being inspected each time, then there should be paired
tests. If the inspections constitute a time series that has
any serial correlation, then the analysis might deserve
time-series treatment.
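
For the paired case, one common choice is the exact McNemar test,
which looks only at the discordant pairs. I am naming that
technique myself as one possibility; whether it fits depends on
the actual design. A minimal sketch, with a hypothetical function
name and made-up counts:

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b = pairs that were Yes at the first inspection and No at the second,
    c = pairs that were No at the first inspection and Yes at the second.
    Under H0 the smaller count is Binomial(b + c, 0.5).
    """
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up discordant counts for illustration only.
print(mcnemar_exact(2, 9))
```

Pairs that gave the same response both times do not enter the
calculation at all.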

On the other hand, if there are cell expectations less than 5,
the whole experiment might have too little power to be
worth worrying about, except for doing a test as a gross
indicator.
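
Checking the cell expectations takes only a few lines. This is a
sketch with made-up counts and a hypothetical helper name; each
expected count is (row total * column total) / N.

```python
def expected_counts(table):
    """Expected cell counts under independence for a table of row lists."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Made-up counts: rows = series B, series C; columns = Yes, No.
observed = [[3, 7], [8, 2]]
expected = expected_counts(observed)
smallest = min(min(row) for row in expected)
print(expected)
print("smallest expectation below 5:", smallest < 5)
```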

-- 
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html