Re: Statistical Tests for Counts, varying size
clemenr_at_wmin.ac.uk
Date: 02/23/05
- Next message: dave_at_autobox.com: "Re: autocorrelation as a predicor in multiple regression models"
- Previous message: Ray Koopman: "Re: autocorrelation as a predicor in multiple regression models"
- In reply to: rbenton_at_ieee.org: "Re: Statistical Tests for Counts, varying size"
- Next in thread: Richard Ulrich: "Re: Statistical Tests for Counts, varying size"
- Messages sorted by: [ date ] [ thread ]
Date: 23 Feb 2005 11:17:06 -0800
rbenton@ieee.org wrote:
> Ross-c,
>
> The point on testing equality is likely right. Rephrased, what I
> need to test for is if B is significantly smaller than A.
>
> As for the chi-squared test, I was under the vague (and it seems
> incorrect) assumption you needed to be testing for three or more
> groups (A, B, and C). May need to relook at that.
Hmmm.... I did give a health warning at the beginning that I am no
expert, and attempting to answer questions is often more of a learning
process for myself rather than the person I "answer".
I've never heard of the chi-squared test being limited to three or more
categories. The first example on the page:
http://helios.bto.ed.ac.uk/bto/statistics/tress9.html
shows its use on a two-category case. But, as it turns out, this is not
the problem....
> Size-wise, the sample size of Y (where we obtain count B) is 5-10
> (most likely). The sample size for X should be about 100. So, would
> the chi-test (or any of the above mentioned) be valid?
As I said, I'm no expert. However, your Y sample is so small that this
in itself makes the chi-squared test a bad choice. From memory, you
shouldn't use a chi-squared test if more than 20% of the expected cell
counts are below 5, or if any expected count is below 1.
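A rough way to check this before running the test is sketched below. It uses the commonly quoted form of the rule of thumb (no more than 20% of expected counts below 5, none below 1). The counts in the example table are made up for illustration and are not from this thread.

```python
def expected_counts(table):
    """Expected cell counts under independence; table is a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_squared_looks_safe(table):
    """Apply the rule of thumb to the expected (not observed) counts."""
    expected = [e for row in expected_counts(table) for e in row]
    share_small = sum(e < 5 for e in expected) / len(expected)
    return share_small <= 0.20 and min(expected) >= 1


# Hypothetical data: sample X has 30 "item 1s" out of 100, sample Y has 1 out of 8.
table = [[30, 70], [1, 7]]
print(expected_counts(table))
print(chi_squared_looks_safe(table))
```

With a Y sample of only 5-10 items, at least one expected count in Y's row will usually fall below 5, which is why the check tends to fail here.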
It could be that you could use some form of exact test. Basically, you
have only 11 possible outcomes (between 0 and 10 item 1s in your sample
Y). It may well then be possible for you to calculate the exact
probability of each of these possible outcomes under the assumption
that the proportion is the same as for sample X. Exactly how you will
calculate these probabilities will depend on the properties that you
are sampling. E.g. if you're sampling from a fixed size population
without replacement, then the hypergeometric distribution may be of
use. If you're sampling from a fixed size population with replacement
(or if the population is really big) then the binomial distribution
might be of use. If your samples are dependent, rather than independent,
then things get much more sticky.
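As a minimal sketch of the two cases, the probabilities of all 11 outcomes for a Y sample of size 10 could be tabulated as follows. The proportion 0.3 and the population of 100 containing 30 item 1s are assumed values for illustration only. Note that treating the proportion from X as a known constant ignores the sampling error in X itself.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(k item 1s in n draws), sampling with replacement or from a huge population."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeom_pmf(k, n, hits, population):
    """P(k item 1s in n draws without replacement from `population` items, `hits` of them item 1s)."""
    return comb(hits, k) * comb(population - hits, n - k) / comb(population, n)


n = 10    # size of sample Y (assumed)
p = 0.3   # proportion of item 1s taken from sample X (assumed)

binom = [binomial_pmf(k, n, p) for k in range(n + 1)]
hyper = [hypergeom_pmf(k, n, 30, 100) for k in range(n + 1)]

for k in range(n + 1):
    print(f"{k:2d}  binomial={binom[k]:.4f}  hypergeometric={hyper[k]:.4f}")
```

Each list sums to 1, since the 11 outcomes cover every possibility.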
But, an exact test goes like this (from memory, hopefully someone will
correct me if I'm wrong).
(i) Calculate the exact probability of all possible outcomes under the
assumption of the null hypothesis.
(ii) Look at the observed outcome.
(iii) Sum up the probabilities of the observed outcome and all more
extreme outcomes (here, all smaller counts, since you want to know
whether B is significantly smaller).
(iv) Reject the null hypothesis if this probability is less than the
cutoff value of your choice.
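Steps (i) to (iv) can be sketched for the binomial case like this. The null proportion 0.3, the sample size 10, the observed count 0 and the 0.05 cutoff are all assumptions chosen for illustration. "More extreme" is taken to mean fewer item 1s, matching the one-sided question of whether B is smaller.

```python
from math import comb


def exact_lower_tail_p(observed, n, p0):
    """One-sided exact p-value: P(count <= observed) under a binomial null."""
    # (i) probability of every possible outcome 0..n under the null hypothesis
    probs = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    # (ii) and (iii): the observed outcome plus all smaller (more extreme) ones
    return sum(probs[: observed + 1])


# (iv) compare against a cutoff of your choice
p_value = exact_lower_tail_p(observed=0, n=10, p0=0.3)
reject = p_value < 0.05
print(p_value, reject)
```

For 0 item 1s out of 10 the p-value is simply 0.7**10, about 0.028, so the null would be rejected at the 0.05 level in this made-up case.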
Try googling on "Fisher's exact test" for more info.
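For reference, a one-sided Fisher's exact test can be written with the hypergeometric distribution directly. This is a sketch under the assumption that the two samples are independent. The function name and the counts are hypothetical. Unlike treating X's proportion as fixed, it conditions on the combined number of item 1s, so the uncertainty in X is accounted for.

```python
from math import comb


def fisher_lower_tail(x_hits, x_size, y_hits, y_size):
    """P(Y contains <= y_hits item 1s), given the pooled totals of both samples."""
    total = x_size + y_size   # all items in X and Y together
    hits = x_hits + y_hits    # all item 1s in X and Y together
    denom = comb(total, y_size)
    # Sum hypergeometric probabilities for 0..y_hits item 1s landing in Y.
    tail = sum(comb(hits, k) * comb(total - hits, y_size - k)
               for k in range(y_hits + 1))
    return tail / denom


# Hypothetical data: X has 30 item 1s out of 100, Y has 0 out of 10.
p = fisher_lower_tail(x_hits=30, x_size=100, y_hits=0, y_size=10)
print(p)
```

SciPy also provides `scipy.stats.fisher_exact`, which takes a 2x2 table and an `alternative` argument. The hand-rolled version above is only meant to show what is being summed.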
Given the size of your samples, I also think I'll withdraw my
suggestion of using a resampling approach.
> Thanks,
>
> Ryan
Hoping this is something vaguely approximating help,
Cheers,
Ross-c