Re: Statistical Tests for Counts, varying size

clemenr_at_wmin.ac.uk
Date: 02/23/05


Date: 23 Feb 2005 11:17:06 -0800

rbenton@ieee.org wrote:
> Ross-c,
>
> The point on testing equality is likely right. Rephrased, what I
> need to test for is if B is significantly smaller than A.
>
> As for the chi-squared test, I was under the vague (and it seems
> incorrect) assumption you needed to be testing for three or more groups
> (A, B, and C). May need to relook at that.

Hmmm.... I did give a health warning at the beginning that I am no
expert, and attempting to answer questions is often more of a learning
process for me than for the person I "answer".

I've never heard of the chi-squared test being limited to three or more
categories. The first example on the page:

http://helios.bto.ed.ac.uk/bto/statistics/tress9.html

shows its use on a two-category case. But, as it turns out, this is not
the problem....

> Size-wise, the sample size of Y (where we obtain count B) is 5-10
> (most likely). The sample size for X should be about 100. So, would
> the chi-test (or any of the above mentioned) be valid?

As I said, I'm no expert. However, your Y sample is so small that this
in itself makes the chi-squared test a bad choice. From memory, the usual
rule of thumb is that you shouldn't use a chi-squared test if more than
20% of the expected cell counts are below 5, or if any expected count is
below 1.
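
To see why the small Y sample is a problem, it can help to compute the
expected cell counts directly. The sketch below is only an illustration:
the counts in `table` are made up (X with 100 observations, Y with 8),
and `expected_counts` is a helper name I have invented here.

```python
def expected_counts(table):
    """Expected cell counts under independence for a table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical counts: rows are samples X and Y, columns are "item 1" / "other".
table = [[40, 60],   # X: 100 observations (assumed)
         [1, 7]]     # Y: 8 observations (assumed)

exp = expected_counts(table)
# Flag the expected counts that fall below 5.
small = [e for row in exp for e in row if e < 5]
print(exp)
print("expected counts below 5:", len(small), "of", sum(len(r) for r in exp))
```

With a Y sample this small, both of its expected counts come out below 5
however the observed counts fall, because the Y row total is so low.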

It could be that you could use some form of exact test. Basically, you
have only 11 possible outcomes (between 0 and 10 item 1s in your sample
Y). It may well then be possible for you to calculate the exact
probability of each of these possible outcomes under the assumption
that the proportion is the same as for sample X. Exactly how you will
calculate these probabilities will depend on the properties that you
are sampling. E.g. if you're sampling from a fixed size population
without replacement, then the hypergeometric distribution may be of
use. If you're sampling from a fixed size population with replacement
(or if the population is really big) then the binomial distribution
might be of use. If your samples are dependent, rather than independent,
then things get much more sticky.
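
As a rough sketch of what "calculate the exact probability of each
outcome" could look like, here are both distributions evaluated over the
11 possible outcomes for a Y sample of size 10. The null proportion `p0`
and the population figures `N` and `K` are invented for illustration;
you would substitute whatever your own situation implies.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(k item 1s in n independent draws, each with probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(k, n, K, N):
    """P(k item 1s in n draws without replacement from N items, K of them item 1)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

n = 10            # size of sample Y
p0 = 0.4          # assumed proportion under the null (hypothetical)
N, K = 50, 20     # assumed finite population with the same proportion (hypothetical)

binom = [binomial_pmf(k, n, p0) for k in range(n + 1)]
hyper = [hypergeom_pmf(k, n, K, N) for k in range(n + 1)]

for k in range(n + 1):
    print(k, round(binom[k], 4), round(hyper[k], 4))
```

The two columns get closer as the population grows relative to the
sample, which is why the binomial is a reasonable stand-in when the
population is really big.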

But, an exact test goes like this (from memory, hopefully someone will
correct me if I'm wrong).

(i) Calculate the exact probability of all possible outcomes under the
assumption of the null hypothesis.
(ii) Look at the observed outcome.
(iii) Sum up the probabilities of the observed outcome and all more
extreme outcomes.
(iv) Reject the null hypothesis if this probability is less than the
cutoff value of your choice.
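
Steps (i) to (iv) can be sketched as follows for the binomial case, with
"more extreme" taken to mean "fewer item 1s", since the question is
whether B is significantly smaller than A. This is a minimal sketch, not
a definitive recipe: the numbers (1 item 1 out of 10, null proportion
0.4, cutoff 0.05) are hypothetical, and treating the proportion from X
as exactly known ignores the sampling error in X.

```python
from math import comb

def exact_binomial_lower_p(observed, n, p0):
    """One-sided p-value P(K <= observed) for K ~ Binomial(n, p0)."""
    # (i) probability of every possible outcome under the null hypothesis
    pmf = [comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(n + 1)]
    # (ii)-(iii) sum the observed outcome and all more extreme (smaller) ones
    return sum(pmf[: observed + 1])

# Hypothetical data: 1 item 1 in a Y sample of 10, null proportion 0.4.
p_value = exact_binomial_lower_p(observed=1, n=10, p0=0.4)

# (iv) compare against the cutoff of your choice
alpha = 0.05
reject = p_value < alpha
print(p_value, reject)
```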

Try googling on "Fisher's exact test" for more info.
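
For what it's worth, here is a minimal sketch of the one-sided version of
Fisher's test for a 2x2 table, written with the hypergeometric
distribution and the margins held fixed. The function name and the counts
are my own inventions for illustration. If you have SciPy available, I
believe `scipy.stats.fisher_exact` with `alternative='less'` computes the
same quantity, but check its documentation rather than taking my word.

```python
from math import comb

def fisher_exact_lower_p(a, b, c, d):
    """One-sided Fisher p-value for the 2x2 table [[a, b], [c, d]].

    Rows: sample Y, sample X. Columns: item 1, other.
    Returns P(top-left cell <= a) with all margins held fixed.
    """
    row1 = a + b                      # size of sample Y
    col1 = a + c                      # total number of item 1s
    n = a + b + c + d
    lo = max(0, row1 + col1 - n)      # smallest feasible top-left count
    ways = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(lo, a + 1))
    return ways / comb(n, row1)

# Hypothetical counts: Y has 1 item 1 out of 10, X has 40 out of 100.
p = fisher_exact_lower_p(1, 9, 40, 60)
print(p)
```

Unlike a binomial test against a fixed proportion, this conditions on the
observed totals and so does not treat the proportion in X as known.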

Given the size of your samples, I also think I'll withdraw my
suggestion of using a resampling approach.

> Thanks,
>
> Ryan

Hoping this is something vaguely approximating help,

Cheers,

Ross-c


