Re: Comparing empirical frequency distributions
- From: Niklaus Kuehnis <kuehnik_0505@xxxxxxxxxxxxxx>
- Date: 27 Aug 2008 12:52:15 GMT
Peter / Labo <peterpc5j@xxxxxxxxxxx> wrote:
Question 1:
Is there any reasonable way to compare the frequency distributions of
prod1 vs. prod2 regarding all adjectives in a single test?
Question 2:
I thought of comparing the distributions of prod1 and prod2 using a
chi-square goodness-of-fit test. This test is usually described as a
test to compare an empirical to a theoretical distribution. Can I use
it to compare two empirical distributions?
Thanks in advance!
Niklaus
Yes, you can. Look into the chi-square test of independence for
nominal/categorical variables. The difference from the 50/50 test is
that you take into account the marginal distribution not only per
adjective, but also per product. This matters because it accounts for
possible biases towards answering product 1 or product 2 simply because
they are presented first and second. Such order biases do exist. The
drawback is a loss of power, probably a relatively small one, if the
bias does not exist in your case (for example because you have
counterbalanced every aspect that might bias towards product 1 or
product 2). You just have to make sure that you get the degrees of
freedom right: (rows - 1) * (columns - 1). And one caveat: the
number of "don't know" responses should not be too low. Otherwise the
chi-square approximation for the distribution of the Pearson statistic
doesn't work well. As a rule of thumb, based on Cochran (1957): fewer
than 20% of your cells should have an expected frequency below 5. The
expected frequency, for tests of this kind of independence, is the
product of the two marginal frequencies divided by the total (e.g.
#tasty * #product 1 / (200*30)). If your "don't know" category creates
troubles according to this criterion, I would just drop it, as it's
not informative to your actual question. In fact, it might even be
misleading, leading to significance for systematic differences in the
use of "don't know" across attributes rather than a difference between
products. If you really do have a lot of near-zero frequencies, you
will have to resort to statistical software: look for "exact tests of
independence". If your PC chokes on this, or melts, look into
"bootstrap" and similar resampling methods. If your example numbers are
realistic, you won't need this.
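
For illustration only, here is a minimal sketch of the calculation
described above: expected frequencies from the marginals, the Pearson
statistic, the degrees of freedom, and the 20% rule of thumb. The counts,
the adjective names and the function names are all made up for the
example; none of them come from the actual data in this thread. It is
plain Python and does not compute a p-value.

```python
# Hypothetical counts, NOT from the thread: rows are adjectives,
# columns are responses (product 1, product 2, "don't know").
observed = {
    "tasty": [120, 70, 10],
    "fresh": [90, 100, 10],
    "sweet": [60, 130, 10],
}


def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Return (Pearson statistic, degrees of freedom) for an r x c table.

    Assumes no row or column total is zero; otherwise an expected
    count would be zero and the division below would fail.
    """
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def share_of_small_cells(table, threshold=5.0):
    """Fraction of cells whose expected count is below the threshold."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < threshold for e in cells) / len(cells)


table = list(observed.values())
stat, df = pearson_chi_square(table)
small = share_of_small_cells(table)

print(f"chi-square = {stat:.2f}, df = {df}")
print(f"share of cells with expected count < 5: {small:.0%}")
```

With these invented counts every row has the same expected counts
(90, 100, 10), the statistic comes out at 38 on 4 degrees of freedom,
and no cell falls below 5. To turn the statistic into a p-value you
still need the chi-square distribution from a statistics package, for
example scipy.stats.chi2_contingency if SciPy is available.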
Thanks for reposting!
I am hesitant to use a traditional k*l-Chisquare because my textbooks
say it is a test for independence of *two* categorical
variables. In contrast, I have 31 variables, and my rows are
not independent because they stem from the same subjects.
If I understand your post correctly, you are saying that a k*l-Chisquare
makes sense to test whether the two distributions of prod1 and prod2 are
equal?
--
Niklaus