Computing Chi-square stats on Card deck



No-one seems to have answered this, so I'll have a quick flick.
You might have got a bigger response in sci.math.stat, but anyway:

-
>>>>
Perform 100,000 shuffles, and check the sequence of cards
in each. A table is created as follows :

Columns correspond to the position of the card from the top of deck,
Rows correspond to Cards. The frequency of occurrence of each card
in each position is tabulated as :

Position ->     1     2     3     4  ......    52
A-Clubs      1992  1925  1966  1924  ......  1941
2-Clubs      1918  1916  ......
3-Clubs      1849  1973  ......
.......
K-Spades     1912  1974  ......

Expected value of each cell = 100,000/52

Chi-square is calculated for each cell as
(cellvalue-exp.value)^2 / exp. value
The values in all cells are totalled, and the Chi square
statistic is calculated on the total with degrees of freedom
as 51 * 51 (51 being No. of rows-1, and 51 being No. of columns -1)
<<<<
-

All this is correct. The key thing about the DoF being 51*51
is that this is the number of cells you (or the gods) can freely
choose before all the others become fixed by the constraints of
the problem. In this case, those constraints are that all rows
and all columns must add up to 100,000.
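For anyone who wants to try it, here is a minimal Python sketch of the
tabulation described above. It is only an illustration: it assumes
Python's random.shuffle is the shuffler under test, it numbers the cards
0..51 instead of naming them, and the function names (position_table,
chi_square_total) are made up for this example. It also uses fewer
shuffles than the 100,000 in the question, purely for speed.

```python
import random

def position_table(n_shuffles, n_cards=52, seed=0):
    """table[card][pos] = number of shuffles in which `card` landed at `pos`."""
    rng = random.Random(seed)
    table = [[0] * n_cards for _ in range(n_cards)]
    deck = list(range(n_cards))
    for _ in range(n_shuffles):
        rng.shuffle(deck)  # the deck is reshuffled in place each time
        for pos, card in enumerate(deck):
            table[card][pos] += 1
    return table

def chi_square_total(table, n_shuffles):
    """Sum (observed - expected)^2 / expected over every cell of the table."""
    expected = n_shuffles / len(table)
    return sum((obs - expected) ** 2 / expected for row in table for obs in row)

n_shuffles = 20_000  # assumption: smaller than the question's 100,000
table = position_table(n_shuffles)
stat = chi_square_total(table, n_shuffles)
dof = (len(table) - 1) ** 2  # 51 * 51 = 2601

# The constraints that use up the other cells: every row and every
# column of the table must total n_shuffles.
row_sums = [sum(row) for row in table]
col_sums = [sum(col) for col in zip(*table)]
print("chi-square total:", round(stat, 1), " dof:", dof)
print("all rows sum to N:", all(s == n_shuffles for s in row_sums))
print("all cols sum to N:", all(s == n_shuffles for s in col_sums))
```

One caveat worth checking for yourself: each cell count on its own is
Binomial(N, 1/52), so each cell contributes 51/52 to the total on average
and the expected total is 52*51 = 2652, a little above 2601. Treat the
comparison against chi-square with 2601 df as an approximation.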

> Now my doubt is :
> On row 1, we're creating the statistics for a single card - namely
> the Ace of clubs. When we come to the second column of that row,
> the first card has already been dealt, and we have only 51 cards left.

This is true, but only if you persist in treating the 1st row
as a bunch of independent RVs, and the 2nd row as being a bunch
of CONDITIONAL RV's, conditional on the values of the first, etc.

But this is artificially singling out one row (and in fact all
rows, according to your permutation order, 1,2,3,4... in this case).
There is no need to do this. You CAN do it but it just makes
the whole thing hopelessly intractable. Best to treat all the rows
as "exchangeable", (not quite independent), so they are all on
the same footing. This is standard. The physical order you
did things in is quite irrelevant to anything statistical.

> So shouldn't the expectancy be (100000-1992)/51 instead of 100000/52 ?

So, no.
They are exchangeable, so all have the same distribution, mean etc.
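The claim that every position has the same mean can be checked with exact
arithmetic. The sketch below (the function name prob_at_position is made
up for illustration) walks down the deck one position at a time, using
exactly the "one fewer card left" reasoning from the question, and still
ends up with 1/52 at every position.

```python
from fractions import Fraction

def prob_at_position(k, n_cards=52):
    """Exact probability that one given card sits at position k (1-based)."""
    p_not_earlier = Fraction(1)
    for j in range(1, k):
        # Given the card was not in positions 1..j-1, there are
        # n_cards - j + 1 cards left and it misses position j
        # with probability (n_cards - j) / (n_cards - j + 1).
        p_not_earlier *= Fraction(n_cards - j, n_cards - j + 1)
    # Given it has not appeared yet, it is equally likely to be any
    # of the n_cards - k + 1 remaining cards.
    return p_not_earlier * Fraction(1, n_cards - k + 1)

probs = [prob_at_position(k) for k in range(1, 53)]
print(probs[0], probs[1], probs[51])  # prints: 1/52 1/52 1/52
```

So the unconditional expected count in every cell is 100,000/52. The
figure (100000-1992)/51 would only be the right expectation if you
conditioned on the observed first-column count, which the test does not do.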

> The reason this doubt arose is because I've seen the same analysis
> being performed by using only "n" cards drawn from the top of the deck
> ("n" being typically a small number like say 7 cards),
> and when we do that the degrees of freedom assumed is (n * 51).

Yes, the key thing about the degrees of freedom is as I said above.
51*n is correct here.
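A rough sketch of the "top n cards" variant, under the same assumptions
as before (random.shuffle as the shuffler, cards numbered 0..51, a
made-up function name chi_square_top_n, and fewer shuffles than the
question uses). Only the column totals are constrained now, one
constraint per column, which is where 52*n - n = 51*n comes from.

```python
import random

def chi_square_top_n(n_top, n_shuffles, n_cards=52, seed=1):
    """Tabulate only the top n_top positions and total the chi-square terms."""
    rng = random.Random(seed)
    counts = [[0] * n_top for _ in range(n_cards)]
    deck = list(range(n_cards))
    for _ in range(n_shuffles):
        rng.shuffle(deck)
        for pos in range(n_top):
            counts[deck[pos]][pos] += 1
    expected = n_shuffles / n_cards
    stat = sum((obs - expected) ** 2 / expected for row in counts for obs in row)
    # Each column must total n_shuffles; the rows are no longer constrained.
    dof = (n_cards - 1) * n_top
    return stat, dof, counts

stat, dof, counts = chi_square_top_n(n_top=7, n_shuffles=20_000)
col_sums = [sum(col) for col in zip(*counts)]
print("chi-square total:", round(stat, 1), " dof:", dof)  # dof is 51 * 7 = 357
print("column totals:", col_sums)
```

Each cell contributes 51/52 to the total on average, so over the 52*7
cells the expected total is 51*7 = 357, matching the degrees of freedom.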

> [as] the frequency values in the last column are completely predictable

Yes; but the figures for ANY column are fixed, GIVEN all the others.

It sounds like you want to "leave out" the last column from your
calculations, but again, this would be giving unwarranted significance
to some particular column. You must not do this. It seems funny,
I know, to be adding up 52 of something when only 51 are "really there"
in some sense. But it is in fact the correct thing to do, though
exactly WHY it is correct is not well explained (i.e. it is usually
glossed over!) in lecture courses and even text books.

One can get a glimpse of why, without TOO much work, if you work
out the precise theory for the case of n=2. There, though there is
only one df, one nevertheless has to add up BOTH figures (even though
they are effectively the same!), to get the exactly right answer.
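The two-cell case can be sketched numerically. The counts below are
made-up illustrative numbers (530 "successes" in 1000 trials with p = 1/2),
and two_cell_check is a hypothetical helper name. The point is that the two
chi-square terms carry the same information, yet it is their SUM that
equals the squared z-score of the binomial count, which is the quantity
that is chi-square on 1 df.

```python
import math

def two_cell_check(o1, n_trials, p=0.5):
    """Return both chi-square terms and the squared z-score for a 2-cell table."""
    o2 = n_trials - o1                      # second cell is fixed by the first
    e1, e2 = n_trials * p, n_trials * (1 - p)
    term1 = (o1 - e1) ** 2 / e1
    term2 = (o2 - e2) ** 2 / e2
    z = (o1 - e1) / math.sqrt(n_trials * p * (1 - p))
    return term1, term2, z * z

# Assumed example data, not from any real experiment.
t1, t2, z_squared = two_cell_check(o1=530, n_trials=1000)
print("term 1:", t1, " term 2:", t2)
print("sum of both terms:", t1 + t2, " z squared:", z_squared)
```

Using only one of the two terms would give half the right value here,
even though there is only one degree of freedom.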

HTH.
-------------------------------------------------------------------------
Bill Taylor W.Taylor@xxxxxxxxxxxxxxxxxxxxx
-------------------------------------------------------------------------
Yes it may be easy to lie with statistics,
but it's easier still to lie without them!
-------------------------------------------------------------------------
