Re: Computing Chi-square stats on Card deck
- From: Mack <macckone@xxxxxxxxxxxxxxxxxxxxxx>
- Date: Tue, 17 Jan 2006 16:48:35 GMT
On 16 Jan 2006 05:27:06 -0800, bkk.ngroup@xxxxxxxxx wrote:
>Hi,
>
>I'm trying to determine the randomness of a particular PRNG
>and shuffling technique (on a deck of cards) using Chi-Squared
>tests. A bit of confusion regarding the computation methodology -
>The following was the procedure used by a colleague:
>
>Perform 100,000 shuffles, and check the sequence of cards
>in each. A table is created as follows :
>
>Columns correspond to the position of the card from the top of deck,
>Rows correspond to Cards. The frequency of occurrence of each card
>in each position is tabulated as :
>
>Position -> 1 2 3 4 ...... 52
>A-Clubs 1992 1925 1966 1924 ..... 1941
>2-Clubs 1918 1916 ...........................
>3-Clubs 1849 1973 ...........................
>......
>K-Spades 1912 1974 ..........................
>
>Expected value of each cell = 100,000/52
>
>Chi-square is calculated for each cell as
>(cellvalue-exp.value)^2 / exp. value
>The values in all cells are totalled, and the Chi square
>statistic is calculated on the total with degrees of freedom
>as 51 * 51 (51 being No. of rows-1, and 51 being No. of columns -1)
>
>Now my doubt is :
>
>On row 1, we're creating the statistics for a single card - namely the
>Ace of clubs. When we come to the second column of that row, the first
>card has already been dealt, and we have only 51 cards left. So shouldn't
>the expectancy be (100000-1992)/51 instead of 100000/52 ? (And similarly
>for the 3rd column (100000-3917)/50 and so on ?).
>
>The reason this doubt arose is because I've seen the same analysis being
>performed by using only "n" cards drawn from the top of the deck ("n" being
>typically a small number like say 7 cards), and when we do that the degrees
>of freedom assumed is (n * 51). So as "n" is increased, at what point do we
>shift the column multiplier for the degrees of freedom from "n" to "n-1" ?
>(When all 52 cards are used, the column multiplier has to be obviously 51
>because the frequency values in the last column are completely predictable).
>
>If you use the modified expectancy figures as shown in the previous para, the
>chi-square values in the last column are anyway 0, so column 52 can be omitted
>altogether which seems to justify the use of the value 51 for the column
>multiplier.
>
>Could anyone knowledgeable enough clarify which is the correct way in which
>the statistic should be computed in this case ?
>
>Thanks in Advance
>
>BK
You are trying to analyze a correlated sequence as an uncorrelated
group of card positions. Rather than taking the whole deck, you would
do better to sample the cards. That is, take only the first card, or
cut the deck and take the cut card. Calculating the statistics for
the method you are using is going to take a much more involved
analysis than I have the time to complete.
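
As a rough illustration of the single-card sampling idea, here is a
minimal Python sketch that looks only at the top card of each shuffle.
With one card per shuffle, the 52 counts form an ordinary multinomial
sample, so the usual chi-square with 52 - 1 = 51 degrees of freedom
applies. The names top_card_chi_square and reference_shuffle are made
up for this example, Python's random.shuffle merely stands in for the
PRNG and shuffle actually under test, and starting every shuffle from a
freshly ordered deck is an assumption.

```python
import random
from collections import Counter

DECK_SIZE = 52


def top_card_chi_square(shuffle, n_shuffles, seed=None):
    """Return (chi-square statistic, degrees of freedom) for the top card.

    `shuffle` is a hypothetical callable taking (deck, rng) and
    shuffling `deck` in place; plug in the method under test here.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_shuffles):
        # Assumption: every shuffle starts from a freshly ordered deck.
        deck = list(range(DECK_SIZE))
        shuffle(deck, rng)
        counts[deck[0]] += 1  # record only the top card

    expected = n_shuffles / DECK_SIZE
    stat = sum((counts[card] - expected) ** 2 / expected
               for card in range(DECK_SIZE))
    return stat, DECK_SIZE - 1


def reference_shuffle(deck, rng):
    # Stand-in for the shuffle being tested (Fisher-Yates via the stdlib).
    rng.shuffle(deck)


if __name__ == "__main__":
    stat, dof = top_card_chi_square(reference_shuffle, 100_000, seed=1)
    print(f"chi-square = {stat:.2f} on {dof} degrees of freedom")
```

For a sound shuffle the statistic should land near 51, give or take
about 10 (a chi-square with 51 degrees of freedom has mean 51 and
variance 102). Values far above that point to a biased top card.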
Mack
Leslie 'Mack' McBride
remove text between _ marks to respond via e-mail