Re: Testing For Randomness



On Apr 17, 10:29 pm, "mensana...@xxxxxxxxxxx" <mensana...@xxxxxxx>
wrote:
On Apr 17, 6:13?pm, bill <b92...@xxxxxxxxx> wrote:



On Apr 11, 12:18 am, "mensana...@xxxxxxxxxxx" <mensana...@xxxxxxx>
wrote:

On Apr 11, 2:30?am, "Phil Holman" <piholmanc@yourservice> wrote:

<mensana...@xxxxxxx> wrote in message

news:1176255187.262642.40400@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

On Apr 10, 9:19 pm, "Phil Holman" <piholmanc@yourservice> wrote:
"Phil Holman" <piholmanc@yourservice> wrote in message

news:4MCdnTG_cZy5vYHbnZ2dnUVZ_uiknZ2d@xxxxxxxxxxxxxx

"bill" <b92...@xxxxxxxxx> wrote in message
news:1176235254.684068.299190@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
On Apr 10, 10:27 am, Bart Goddard <goddar...@xxxxxxxxxxxx> wrote:
b92...@xxxxxxxxx wrote:
My spread*** generates supposedlyrandomnumbers between O
and 1. ? What are the best tests to determine just howrandom
these
numbers are?

http://www.random.org/analysis/

--
The man without a .sig

Are there say 5 or 6 good tests that I can run without
using specialized software.

If I am counting runs of 4 1's; how do I count "111111";
as 3 runs of 4 or as only 1 run?

Thanks,

---Bill J

Why wouldn't you count it as 1 run of 6?

Here are three good tests. ?1/ a test of significance for a
proportion
with an expected value of 50% 1s and 50% 0s. The higher the sample
number, the smaller the confidence interval. 2/ a test of
significance
for a proportion using position with an expected 50% of 1s being in
odd positions and 50% even. 3/ eyeball it and look for a pattern.

1/ ? ?.50 +/- 1.98* sqrt(.5^2/n) will give you a 95% confidence
interval
2/ ? ?use the same equation above for odd or even position
3/ ? ?look for recurring patterns or long strings of 1s or 0s

Example: for a thousand 0 or 1s, the 95% confidence interval would
be
.5 +/- 1.98*sqrt(.25/1000)
= .50 +/- .0313
this would be an interval from 469 to 531. You would expect the
same
95% confidence interval for odd and even positions of 1s.

Just thought of another good test. A chi-square goodness of fit for
the
number of consecutive 1s or 0s.
Chi-square = Summation [(O-E)^2/E].........O are the observed values
from your sample and E are the expected values.

The difficult part of this is figuring the expected values for the
different numbers of consecutive 1s or 0s.

Isn't the distribution of consecutive 1s and 0s a
geometric distribtution (twice as many groups of 1 as
groups of 2, twice as many groups of 2 as groups of 3,
twice as many groups of 3 as groups of 4, etc.)?

Isn't the expected number of trials to achieve m success
1/p, so if the probability of getting a run of 10 1s is
1024, then a block of 1024 should contain at most one run
of 10?

Can't you work that into your quickie randomness test?

one run of 10
2 runs of 9
4 runs of 8
8 runs of 7
etc
This would sum to a total much larger than 1024. It's a little more
complicated than this.

I must be mis-remembering something about that part.

Phil H

There are two possible ways of counting runs! ?Consider
a run of ?k ones.
1. This an be counted as ?one run of 'k' and ?k-1) runs
of 'zero'.
2. ?Or it can be counted as a run of k, a run of (k-1),
a run of (k-2), ... a run of 3, a run of 2 and
a run of 1.

For case 1, there should be N / (2*2^k) runs of k

For case 2, ?there should be N / (2*k) runs of k

For my work on the Collatz Conjecture, I consider it a
single run of length k. Nothing is random in Collatz, but
all I care about is whether it LOOKS random. By that I
mean does the contiguous 0 bits at the least significant
end have a geometric distribution? If yes, then I know
that there will be a mean of 2 iterations of n/2 for every
1 iteration since the mean of a geometric distribution
is the inverse of the probability.

So, if I were to create a 53328 digit number by randomly
selecting the digits

import collatz_functions as cf
import random
s = str(random.randint(1,9))
for i in xrange(53327):

s = s + str(random.randint(0,9))

and then running them through Collatz, I get

b = cf.gmpy.mpz(s)
b_stats = cf.collatz(b,0)
b_stats
[848977, 423875]
print 848977.0/423875

2.00289472132

just what I expect, twice as many even steps as odd steps.

But what if the number wasn't randomly chosen, say, the
1st 6th Generation Type [1,2] Mersenne Hailstone (which
also has 53328 digits)?

a = cf.Type12MH(6,1)
a_stat = cf.collatz(a,0)
a_stat
[1531812, 854697]
1531812.0/854697

1.7922281229488346

Here the ratio was much less than 2, so 'a' failed the
"randomness" test. As it should, for although it may look
random in decimal, it doesn't in binary - it's a Mersenne
Number (all 1's in binary, 177149 of them to be specific).
There is a geometric component to the Collatz run, but
only after a run of 177149 alternating odd/even numbers.

So considering a single run of length k allows me to "test
for randomness" (not that it is random, just whether or
not it behaves as if it were).



---Bill J

My sample size is 8000 bits. For each sample, I record the number
of runs of each length for 0 < k < 17.
How many samples should I take and what is good run lentgh
to analyze?

---Bill J

.


Quantcast