More primality testing (was: Elementary group theory: Proof of Fermat-Maas ...)



From: "*** T. Winter" <***.Win...@xxxxxx>
- The chance that p-1 has a factor of trialDivisor is expected to
be 1/trialDivisor. Likewise the chance that p has a factor of
trialDivisor is expected to be 1/trialDivisor. Likewise the
chance that p+1 has a factor of trialDivisor is expected to be
1/trialDivisor. So the value you can expect to get by a single
trial division is 3/trialDivisor, 1.5 times as much as if you
were only testing p having a factor and collecting factors of p-1.
Eh?

Which part of that don't you understand? The first part seems
obvious to me. Pick a random number pm1 in some large interval such
as [1E120,1E121), and for any small particular integer trialDivisor
there's very nearly 1/trialDivisor chance that pm1 is divisible by
trialDivisor, right? The same argument applies to p=pm1+1 and
p+1=pm1+2. These three probabilities are early independent, and for
medium-size trialDivisor (more than a thousand) the compound
probability of more than one of the three divisibilities being true
is so small that you can simply add the three probabilities to get
the probability of any of them happening.

So what's the value of doing a single triple-trial-division
(compute remainder modulo pm1, check whether that remainder is 0 or
1 or 2)? Well if you happen to get a new factor of pm1 that you can
use in Fermat-Maas or other test, that's one unit of value. Or if
you happen to get a new factor of pm1+2 (p+1) that you can use in
some other test, that's one unit of value. And if you happen to
step on a factor of p (pm1+1) itself that's of value too.

Contarywise if you do a Fermat test and get a failure, that's of
value too.

I simply equated the values, so a single triple-trial-division if
successful is three times as valuable as a Fermat test that failed
hence succeeded at proving non-primality. That's giving trial
division an immense benefit. Actually finding that the number isn't
prime is much more valuable than simply accumulating one more known
factor of p-1 or p+1. So a single triple-trial-division on the
average is worth only a small bit more than 1/trialDivisor, not
3/trialDivisor as I so generously granted you.

Now most numbers are either prime (Fermat=1 every time) or normally
composite (Fermat=/=1 almost all the time), very very few are
Carmichael or otherwise very strongly pseudo-prime, so a single
random Fermat test is all you'd ever want to do. If the first
random-Fermat test shows you likely have a prime, the number
probably really is prime, and another twenty random-Fermat tests
won't have a Sahara-snowball's chance of telling you differently,
so they'd be a waste. So when your trial divisors have gotten so
large that the expected value of one more test divided by the cost
is less than the value/cost ratio for a Fermat test, then you
should stop doing trial divisions and do that one Fermat test
immediately. If that one Fermat test shows base**(p-1)=1, then you
have two options to proceed, gather more factors of p-1 and p+1
until you can run a efficient prime-proof method, or use that one
base to try to get too may roots of something thereby proving p is
not prime. Depending on how many small factors p-1 and p+1 have
that you can find cheaply, you may choose to do the too-many-roots
test before proceeding to look for really-hard-to-find factors of
p-1 and p+1. I haven't worked out the cost/benefit comparisons for
those paths yet.

- The whole purpose of this primality-test is to filter non-primes
out from randomly generated numbers.
Wrong. It is also used to gather information about small divisors of
p-1 and p+1. Moreover, this is a generic primality proving algorithm.
Whether the number is randomly generated or not has no influence.

You obviously didn't understand what I said there. Here's a
hierarchy of your task-goals:
-1- Produce a new random large prime for some purpose such as RSA.
-2- Pick a random large number, test whether it's prime or not,
repeat until you get a prime, i.e. filter primes from
non-primes among the stream of randomly-generated numbers.
-3- Conduct trial divisions, Fermat test(s), too-many-roots tests,
collection of small factors of p-1 ad p+1, and advaced
primality test which uses those small factors.
Alternately:
-1- Win some prize.
-2- Test one or published numbers as to whether they are prime or
not, i.e. filter primes from non-primes among the set of of
published numbers.
-3- Same as before.

There's no reason on Earth to perform -3- except in support of -2-.
The whole purpose, as I said, of little parts of -3-, such as
collecting factors of p-1 and p+1, is in support of -2-, proving
whether each p is prime or not, thereby filtering the primes from
the non-primes. I am not wrong there!!

If a number is composite,
you want to eliminate it as soon as you possibly can so you can
get on with testing other numbers. You don't want to waste
unnecessary time doing worthless trial division on a non-prime
when a single Fermat test would have shown it non-prime faster.
But you do not know whether a single Fermat test would show that.

That's right, which is why after a single Fermat test succeeds you
then proceed to a more definitive method. Given that random
non-prime numbers have very high chance of single Fermat test
showing non-prime, the remaining numbers (which show Fermat=1) are
now very very likely to be prime, so you can shift your strategy
toward one which is biassed toward proof of primality rather than
biassed toward stumbling upon a small factor as trial division
does.

And as you have use of the small divisors of both p-1 and p+1
later, it comes in pretty usefil.

Maybe you do, and maybe you don't. If both p-1 and p+1 actually do
have several small divisors, then you luck out. But if they don't,
all that effort was wasted. You're no closer to proving primality
or non-primality than Edison was toward a practical light bulb when
he eliminated one possible material for use as light-bulb filament.

Before you do that one Fermat test, the chance of such a test
showing conclusively that your number is not prime is basically the
density of composite numbers among odd numbers in the ballpark of
integers you're exploring. That chance is considerably greater than
50% for large numbers such as 120 digits. By comparison, the chance
a single trial division will tell you *anything* useful (other than
one more idea didn't work, one more filament burns out too
quickly), gets infinitesimally small as the trial divisor gets
arbitrarily large.

The program is not used to test whether the primes are suitable
in some system; it is used to prove whether a number is prime or
not.

That's what I said in most of my message, filtering primes from
non-primes out of a stream/set that is a mixture, however such
stream/set was generated doesn't matter.

It was specifically designed to prove whether some (large)
numbers in the Cunningham tables for which it was not known
whether they were composite or not, to give a proof of either.

Gabriel Cunningham who posted to sci.math recently, or somebody else?
According to Google Groups, the last use of the phrase
"Cunningham tables" prior to your use here was way back in mid-2004.
That apparently dealt with factors of cyclotomic numbers.
Is that what you're talking about? Are those tables accessible online?
Back in late 2003 you yourself cited them as "a list of factors of 2^n, etc.".
Is that correct?
In early 2000 there was a link to:
<http://homes.cerias.purdue.edu/~ssw/cun/index.html>
which dealt exclusively with a**n +/- 1 for a in [2,12] and n large odd.
Is that what you're talking about?
Those are special numbers which in general are easier to factor
than random numbers of similar size.

After you've done a single Fermat test with a random base in the
interval [2,n-2], and gotten it satisfied, you might as well
immediately proceed to the too-many-roots test and the
remaining-factor-of-(p-1)-primality test, in whichever order you
want.
And if the remaining-factor-of-(p-1)- is not prime?

Here's a neat thing I realized just a couple or so days ago (while
your article was in my queue for eventual response):
It's mostly the small factors of p-1 that are useful for finding
too many roots.

First of all, for a sound mathematical reason: The equation
base**q=c is expected to have q roots in a field, so you need to
find q+1 roots before you know for sure you're not in a field hence
the modulus isn't prime. So for large q, it's just a waste of time
to look for too many qth roots.

Secondly for a pragmatic reason I haven't fully explored: Some
small primes are useful and some aren't, while hardly ever is a
large prime useful, in generating base1**q - base2**q = 0 (mod p)
thereby giving gcd(base1-base2, p) = nontrivial factor.

So a partial factorization of p-1, just up to trial divisors of 100
or so, is quite enough to get nearly all the value out of the
too-many-roots method.

When doing my generation of a random prime of 120 digits, I
scanned over 294 candidates.

When you say "scanned", I assume mean classify, for example some
were obviously non-prime by one test or another, the very last one
was provably prime, and the rest fell into other categories, such
as probably prime but too hard to prove prime?

In 256 cases the number did not survive trial division, the
largest divisor found was 55717. The time needed for this one was
0.048 seconds. A pseudo-prime test uses about 0.046

By "pseudo-prime test" you mean a single random-base Fermat test?
Or something more complicated? Or something less complicated (such
as Fermat with fixed base such as 2, which I consider inferior to
random base)?

So you're saying for that one case that consumed a lot of time
doing trial division, it wouldn't have helped to do a Fermat test
instead, or whatever you mean by "pseudo-prime test", because a
"pseudo-prime test" would have taken essentially the same length of
time to achieve the same result, category=nonprime?

Also you're saying that for all the other cases where trial
division found non-primeness, that method was significantly faster
than a "pseudo-prime test" would have been, so you made the right
choice running trial division instead?

But in the cases where trial division did *not* discover
non-primeness, that's where you might have wasted time doing not
just the divisions to 55717 but probably quite a bit further before
you finally gave up that process? After all, you could't know at
the start of the trial division whether 55717 would have been
enough, or perhaps just a bit further would have been enough. It's
like a gambler betting a long-shot repeatedly. He has no idea when
he'll finally win, but any time he stops betting he fears that next
bet he would have made was the one that would have won the jackpot,
so he's afraid to ever stop gambling. Furthermore, for trial
division, the payoff gets worse and worse as you keep gambling.

You need to balance the 255 cases where trial division produced
non-prime result in much less time than a single "pseudo-prime
test" would have, against the other cases where trial division
*never* produced any definitive result despite considerably more
investment than a single "pseudo-prime test", where a single
"pseudo-prime test" would have cliched the matter already.

The complete set of primes upto 100,000 takes about 0.080
seconds.

Which is more than twice the cost of a single "pseudo-prime test".
How many candidates went all that way, to no avail, then
immediately tested non-prime when you finally switched to the
"pseudo-prime test"?

One candidate did also survive the pseudo prime test.

The one candidate you finally proved was prime, or another earlier
that you couldn't prove prime so you set aside, where the one you
really did prove prime doesn't count as a "candidate" in the above
sentence? Obviously the one you finally proved prime *did* pass the
"pseudo prime test". I'm just not sure whether that's the
"candidate" you are referring to or not in the above sentence.

In that case the remaining factor was:
28988632716339269019731982692336680940408646961327424290384644373248\
10095465363511659807695206555579326234594280889

Remaining factor of what??? You have these candidates for a prime.
You attempt lots of trial division on them. Not one of the trial
divisors works, so you think the number might really be prime.
There's no remaining factor involved. The whole number p is still
unresolved.

Or do you mean p-1 or p+1, which had lots of small factors pulled
out along the way, leaving one remaining factor each? That's two
(2) remaining factors, one for p-1 and another for p+1. So what
does the phrase "the remaining factor" mean???

I do know it is not prime,

Yeah, I checked it, it fails the random-Fermat test.

but I do not know factors at all.

You're talking about this being a factor of p-1 or p+1 ??
What is the p you're testing at this point?

If we look again at p-1, the remaining factor is
33243844858187235114371539784789771720652118074916770975211748134458\
83137001563660160329925695591260695223158579

OK, here you are clear you're talking about trial divisors of p-1,
and after pulling them all out (up to your limit of 100,000
presumably), this is what's left, an unresolved divisor of p-1?

again not prime.

Yeah, I checked that too, fails random-Fermat too.

We continue and we go through the series of composites:
13056258290074320601041371370980194690382577203250636625250077815748\
5002631433652508064171145062888252895419

29174470244018149817938387800452992529357285287780206074589748695273\
6045258211932067253940299

40338452724160620945672877478592658134116616501100509764506278218329\
6801574809618427

27373762871988263972343266933098100494795071228801949773764393453563\
69246777

80482074786671742403922454443758320516759348658171461568092745453237

Hmm, those are getting progressively shorter. I thought maybe you
had extended trial division to larger divisors, and those were the
unresolved quotients after each new divisor was pulled out. But I
checked, and none of those is a factor of the previous, in fact
each is relatively prime to its predecessor. So I have no idea
where you're getting those progressively smaller numbers from.

and finally the prime
1076366484602147092546975531532636822831532507598719595143807113

Indeed that passes random-Fermat test. But it's nowhere near 120
digits, so I don't see why you care about proving it's prime.

(setq p 1076366484602147092546975531532636822831532507598719595143807113)
(qf (+ -1 p) 100000)
(2 2 2 7 17 79 2341 5209 1173654736113133450139465346242456488234770875081581)
1173654736113133450139465346242456488234770875081581 fails Fermat test.
(qf (+ 1 p) 100000)
(2 3 3 11 37 146924172072365150497812657866862793179297366584591809328939)
146924172072365150497812657866862793179297366584591809328939 passes, probably prime!
(setq p 146924172072365150497812657866862793179297366584591809328939)
(qf (+ -1 p) 100000)
(2 593 123882101241454595697987063968687009426051742482792419333)
123882101241454595697987063968687009426051742482792419333 fails Fermat test.
(qf (+ 1 p) 100000)
(2 2 3 5 23 31 503 6827858100223583719646248635669052221467704599174091)
6827858100223583719646248635669052221467704599174091 fails Fermat test.

Summary, for p=1076366484602147092546975531532636822831532507598719595143807113,
we definitely don't have a complete factorization of p-1, but we
probably do have a complete factorization of p+1, but can't prove
it's complete so can't use it in proof that p is prime, and
together the two partial factorizations aren't enough to run the
super-dooper prime-proof, right? So I guess you just have to chuck
that likely-prime as unprovable per present technology (although a
massive effort might crack the factoring of one of those
Fermat-failers just as it cracked several RSA challenges, if
anybody cared enough to donate the CPU power).

But try to prove it prime. The original number is:
1234567890123456789012345678901234567890123456789012345678901234567890\
12345678901234567890123456789012345678901234500733

OK, now here you *are* starting from a fixed point and working
upward until you get a provable prime.
(setq p 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234500733)
(setq q 1076366484602147092546975531532636822831532507598719595143807113)
;Both pass Fermat test.
(gcd (+ -1 p) q) ;1
(gcd (+ 1 p) q) ;1
I have no idea how, starting from p1, doing trial division of p p-1
and p+1, you somehow end up with q.
(qf (+ -1 p) 100000)
(2 2 3 3 7 13 13
2898863271633926901973198269233668094040864696132742429038464437324810095465363511659807695206555579326234594280889)
;Last factor fails Fermat hence unresolved composite
(qf (+ 1 p) 100000)
(2 19 269 317 823 16567
2794319946897990101863451419666952955246101565076275439684540697130008099656241627832465078495975525457501)
;Last factor fails Fermat hence unresolved composite
(qf (+ -1 p) 1000000)
(2 2 3 3 7 13 13 148403
19533724194483446439581398416700929860183855421606991968076551264629489265482257849637862409833733680088910563)
;Spent a half minute extra, got one more factor out, but still unresolved.
(qf (+ 1 p) 1000000)
;Spent another half minute there, totally wasted, no new factor.


If no such nontrivial factor turns up, the you have
high confidence the number really is prime and doing a lot more
trial division to find ever more factors of p-1 and p+1 is almost
certain to be not a waste. But after a while it's probably better
to attack min((p-1)/allKnownSmallFactorsOf(p-1),
(p+1)/allKnownSmallFactorsOf(p+1))
with a quadradic sieve instead of more trial division.
The quadratic sieve is an extremely expensive way to prove a
number prime.

Your reading comprehension is awful! Let me translate to proof of
primality for dummies: You have a number p which passes both trial
division to some horrendous value and lots of random Fermat tests,
so you are really really sure it's prime. But you want to *prove*
it's prime using the Fermat-Maas test. So you need to factor p-1
completely. (or factor p+1 completely and use some other test that
works that way). So you have already pulled out lots of small
factors from both p-1 and p+1, leaving an unresolved composite of
each (that last quotient fails Fermat test so you *know* it can't
be prime).

So how do you break that last unresolved composite factor of p-1 or
p+1 into prime components? First, to make the problem as easy as
possible, you pick the smallest of the two, i.e. you pick
min((p-1)/allKnownSmallFactorsOf(p-1),
(p+1)/allKnownSmallFactorsOf(p+1))
OK fine, you have the smallest of those two unresolved known-composite
factors of p-1 or p+1. How do you completely factorize it? You already
know it's composite. This isn't a test for prime/composite. This
is a need to complete factorize that number. Answer: Quadradic sieve
(because I have it already programmed, I haven't done ellyptic curves
or general number field yet).

But the above large number takes only about 9 seconds to prove prime on
this fairly slow Sun. So why should I punt?

I didn't say to punt if you *can* prove it prime in such a short time.
I said to punt if neither p-1 nor p+1 can be resolved enough to
prove p prime in a reasonable amount of time.

So you and I
never accidently give customers the same primes where they might
know each other's secret key.
You should not give your customers two primes. You should give your
customer the product of two primes and two exponents, one of them
must be secret.

If you give your customers the product of two primes, and two
complementary primes, i.e. (a**r)**s =a mod p*q for all a, then
it's trivial to compute p and/or q, so you *have* in effect given
them both p and q.

If your customer leaks out the secret key, or computes p and/or q
and leaks that out, your customer is stupid.

If your customer asks for p*q such that p is the company's phone
number and several other well-known numbers, you should refuse to
take the order because that secret is sure to leak out, after which
the enemy will be able to try all combinations of such well-known
numbers to check which sequence of them was actually used.

If your customer asks for p*q such that public p*q is in fact those
public phone numbers, that's not a problem at all, providing that p
is random and q is then dependent on p but individually random if
you don't know p.
.


Loading