Re: ISO prime factorization algorithm for n <= 10^10



In article <874q92vbed.fsf@xxxxxxxxxxxxxxxxxxxx> Phil Carmody <thefatphil_demunged@xxxxxxxxxxx> writes:
> "*** T. Winter" <***.Winter@xxxxxx> writes:
....
> > Such things are extremely processor and compiler dependent. On a
> > CDC Cyber (which I used as reference) in general a loop that would
> > fit in 8 words of instructions would outperform a loop that would
> > fit in 9 words of instructions by a factor of about 60.
>
> Woh! I bow to your superior knowledge of once-relatively-common,
> now-bizarre architectures!

Perhaps, but at that time it was about the fastest machine available
where you could do such things.

> Depending on the instruction set,
> it's possible that the Rho inner would fit in 8 words of instructions.
> It really is tremendously simple.

I think not. The number of instruction to only calculate
x_n+1 = (x_n)^2 + a (mod n)
would already nearly kill it, I think. Because the machine had no
integer divide, and the integer multiply had only limited usability,
it is best to keep x_n, a and n in f-p. I think (yes, I tried it out)
I need 10 instructions to calculate (x_n)^2 + a. That is already 2.5
words of instruction, and that yields a double precision number.
I think that I need quite a bit more than 10 instructions to get the
(mod n) part. (However, off-hand I do not see an easy way to do it.)
Moreover we need two half-words for jumps (one forward if we detect a
x_n loop, one backward to the start of the loop if we do not find one).
So we have already over 6 words of instructions without actual x_n loop
testing and other bookkeeping...

There was a good reason that, when I helped Henri Cohen and Arjen Lenstra
with their first implementation of APR-CL, I implemented long integer
arithmetic with only 23 bits per word on a machine that had nominally
48 bit integers. And similar with a later implementation for the Cray-1.
On the other hand, I have implemented the same on some 40 other
architectures, and each one had its pecularities, and required special
care.

But my whole point was that it is architecture, compiler and implementation
method dependent what works best. And, in general, for smallish numbers
(like 15 digits), a simple method can very well out-perform a more
involved method. Another point to keep in mind is, how much time do you
spend on the implementation, what does it give you in computer time, how
many times will it actually be used?
--
*** t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~***/
.


Quantcast