Re: SF: Back to theory
From: Nora Baron (norabaron_at_hotmail.com)
Date: 02/27/05
- Next message: Nora Baron: "Re: SF: Back to theory"
- Previous message: tamiry: "Re: Inverse Matrices over Z(P)(nxn)"
- In reply to: Tim Peters: "Re: SF: Back to theory"
- Next in thread: jstevh_at_msn.com: "Re: SF: Back to theory"
- Reply: jstevh_at_msn.com: "Re: SF: Back to theory"
- Reply: Proginoskes: "Re: SF: Back to theory"
- Messages sorted by: [ date ] [ thread ]
Date: 27 Feb 2005 06:21:53 -0800
Tim Peters wrote:
> [Nora Baron, to JSH]
> >>> You have made this so much more complicated than it needs to
> >>> be. Here is a simpler approach that may accomplish much the
> >>> same and it is also much more general.
> >>>
> >>> Assume M is the number to be factored. Pick a (small)
> >>> integer j. Let T = M - j. Thus T is a function of both
> >>> M and j.
> >>>
> >>> Factor T. Assume you have split it into two factors,
> >>> f and g. Thus T = f*g.
>
> [Tim Peters]
> >> As for James's methods, the test I'll talk about below tries all
> >> possible ways of splitting T=f*g s.t. f and g are integers >= 1
> >> and f >= g. I'm skipping f < g cases since gcd(f-g, whatever) =
> >> gcd(g-f, whatever) (someone may wish to sue me over this, but in
my
> >> universe gcd always returns a non-negative result).
>
> [Nora]
> > Rest assured that at least I will not be filing a lawsuit
> > over this!
>
> Wise choice: I have never been successfully sued for choosing a
convention
> by anyone with a palindromic pseudonym, and I intend to die with that
> flawless record intact.
>
I shall have to discuss this with my friend, Walt Law.
> >>> Now let X be some rational function of f and g. One
> >>> possible choice might be,
> >>>
> >>> X = (f - g)/(f + g).
> >>>
> >>> Finally, let Y = M / X. Thus M = X * Y.
> >>>
> >>> Note that X and Y are both functions of the factors of
> >>> T. Also both are rational numbers. There is some chance
> >>> that the numerator of X has a factor in common with M.
>
Of course as you note below, one may as well consider only
X = f - g in the first place. A probably-better variant is
discussed below ...
[snip]
> >> Of course I agree, but I'm almost as insignificant as you are
these
> >> days <wink>.
>
> > Hey, you're competing with a serial liar here!
>
> Sigh. Of all JSH's peculiar online behaviors, claiming that you lie
in your
> posts is the most incomprehensible to me. The "serial" part I can
> understand if I squint, but the "lying" part doesn't work for me even
after
> exhaustingly intense suspension of disbelief coupled with a heavy
dose of
> LSD.
>
As in many other instances he has his own nonstandard definition
of 'liar'. A 'liar' is someone who disagrees with James Harris.
[snip]
>> > Yes, good points, and I agree that there are advantages to
> > using M^2 - j^2. I was trying to make things as simple as
> > possible.
>
> I hope you to stick to that, too, at least until it's better
undertood. As
> was showed later in this msg, it was tweaking to try a much larger
number of
> splittings that gave a dramatic boost to your idea in testing, not
really
> complicating the base idea.
>
> ...
>
> >> Here's "the standard test" output after picking 1000 pairs of
15-bit
> >> primes at random (I gave a precise account of what that means in
an
> >> earlier post), and using this algorithm to factor their product:
>
> I want to elaborate on that, because some results _may_ be artifacts
of the
> way primes are chosen. I pick an N-bit prime like so: pick a random
> integer i uniformly from the range 2**(N-1) <= i < 2**N, and then
find the
> smallest prime >= i. But not all primes in range are equally likely
to get
> picked -- due to the way I'm choosing them, the probability of
picking a
> particular prime p is proportional to 1 + the number of composites
> immediately preceding p. I can't imagine why that would bias the
results,
> but can't swear that it doesn't. Another thing to test, anyway.
>
> ...
>
> >> "expected random" is the expected (mean) number of gcds that would
> >> have been needed to factor the same products by a method that
> > repeatedly picks k purely at random from 1..M-1 until
> >> 1 < gcd(k, M) < M.
>
> I want to elaborate on that too, because some results may be due to
errors
> in my analysis.
>
> I view picking gcd candidates at random as independent Bernoulli
trials.
> When M=pq, there are M/p-1=q-1 numbers in 1..M-1 divisible by p,
M/q-1=p-1
> divisible by q, and none divisible by p*q. So the probability of
picking a
> k at random in 1..M-1 that reveals a factor is x=(p+q-2)/(M-1).
>
> The number of trials until the first success then follows the
geometric
> distribution, and has mean 1/x = (M-1)/(p+q-2). I round that
expression to
> the nearest integer in the code. One possible insecurity is that the
> variance is (1-x)/x^2, and that grows without bound as M increases
(so that
> x approaches 0). This makes it hard to conclude anything from a
single
> trial, which is why I try to use 1000 (when there's enough time).
>
> ...
>
> >> I can slash the number of gcds needed, and speed it by a factor >
3,
> >> just by iterating over all the ways to split T^6 (instead of plain
T)
> >> as f*g. But that leads to a much bigger surprise!
> >> Like so:
> >>
> >> nbits 15: 1000 of 1000 factored, 100%
> >> gcds: actual 7,765,039 expected random 12,139,989 no factor 0
>
> > Hmm. That is a little surprising.
>
> It still is to me, and I'll give more evidence of better-than-random
> behavior later. Something still needs an explanation here!
>
> > The 100% factored part is NOT surprising. Here's why. Say
> > M = p * q, where p and q are prime. Let j = p - 1. Thus
> > T = M - p + 1. Hence let f = M - p + 1 and g = 1. Then
> > f - g = M - p + 1 - 1 = M - p, which factors as p*(q - 1), hence
> > gcd(f - g, M) = p.
>
> Right, and I realized your *un*tweaked method would always succeed.
It
> wasn't (still isn't ...) obvious to me whether it would still always
succeed
> after I tweaked it to look at splittings of T^6 instead of T, and the
> argument you just gave is for the untweaked case. It doesn't apply
directly
> to the T^6 case, and in fact picking j=p-1 doesn't always give a
factor in
> the T^6 case. For example, at M = 15 = 5*3, pick j = 5-1 = 4. Then
T=11.
> As you proved, when splitting over T, f=11 and g=1 does the trick:
gcd(10,
> 15) = 5. But there's no way to split 11^6 "that works": gcd(11^i -
> 11^(6-i), 15) = 15 for all i in 0..6. That's easy to see from a
small
> table:
>
> i 11^i mod 15
> -- -----------
> 0 1
> 1 11
> 2 1
> 3 11
> 4 1
> 5 11
> 6 1
>
> Since 11^i = 11^(6-i) mod 15 for i in 0..6, 15 divides f-g for all
> splittings of 11^6.
>
> > So can one do better with T = M^2 - j^2? Certainly one can do
> > no worse since M - 1 divides M^2 - 1.
>
I did some hand calculations which indicate that, with T = M - j
and T factored as f * g, letting X = f^2 - g^2 may be substantially
faster than X = f - g. Plus again it is easy to show that it
will always factor.
Here is an example:
M = 391
j = 1 --> T = 390 = 30 * 13 --> f = 30, g = 13
Thus f^2 - g^2 = 900 - 169 = = 731, and
GCD(731, M) = 17.
For small M, it seems to work (unreasonably) often with j = 1.
Not invariably, however = e.g., for M = 143, you have to go as far
as j = 3: T = 140 = 20 * 7 --> f^2 - g^2 = 351 = 13 * 27.
> Knock yourself out <wink>. I'm keener right now to account for the
> better-than-random behavior of what's already on the table. That's
not
> jaw-droppingly amazing, of course -- it's easy, e.g., to give a
factoring
> method for two-prime products that always succeeds with a single gcd
> (multiply all the primes through the square root of M, and do one gcd
with
> that). The thing that intrigues about _this_ method is that f-g
still just
> "looks like" poke-and-hope to me. If it's better than poke-and-hope,
there
> must be a reason for that.
>
Certainly your experiments look like a lot more than chance
deviations from poke-and-hope expected. Try X = f^2 - g^2 also,
if you don't mind.
> >> Congratulations! This is the first "surrogate factoring"
algorithm
> >> I've tested that performs better than random-gcd (well, it's still
> >> much slower than random-gcd, but time isn't the measure here). I
> >> suggest we christen it the Baron-Harris method <LOL>.
>
> > I was hoping to patent it. Oh well.
>
> Change the exponent to 7, and it's all yours <0.1 wink>.
>
> > Anyway, your using it with T^6 makes it Baron-Harris-Peters.
>
> No, sadly for me, James was the first to use exponent 6, so that's
his idea.
> I don't have enough imagination to try exponents other than 6 either
-- I
> have to leave breakthroughs like that to the real mathematicians.
>
> >> I don't have time to think about "why" now, but it sure was a
surprise
> >> to me -- all I had in _mind_ by moving to T^6 was to increase the
> >> quality of the pseudorandom number generator "f-g".
> >>
> >> Looks like that holds for products of 20-bit primes too (these
take a
> >> _lot_ longer to run, so I stopped after 75 -- out of time for
tonight):
> >>
> >> nbits 20: 75 of 75 factored, 100%
> >> gcds: actual 13,320,947 expected random 28,033,396 no factor
0
>
> I let "my usual" 1000-trial test run overnight, and the end result
was a
> little worse, but not much worse, than that:
>
> nbits 20: 1000 of 1000 factored, 100%
> gcds: actual 209,981,730 expected random 383,897,119 no factor
0
>
> Because of the very high variance of the "number of trials needed for
first
> success" statistic for a Bernoulli process with such low probability
of
> success per trial, I also tracked how many times the algorithm beat
the
> expected-random # of trials: 844 of 1000. That seems plain
impossible if
> it beats expected-random only by chance. In the worst case the
algorithm
> used about 800000 more gcds than expected-random, and in the best
case about
> 500000 fewer. The two median values of
>
> expected_random - actual
>
> were 226468 and 226618, and the mean was 173915. Since the mean
value of
> expected_random was 383897, that's a big difference. It all adds up
to
> strong evidence of better-than-random behavior to my eyes.
>
Yes - curious. No, wait, I mean "fascinating, wacky, cool".
> Just FYI, here's the distribution of j values (the first j at which a
factor
> was found); one "*" = 2 products with that j:
>
> 1 ******************************************************
> 2 ****************
> 3 ***********************
> 4 *************
> 5 *****************************
> 6 ****
> 7 ***********************************
> 8 *********
> 9 **********************
> 10 ******
> 11 *******************************
> 12 ****
> 13 *************************************
> 14 *****
> 15 ********
> 16 ***
> 17 ******************************
> 18 **
> 19 *******************
> 20 *****
> 21 ******
> 22 *****
> 23 ******************
> 24 *
> 25 **************
> 26 ***
> 27 ***
> 28 ****
> 29 ****************
> 31 *********
> 32 *
> 33 ***
> 34 **
> 35 **********
> 36 **
> 37 *****
> 38 ***
> 39 ****
> 40 *
> 41 ********
> 43 *****
> 44 **
> 45 *
> 46 *
> 47 ******
> 49 ***
> 50 *
> 51 *
> 52 *
> 53 ****
> 57 *
> 58 *
> 59 ***
> 61 **
> 62 *
> 63 *
> 64 *
> 65 **
> 67 **
> 71 *
> 73 *
> 83 *
> 85 *
> 87 *
> 95 *
> 127 *
>
Nice way to display the data.
> As with all the specific JSH algorithms tested too, odd j is luckier
than
> even. I think that's "just because" odd j forces T=M-j to have a
small
> factor (namely 2), and that in turn tends to increase the number of
> splittings to try.
>
[snip]
>
> > Amazing! And reassuring, regarding the sanity.
>
> Ah, you like those! OK, here are a thousand more. Heh -- kidding.
I'll
> just give the worst one from that run (in the sense of the first win
> appearing at the largest j):
>
> M = 774661 * 755087 = 584936450507
> j = 127
> T = M-j = = 2^2 5 599 997 48973
> T^6 = obvious
> f = 6629023236001399900589330376301953749216234168750
> g = 6042303984767482296320
> check for yourself that T^6 = f*g (it does)
> gcd(f-g, M) = 774661 bingo
> number of gcds tried = 1,178,023
>
> ...
>
> >> Not to mention that Baron-Harris probably always finds a factor,
>
> > See above.
Almost the identical argument works for X = f^2 - g^2.
>
> Ditto (there's no proof yet on the table that it always works when
splitting
> T^6, although it almost certainly does -- heck, a proof may even be
easy).
>
May be. I haven't looked.
> ...
>
> >> Yup. The other half is that if M _could_ be factored from a
> >> factorization of T efficiently, it's more than a little puzzling
that
> >> nobody yet has found a way to do that. Well, maybe tomorrow's
Baron-
> >> Harris-Decker-Filho method will finally crack that nut.
>
> > Baron-Harris-Peters-Decker-Filho. I'm not holding my breath.
>
> I will, but not before Decker and Filho commit to devoting their
lives to it
> too <wink>.
Conceivably we all have better things to do.
Nora B.
- Next message: Nora Baron: "Re: SF: Back to theory"
- Previous message: tamiry: "Re: Inverse Matrices over Z(P)(nxn)"
- In reply to: Tim Peters: "Re: SF: Back to theory"
- Next in thread: jstevh_at_msn.com: "Re: SF: Back to theory"
- Reply: jstevh_at_msn.com: "Re: SF: Back to theory"
- Reply: Proginoskes: "Re: SF: Back to theory"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|