Re: sparse polynomial arithmetic



On Apr 2, 3:47 am, Roman Pearce <rpear...@xxxxxxxxx> wrote:
On Apr 1, 4:45 pm, rjf <fate...@xxxxxxxxx> wrote:

It seems to be fast for polynomials with a billion terms.
Could you run tests with polynomials with, say, 20 terms, done 100
million times?

In that case it would make sense to run a different algorithm.  For
example, you could multiply out all of the terms and sort them in
memory, and add up coefficients in one pass.  For larger polynomials
you multiply blocks and merge them with a geobucket, using two dynamic
arrays for all temporary storage as described here:http://www.cecm.sfu.ca/~rpearcea/sdmp/sdmp_paper.pdf

We didn't optimize for this case, but it is interesting so here are
the times.  Maple cheats by remembering the result.  Pari has an
advantage because it uses a dense representation so there is no
sorting or monomial operations.  Intel Core2 Xeon 3.0 GHz, 64-bit,
with 16GB RAM:

Latency benchmark (dense)
f := (x+1)^20;
g := f+1;
multiply p := f*g; 10^6 times

# stan.cecm.sfu.ca              p := f*g
sdmp Apr 2008 monomial=1 word   30.940
Trip v0.99 double precision     14.024
Trip v0.99 rational numbers     41.561
Pari 2.3.3 (w/ GMP)             23.412  (dense)
Magma V2.14-7                   32.400
Singular 3-0-4                  60.940
Maple 11                         6.778  (remembered)

# sdmp input
f := sdmp(expand((x+1)^20), plex(x)):
g := sdmp(expand((x+1)^20+1), plex(x)):
infolevel[sdmp] := 0:
TIMER := time():
to 10^6 do
p := sdmp:-Multiply(f,g);
end do:
time()-TIMER;

# trip input
_modenum = NUMDBL;
f = (x+1)^20$
g = f+1$
time_s;
for n = 1 to 10^6 { p = f*g$}$
time_t;
_modenum = NUMRATMP;

# pari input
f = (x+1)^20;
g = f+1;
gettime();
for(n=1,10^6,p = f*g);
print(gettime());

# Magma input
P<x> := PolynomialRing(RationalField(),1);
f := (x+1)^20;
g := f+1;
TIMER := Cputime();
for i := 1 to 10^6 do
  p := f*g;
end for;
Cputime(TIMER);

# singular input
ring R=0,(x),lp;
poly f = (x+1)^20;
poly g = f+1;
poly p;
int TIMER = timer;
for (int i=0; i < 10^6; i=i+1) { p = f*g; }
timer-TIMER;

# Maple input
f := expand((x+1)^20):
g := f+1:
TIMER := time():
to 10^6 do
p := expand(f*g):
end do:
time()-TIMER;

Hi Roman,

since this is now about univariate polynomials I figure I try FLINT.

I ran the benchmarks on sage.math and to have an idea how the system
compares to the original I also run the Singular benchmark:

Singular: 135 [considering the time the resolution of the
clock wasn't worth fixing]
Sage+libSingular: 114.82
Sage+FLINT: 6.14

So, as you can see the interpreter overhead for Sage+libSingular is
twenty seconds smaller than Singular's interpreter. Pulling the
benchmarks down into Cython will shave off the interpreter overhead,
but that isn't something the casual user would do.

Using FLINT from Sage is less than elegant at the moment, but that
should change in the next months as a good interface is currently
being written. For the record: FLINT is univariate arithmetic only.

In detail:

Singular 3-0-4-1:

SINGULAR /
Development
A Computer Algebra System for Polynomial Computations / version
3-0-4
0<
by: G.-M. Greuel, G. Pfister, H. Schoenemann \ Nov 2007
FB Mathematik der Universitaet, D-67653 Kaiserslautern \
ring R=0,(x),lp;
poly f = (x+1)^20;
poly g = f+1;
poly p;
int TIMER = timer;
for (int i=0; i < 10^6; i=i+1) { p = f*g; }

timer-TIMER;
135
Auf Wiedersehen.

Sage 2.11 using libSingular:

sage: R.<x> = PolynomialRing(QQ,1); R
Multivariate Polynomial Ring in x over Rational Field
sage: f=(x+1)^20
sage: g=f+1
sage: def libsingular():
....: for i in xrange(10^6):
....: p=f*g
....:
sage: time libsingular()
CPU times: user 114.87 s, sys: 0.02 s, total: 114.88 s
Wall time: 114.82
sage:

Sage 2.11 using FLINT 1.0.6:

sage: from sage.libs.flint.fmpz_poly import Fmpz_poly
sage: one=Fmpz_poly([1])
sage: xp1=Fmpz_poly([1,1])
sage: f=xp1^20
sage: g=f+one
sage: def flinttimings():
....: for i in range(10^6):
....: p=f*g
....:
sage: time flinttimings()
CPU times: user 6.10 s, sys: 0.05 s, total: 6.15 s
Wall time: 6.14
sage:
.