Re: Sum of k-th powers, algorithmically

From: David Wagner (daw_at_taverner.cs.berkeley.edu)
Date: 09/21/04


Date: Tue, 21 Sep 2004 00:59:52 +0000 (UTC)

David Wagner wrote:
>Given integers n,k>0, write n as a sum of k-th powers. How quickly
>can this be done? Does anyone know of any efficient algorithms?
>I'm particularly curious whether it can be done in time polynomial in
>log n and k.

After thinking about this some more (inspired by some other posts here),
it occurred to me that my application allows me to generalize the problem
a bit. The revised problem: Given m,k>0, find a non-trivial solution to
  a_1^k + ... + a_j^k = 1 (mod m)
for a_1, ..., a_j. (A solution is trivial if it holds not only modulo m
but also in the integers.) This easier problem suffices for my application.
(Note: a term x^k can be repeated multiple times in the sum, but you are
charged for the time it takes to write out each occurrence of this term.
Hence j cannot be too large for any efficient algorithm.)

Any ideas on how to solve this efficiently? Can it be done in
poly(log m, k) time? Does working modulo m make the problem any easier?
Is there any relevant literature I ought to take a look at?

If anyone is interested, here is one approach that I've been playing
with. If f(X) is a polynomial, define the "discrete derivative"
  f'(X) = f(X+1) - f(X).
The iterated discrete derivate f^{(n)}(X) is obtained by applying this
operation n times. Assume gcd(k!, m) = 1, and let f(X) = X^k. Then
  f^{(k-1)}(X) = k! x + C(k,2) (k-1)!.
Note that the left-hand side is a linear combination of the k terms
f(X+k-1), .., f(X+1), f(X), say
  f^{(k-1)}(X) = c_{k-1} f(X+k-1) + ... + c_0 f(X).
Since this equation holds over the integers, it holds modulo m as well.
Since k! is relatively prime to m, let x = 1/k! + (k-1)/2 (mod m).
Note k! x + C(k,2) (k-1)! = 1 (mod m). Substituting X=x then yields
  c_{k-1} f(x+k-1) + ... + c_0 f(x) = 1 (mod m).
Note that this is a sum of k-th powers; counting multiplicities, the
sum has c_{k-1} + ... + c_0 terms, and 0 <= c_{k-1} + ... + c_0 < 2^{k^2},
and the coefficients can be computed in O(k^2) time. So this is almost
a poly(log m, k) time algorithm -- except that the coefficients are far
too large, so this doesn't actually work as it stands. Still, does this
give anyone any ideas?