Re: Mathematical formula to compare integer sequence



"Abstract Dissonance" <Abstract.Dissonance@xxxxxxxxxxx> wrote in message
news:12a5fg38k73sv99@xxxxxxxxxxxxxxxxxxxxx

"Dave Rusin" <rusin@xxxxxxxxxxxxxxxxxxxxx> wrote in message
news:e7sjng$7ue$1@xxxxxxxxxxxxxxxxxxxx
In article <12a343b588impc7@xxxxxxxxxxxxxxxxxx>,
Abstract Dissonance <Abstract.Dissonance@xxxxxxxxxxx> wrote:

I'm looking for a function on a subset of N that is "permutation
invariant".

e.g.,

let a = [1,4,9,2,1], b = [4,2,1,9,1], c = [4,5,9,2,1]

then f(a) = f(b) != f(c)

Two sequences of numbers can only be equivalent under a permutation if
they have the same cardinality n .


Ofcourse ;)

Two sequences of numbers of the same cardinality n are permutations
of each other iff the first n elementary symmetric functions of the
numbers are equal. That is, they must have the same sum, the same
sum-of-products-taken-two-at-a-time, ..., and the same product.
It is equivalent (in characteristic zero) to say that they have
the same sums of powers, that is, the two sums must be equal, the
two sums of the squares must be equal, ... and the sums of the n-th
powers must be equal.


I think I recall something similar when I took group theory... is that
where it comes from? Cyclotomic pops in my mind for some reason. I guess
I can go look in my book and see.

One more equivalence: the sequences are permutations of each other
iff the polynomials \prod ( X - a_i ) and \prod( X - b_i ) are
equal. You can compute these as the terms of the sequence go
streaming by, that is, you don't even have to have possession of
the whole sequence at any one time (nor indeed even know its length
when you start).

As a practical matter, these techniques can be difficult to use for
sequences of real numbers; for example the polynomial
(X-1)(X-2)...(X-20)
has integer coefficients but they are as large as 20! and so you may
not be able to store them exactly with a naive algorithm. Rounding each
to a 10-digit floating-point number gives another polynomial that
doesn't even have 20 real roots! (Maple finds real roots only near
1, 2, 3, 4, 5, and 6.8 ). Since you specified INTEGER sequences you
will have no roundoff error, but you will have to do something to
handle potentially large integers. It's probably easiest to compute
several values of the form \sum_i (a_i)^j mod p_k for
j = 1, 2, ..., n where the p_k range over a sufficiently large
set of smallish primes.

What makes for an optimal implementation will depend on things like
the sizes of n and the a_i, as well as a sense of the likelihood
that the sequences really will be permutations of each other, and
a sense of the ways in which they will "resemble" each other when
they are actually not simply rearrangements of each other.

You can if you wish combine the multiple things I have suggested you
compute into a single integer (that is, I am recommending a function
f which maps sequences to _sequences_, whereas you seem to want
f([a1, ..., a_n]) to be a _number_). Clearly such an approach is of
limited utility: if you want f(a) to be (1) an integer, (2) different
when a and b are not permutations of each other, and (3) small
enough to be worked easily, then it will only be possible to use f
on a small set of possible input sequences (e.g. 2^8 of them if f(a)
is to be a single byte).


I wasn't plan on using large sequences... n being < 20 which would be the
max(a_k) too. I was working with partitions of an integer and come up
with some algorithm to generate them but it turns out the algorithm
doesn't generate all the partitions.

e.g., suppose I want to partition 5,

5
4 + 1
3 + 2, 3 + 1 + 1
2 + 3, 2 + 2 + 1, 2 + 1 + 1 + 1
1 + 4, 1 + 3 + 1, 1 + 2 + 2, 1 + 2 + 1 + 1, 1 + 1 + 1 + 1 + 1


where basically you start from the second term and carry over a 1 from it
into the next term unless it would make the term larger which then it
carry over to the term after that.

So as you can see, in the above generation there are duplicates. Such as
1 + 4 = 4 + 1 and I wanted an easy way to remove them. Since I wrote a
program to do this I had the terms in arrays and I could just use the
built in sort method to sort them then compare term by term... but it led
me to think if there was some "mathematical" and "non-algorithmic" way.

Figuring that I could compute some type of "characteristic" for each
partition then just remove the duplicate entries with the same
characteristics.

What I was going to see is what kinda relationship there was between the
total number of partitions generated(including duplicates) by this
algorithm with the actual number of partitions(excluding duplicates).

I think I messed up the implementation of the algorithm because it didn't
generate all the partitions for n > 7. I think because I went the wrong
way when when looking for the largest element to carry from. I'll try to
fix it later and see what happens.

Thanks,
Jon


Instead of generating duplicates and then removing them, why not generate
just the canonical partitions in the first place? Let p(n,k) be the number
of partitions of n into parts of size at most k. Then the following
recurrence relation, which can be obtained by conditioning on whether a part
of size k appears, suggests an algorithm.

p(n,k) = p(n-k,k) + p(n,k-1)

You want p(n,n).


Rob Pratt


.



Relevant Pages

  • "Algorithmic Randomness, Quantum Physics, and Incompleteness"
    ... all finite sequences are to be found infinitely often and ... different members of the infinite set of random numbers. ... Just using random numbers with initial digits 1 thru 9, ... it generated by a shorter input algorithm than the length ...
    (sci.logic)
  • Re: compression - insights into infinite
    ... output sequences, your technology isn't reversible, plain and simple. ... For that, it needs an algorithm. ... Not very convenient for a compressor, ... he notice but he also remembers his promises, and I assure you, you ...
    (comp.compression)
  • Re: compression - insights into infinite
    ... output sequences, your technology isn't reversible, plain and simple. ... For that, it needs an algorithm. ... OUTPUT cell, residue, part A. The problem with this was that some ... Not very convenient for a compressor, ...
    (comp.compression)
  • Re: compression - insights into infinite
    ... output sequences, your technology isn't reversible, plain and simple. ... I'd suggest that maybe I'm measuring something erroneously or ... For that, it needs an algorithm. ... Not very convenient for a compressor, ...
    (comp.compression)
  • Re: Fasctode - Sort estimating complexity and B&V
    ... executed and measurest by all algorithm. ... execution time) for these sequences. ... be a Borland function, ... sort ever eneter in O. ...
    (borland.public.delphi.language.basm)