Re: how to compute distance metrics with multi dimensional data

From: Lou Pecora (pecora_at_anvil.nrl.navy.mil)
Date: 02/11/05


Date: Fri, 11 Feb 2005 14:57:32 -0500

In article <1108143670.835637.54270@z14g2000cwz.googlegroups.com>,
 "bluelagoon" <bluelagoontrading@hotmail.com> wrote:

> Lou,
> no, it does not seem right. ok, i'll make it step by step real simple
> i got 2d time series 1000 vectors, column one = cycle amplitude in
> points, column two = cycle duration in seconds, points are not seconds,
> ie different measurement units
> 10, 8
> 11, 5
> 2, 3
> 8, 4
> 12, 1
> 4, 24
> 9, 14
> ....
> all the way till the 1000th row t = 1000
>
> now i embed with delay = 1 and dim = 4, in pairs! i'll show 4 rows as
> an example
> (10,8)(11,5)(2,3) (8,4)
> (11,5)(2,3) (8,4) (12,1)
> (2,3) (8,4) (12,1)(4,24)
> (8,4) (12,1)(4,24)(9,14)
> ....
> all the way till the 1000th row
>
> now, how do i compute the euclidean between the rows ??? considering
> that "points" are not "seconds"
> that's all i wanted to know.
>
> thanks.

We may be converging, but I think it is to what I originally suggested
-- what you called z-score (zero mean, std=1). You're working in an 8D
space (4 delays x 2 columns). As you said, you have to compare apples
to apples, so rescale each column separately first. Once you do this,
the Euclidean metric works as usual (a(t) and b(t) are the two columns
of demeaned, rescaled data):

    vector norm =
      sqrt[a^2(t)   + b^2(t)   + a^2(t+1) + b^2(t+1) +
           a^2(t+2) + b^2(t+2) + a^2(t+3) + b^2(t+3)]

    distance between time sequential points (as below) =
      sqrt[(a(t)  -a(t+1))^2 + (b(t)  -b(t+1))^2 +
           (a(t+1)-a(t+2))^2 + (b(t+1)-b(t+2))^2 +
           (a(t+2)-a(t+3))^2 + (b(t+2)-b(t+3))^2 +
           (a(t+3)-a(t+4))^2 + (b(t+3)-b(t+4))^2]

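It generalizes to any dimension: with m columns and embedding
dimension d you work in an m*d-dimensional space and sum the squared
differences over all m*d components.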
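
Here is a minimal sketch of that recipe in Python, using only the
standard library. The function names (zscore, embed, euclidean) are
made up for illustration, and two things are assumptions rather than
anything fixed above: "std=1" is taken to mean the population standard
deviation, and the components are interleaved as a(t), b(t), a(t+1),
b(t+1), ... (any fixed ordering gives the same distance). Only the
first seven rows of the example data are used.

```python
import math
from statistics import mean, pstdev


def zscore(column):
    """Rescale one series to zero mean and unit standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)  # assumption: population std; series must not be constant
    return [(x - mu) / sigma for x in column]


def embed(columns, dim, delay=1):
    """Delay-embed equally long series into vectors of length len(columns) * dim."""
    n = len(columns[0]) - (dim - 1) * delay  # number of complete vectors
    return [
        [col[t + k * delay] for k in range(dim) for col in columns]
        for t in range(n)
    ]


def euclidean(u, v):
    """Ordinary Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# First seven rows of the example data from the quoted post.
amplitude = [10, 11, 2, 8, 12, 4, 9]   # points
duration = [8, 5, 3, 4, 1, 24, 14]     # seconds

a = zscore(amplitude)
b = zscore(duration)

vectors = embed([a, b], dim=4, delay=1)   # each vector lives in 8D
d01 = euclidean(vectors[0], vectors[1])   # distance between vectors at t and t+1
```

The rescaling is done once per column, before embedding, so every
a-component and every b-component in the 8D vectors is on the same
footing. Adding a third series is just embed([a, b, c], dim=4), which
gives 12D vectors with the same distance function.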

> +2
> ps.
> if we have 1d time series
> x1
> x2
> x3
> x4
> x5
> embedded with dim=4 and delay=1
> then we have
> x1,x2,x3,x4
> x2,x3,x4,x5
> ...
> typical formula for 1d euclidean
> between x1,x2,x3,x4 and x2,x3,x4,x5
> is = sqrt[ (x1-x2)^2+(x2-x3)^2+(x3-x4)^2+(x4-x5)^2 ]
>
> what i need is the formula for 2-d, 3-d and multi dimensional time
> series

-- Lou Pecora (my views are my own)

