Re: how to compute distance metrics with multi dimensional data

From: Lou Pecora (pecora_at_anvil.nrl.navy.mil)
Date: 02/10/05


Date: Thu, 10 Feb 2005 12:09:18 -0500

In article <1108038987.525148.3280@l41g2000cwc.googlegroups.com>,
 "bluelagoon" <bluelagoontrading@hotmail.com> wrote:

> i got a set of time series data that is 1000 rows by 2 columns.
> column 1 contains amplitude data that corresponds to column 2 duration
> data. ie
> a1,d1
> a2,d2
> ...
> a1000, d1000
>
> now, i embed both columns with dim=4 and delay=1, so i get
> a1,a2,a3,a4 d1,d2,d3,d4
> a2,a3,a4,a5 d2,d3,d4,d5
> a3,a4,a5,a6 d3,d4,d5,d6
> ...
> so in each row i have two sets of vectors A and D, each j element of A
> vector corresponds to j element of D vector, i is the row, j is the
> column.
> A1 = (a1,a2,a3,a4) and D1 = (d1,d2,d3,d4)
>
> so we get
> A1,D1
> A2,D2
> ...
> A1000, D1000
>
> the question is:
> i need to compute the distance across the column, ie between A1,D1 and
> A2,D2 or A1,D1 and A3,D3 ( don't forget that each element of a1
> corresponds to d1 etc...) ?
> Euclidean, max norm and manhattan?
>
> basically it's embedded 2-d data.
>
> also how to generalize this to 3-d and N-dim data.
>
> i would appreciate any help.
> thanks.

If your data is from sensors or other physical devices then it is not
clear that the amplitudes are meaningful unless you have carefully
calibrated each measurement. But you have to ask whether having the
time series in the units of measurement is important. Since you are
trying to get the distances you might have something in mind where you
can scale one time series to get it on a meaningful scale to compare to
the other time series. This requires knowledge of the system and
depends on what you are trying to compute. So, we need to know more
about what it is you are doing.

However, if you are trying to reconstruct an attractor from the time
series, then you probably want to "normalize" the time series by
demeaning each one and rescaling it by its standard deviation.
Otherwise you end up comparing meters to joules or whatever in your
distances. The rescaling is really just like picking different units of
the same measurement type (e.g., using feet rather than meters). It also
makes comparisons of distances to attractor size easier to think about.
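
As a rough illustration of the normalization described above, here is a
minimal Python/NumPy sketch. It assumes that the combined point for row i
is simply the concatenation of the delay vectors (A_i, D_i), which is one
common choice and not the only one. The function names zscore,
delay_embed and distance are hypothetical helpers made up for this
example, and the sketch assumes no series is constant (nonzero standard
deviation).

```python
import numpy as np


def zscore(x):
    """Demean each column and rescale it to unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)


def delay_embed(x, dim=4, delay=1):
    """Delay-embed every column of x, shape (n_samples, n_series).

    Row i of the result is the concatenation, over all series, of the
    delay vector (s[i], s[i + delay], ..., s[i + (dim - 1) * delay]).
    Output shape: (n_samples - (dim - 1) * delay, n_series * dim).
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a single series as one column
    n_vectors = x.shape[0] - (dim - 1) * delay
    if n_vectors <= 0:
        raise ValueError("time series too short for this dim and delay")
    # idx[i, k] = i + k * delay: sample index of component k of vector i
    idx = np.arange(n_vectors)[:, None] + delay * np.arange(dim)[None, :]
    # x[idx] has shape (n_vectors, dim, n_series); group by series, flatten
    return x[idx].transpose(0, 2, 1).reshape(n_vectors, -1)


def distance(u, v, norm="euclidean"):
    """Distance between two embedded points under the chosen norm."""
    diff = np.abs(np.asarray(u, dtype=float) - np.asarray(v, dtype=float))
    if norm == "euclidean":
        return float(np.sqrt(np.sum(diff ** 2)))
    if norm == "max":
        return float(diff.max())
    if norm == "manhattan":
        return float(diff.sum())
    raise ValueError("unknown norm: " + norm)


# Illustrative data only: column 0 plays the role of amplitude,
# column 1 the role of duration (deliberately on a different scale).
data = np.column_stack([np.arange(1, 11), 10 * np.arange(1, 11)])
points = delay_embed(zscore(data), dim=4, delay=1)
d12 = distance(points[0], points[1], norm="euclidean")
```

Because delay_embed loops over whatever columns it is given, the same
call covers 3 or N simultaneously measured series. Note that 1000
samples with dim=4 and delay=1 give 997 embedded vectors, not 1000.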

-- Lou Pecora (my views are my own)


