Re: Naive question about a difficult problem? N-dimensional analysis



On Wed, 17 Oct 2007 00:47:16 -0000, HAL 9000 <rubyhacker@xxxxxxxxx>
wrote:

FYI, my math background is limited to discrete math, linear algebra,
four semesters of calculus, and one of differential equations.

So excuse me if I use improper terminology or make obvious mistakes.

I have an idea that is a little hard to explain.

I have a large set of objects or entities -- up to a million, let's
say.
Let's call them A, B, C, ...

There is a concept of a "distance" between any two of these -- this is
not originally spatial however. That is, by doing a lot of
computation, I
can come up with the "distance" AB between A and B. (Think of it as
like a correlation coefficient.)

Well, if it's a type of _distance_, then first make sure the metric
axioms are satisfied:

d(A,A) = 0

d(A,B) = d(B,A)

d(A,B) + d(B,C) >= d(A,C)

Knowing the entire set of distances, I am wondering if it is possible
somehow to "back-construct" a set of coordinates in N dimensions
for each of the original entities.

No, not always.

This would turn the distance formula into a simple Pythagorean
distance --
do you see where I am going?

Yes. It's a worthwhile objective -- unfortunately, probably
unachievable.

When new objects were added to the space, I could calculate their
new coordinates based on a small sampling of existing points.
And all distances could be calculated simply instead of painfully.

No pain, no gain.

This raises several other questions in my mind.

1. My intuition tells me that some data sets require more dimensions
that others.

Yes, definitely.

If the distances were generated from a finite set of points in some
R^n, recovering a set of generating points is doable, but not trivial.
The recovered generating set will not uniquely determined, although in
some cases it will be unique up to an isometry of the containing
Euclidean space.

As a crude example: Distances [AB=3,BC=4,CA=5] can be
represented in only two dimensions

Sure, it's a 3-4-5 right triangle.

-- but [AB=3,BC=5,CA=5] cannot --

Why not?

The triangle inequality is satisfied, so there exists a 3-5-5
triangle (which can therefore be placed in R^2).

suppose it can be represented in three though??? Not certain.

An example of a set of points requiring 3 dimensions would be 4
points, A,B,C,D such that the distance between every pair of distinct
points is 1.

2. Are there some data sets that simply can't be represented?

Yes.

There does not exist a positive integer n, and 4 points A,B,C,D in R^n
such that

d(A,B) = d(B,C) = d(C,A) = 20

d(A,D) = d(B,D) = d(C,D) = 11

Is there a way to tell?

Yes, but it's not so easy.

3. Is there a way to compute the minimum number of dimensions
required?

Yes, provided it can be done, but as mentioned above it's not so easy.

4. Is there a way to reduce the number of dimensions by "tweaking" the
distances? (These are probably only accurate to 5% or so anyway.)

Sometimes, but usually not.

5. Is all this just Too Damn Hard? One of those problems like
Traveling Salesman that seems simple for very small numbers,
but is quickly intractable?

Well, it is hard, but not undoable.

The main issue for your actual application is that it's highly
unlikely that any large sample will embed in any R^n (unless in some
hidden way, the points were really derived from points in some R^n).
Hence, since in practice, for your data, the embedding will probably
never be possible, it's not worth the effort implementing an algorithm
to test for it.

quasi
.



Relevant Pages

  • Re: Naive question about a difficult problem? N-dimensional analysis
    ... four semesters of calculus, ... Well, if it's a type of _distance_, then first make sure the metric ... My intuition tells me that some data sets require more dimensions ... it's a 3-4-5 right triangle. ...
    (sci.math)
  • Naive question about a difficult problem? N-dimensional analysis
    ... FYI, my math background is limited to discrete math, linear algebra, ... four semesters of calculus, ... This would turn the distance formula into a simple Pythagorean ... My intuition tells me that some data sets require more dimensions ...
    (sci.math)
  • Re: Questions about diamond distance.
    ... diamond "declared" distance? ... DIAMOND DISTANCE a distance flight of at least 500 kilometres. ... just one-up the 300K and do a 500K triangle or out-and-return. ...
    (rec.aviation.soaring)
  • Re: Euclids postulates and non-Euclidean geometry
    ... A plane can easily move in 3 dimensions. ... on my statement about the shortest distance is not a geodesic at all. ...
    (sci.physics)
  • Re: "The Paradox of Zeno"
    ... The calculus only enabled naive realists to convince themselves they could ... target at a distance, L, and the time of flight is divided into intervals. ... Before you can travel some fraction of the distance you must first travel ...
    (sci.physics.relativity)

Loading