Re: Angles between vectors
From: Ross Clement (clemenr_at_wmin.ac.uk)
Date: 10/21/04
- Next message: Richard Ulrich: "Re: Binomial Random Variable with correlated trials?"
- Previous message: Qu: "Is it still Gaussian?"
- In reply to: Aleks Jakulin: "Re: Angles between vectors"
- Next in thread: David Jones: "Re: Angles between vectors"
- Reply: David Jones: "Re: Angles between vectors"
Date: 21 Oct 2004 09:56:31 -0700
"Aleks Jakulin" <a_jakulin@@hotmail.com> wrote in message news:<cl7tmt$3kn$1@planja.arnes.si>...
> Ross Clement wrote:
> > Hi. I'm interested in measuring angles between vectors. I'm
> > aware of the standard formula for this:
> >
> > A.B = ||A||.||B||.cos(t)
>
> It depends on what your goal is. If your goal is classification, you
> should pick such a distance measure that will help you classify
> better, e.g., http://www.stat.cmu.edu/~minka/papers/metric/ and the
> references therein. If your goal is description, you should pick a
> distance that captures the intuitive similarity best.
>
> You list several problems of metrics, and provide solutions for each
> of them. The 'Mahalanobis angle' metric will result if you transform
> the space in which A and B are vectors, and then apply your metric
> above. But deciding upon the transform comes under the "it depends" of
> my previous paragraph.
Hmmm..... I never seem to get away with not describing my application
on this group.
History: In the late 19th century and the first half of the 20th
century people working in literary analysis sometimes tried to come up
with a single numerical measure that described the style of an author.
Nobody seriously attempts to do this nowadays, but I was just *playing
around* with the idea. Please note my emphasis on playing around. I'm
not too serious on this one.
For explanatory purposes, assume that we have two measured properties
of texts (e.g. Herdan's Vm for vocabulary richness and, say, average
word length; in reality the vectors would contain a *lot* of
measures). Assume also that the data are transformed so that the
average of each of the two dimensions is 0. If we have 5 books by
author A and 5 books by author B then we might get the graph:
           ^
  B        |       B
           |  A
        A  |         B
<----------+----------->
       A   |  A
        A  |        B
           |
   B       V
In this case author B is distinct from A in that B's books typically
lie farther from the origin than A's. But, again drawing a rather
optimistic graph, it might look something like this:
             ^
       A     |
     A       |
         A   |
<------------+------------>
             |
        B    |
    B  B     |
             V
In which case, if we want to find a single number describing the
styles of authors A and B, the angle from the origin should certainly
be considered as a single number defining style. I.e. the angle (say)
45 degrees might describe A, and 315 degrees might describe B.
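A minimal sketch of the "angle from the origin" idea for the two-measure case, assuming the data are already centred on 0 as described above. The function names and the book coordinates are made up for illustration. Because angles wrap around at 360 degrees, the per-author summary here is a circular mean rather than a plain average.

```python
import math

def angle_deg(x, y):
    """Angle of the point (x, y) seen from the origin, in degrees in [0, 360]."""
    return math.degrees(math.atan2(y, x)) % 360.0

def circular_mean_deg(angles):
    """Mean direction of a list of angles in degrees, safe across the 0/360 wrap."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0

# Hypothetical centred measurements (vocabulary richness, word length),
# three books per author. These numbers are invented for the example.
books_a = [(1.0, 1.1), (0.8, 0.7), (1.2, 1.0)]
books_b = [(1.0, -0.9), (0.7, -0.8), (1.1, -1.2)]

style_a = circular_mean_deg([angle_deg(x, y) for x, y in books_a])
style_b = circular_mean_deg([angle_deg(x, y) for x, y in books_b])
print(style_a, style_b)
```

With more than two measures there is no single angle from the origin, so some reference direction or projection has to be chosen first.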
So, while I'm thinking of classification, I'm thinking about
classification after the vectors describing the books have been
reduced (by some method) to a single dimension. And no cheating like
making that a complex number or using bit patterns to include several
numbers. There are a lot of ways that vectors can be reduced to a
single dimension (distances from the origin, angle from the origin,
principal component, etc.), and as a very vague conjecture, I thought
that angles from some origin might be the best method. Not that it's
likely to work, but I thought it would be interesting to think about
it.
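The three reductions mentioned above can be sketched side by side. This is only an illustration under assumptions: the rows are taken to be already centred, the reference direction for the angle is an arbitrary choice, and the principal component is found by a simple power iteration that assumes a non-degenerate covariance matrix and a starting vector not orthogonal to the leading direction. All names and data are hypothetical.

```python
import math

def norm(v):
    """Distance of the vector v from the origin."""
    return math.sqrt(sum(x * x for x in v))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def angle_to(v, ref):
    """Unsigned angle in degrees between v and ref, from A.B = ||A||.||B||.cos(t)."""
    cos_t = dot(v, ref) / (norm(v) * norm(ref))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def first_component(rows, iters=200):
    """Leading eigenvector of the covariance of centred rows, by power iteration."""
    d = len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / len(rows) for j in range(d)]
           for i in range(d)]
    w = [1.0 / (i + 1) for i in range(d)]  # arbitrary non-zero start
    for _ in range(iters):
        w = [dot(cov[i], w) for i in range(d)]
        n = norm(w)
        w = [x / n for x in w]
    return w

# Invented, already-centred data: four books, two measures each.
rows = [(2.0, 1.0), (-2.0, -1.0), (1.0, 0.5), (-1.0, -0.5)]
pc = first_component(rows)
for r in rows:
    # One number per book under each of the three reductions.
    print(norm(r), angle_to(r, (1.0, 0.0)), dot(r, pc))
```

Note that the cosine formula only gives an angle between 0 and 180 degrees, so two books on opposite sides of the reference direction can get the same number.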
Also, I'm curious if anyone has comments on the following. Let's say
that we have a distribution, such as the normal distribution, defined
by two parameters. If I define two functions f1(x) and f2(x), then I can
define a subset of the set of possible normal distributions by N(
mu=f1(x), sigma=f2(x) ). So, even though the distributions are normal,
I can define my distribution with a single number x. If I have a lot
of complex data to describe then I could have more than one
distribution still described by a single parameter x. E.g. if I have
two properties to describe, and one (Y1) is best described by a normal
distribution and the other (Y2) by a Poisson distribution, then I
could create three functions so that a single argument x parameterises
both these distributions, e.g.:
Y1=N(mu=f1(x), sigma=f2(x)), Y2=P(mu=f3(x))
If this were possible, then I could have a single number x which
describes many features of the text of a book through distributions.
The problem of finding the single number x that describes the style
of the book would then be to search simultaneously over the values of
x and the forms of the functions f1(x), f2(x), and f3(x). Possibly I
could use Expectation-Maximisation here.
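As a rough sketch of the simpler half of this problem, suppose the forms of f1, f2 and f3 are fixed in advance and only x is unknown. Then x can be estimated for one book by maximising the joint log-likelihood of Y1 and Y2. The three functions, the data and the grid range below are all invented for illustration, and this is a plain grid search, not Expectation-Maximisation; searching over the forms of the functions as well is not attempted.

```python
import math

# Hypothetical link functions; the forms are assumptions made for this example.
def f1(x):
    return 2.0 * x            # mean of the normal distribution

def f2(x):
    return 1.0 + 0.5 * x      # standard deviation, positive for x >= 0

def f3(x):
    return math.exp(x)        # mean of the Poisson distribution

def log_likelihood(x, y1, y2):
    """Joint log-likelihood of Y1 ~ N(f1(x), f2(x)) and Y2 ~ P(f3(x))."""
    mu, sigma, lam = f1(x), f2(x), f3(x)
    ll = 0.0
    for y in y1:
        ll += (-math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
               - 0.5 * ((y - mu) / sigma) ** 2)
    for k in y2:
        ll += k * math.log(lam) - lam - math.lgamma(k + 1)
    return ll

def fit_x(y1, y2, lo=0.0, hi=3.0, steps=3001):
    """Return the grid value of x in [lo, hi] with the highest log-likelihood."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda x: log_likelihood(x, y1, y2))

# Made-up observations of the two properties for a single book.
y1 = [1.8, 2.3, 2.1, 1.6]
y2 = [3, 2, 4, 2]
x_hat = fit_x(y1, y2)
print(x_hat)
```

Because both distributions are driven by the same x, a poor choice of any one function pulls the estimate away from what the other property alone would suggest, which is part of why the forms of the functions matter so much.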
Does anyone have any pointers towards research or techniques that
might be relevant here?
Cheers,
Ross-c