Re: Angles between vectors

From: Ross Clement (clemenr_at_wmin.ac.uk)
Date: 10/21/04


Date: 21 Oct 2004 09:56:31 -0700


"Aleks Jakulin" <a_jakulin@@hotmail.com> wrote in message news:<cl7tmt$3kn$1@planja.arnes.si>...
> Ross Clement wrote:
> > Hi. I'm interested in measuring angles between vectors. I'm
> > aware of the standard formula for this:
> >
> > A.B = ||A||.||B||.cos(t)
>
> It depends on what is your goal. If your goal is classification, you
> should pick such a distance measure that will help you classify
> better, e.g., http://www.stat.cmu.edu/~minka/papers/metric/ and the
> references therein. If your goal is description, you should pick a
> distance that captures the intuitive similarity best.
>
> You list several problems of metrics, and provide solutions for each
> of them. The 'Mahalanobis angle' metric will result if you transform
> the space in which A and B are vectors, and then apply your metric
> above. But deciding upon the transform comes under the "it depends" of
> my previous paragraph.
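
For concreteness, here is a minimal sketch of the formula quoted above and of the "transform the space, then apply the metric" idea. The function names and the choice of a diagonal rescaling (dividing each dimension by a per-dimension scale such as its standard deviation) are my own assumptions for illustration; a full Mahalanobis-style transform would use the whole covariance matrix.

```python
import math

def angle_between(a, b):
    """Angle in radians between a and b, from A.B = ||A|| ||B|| cos(t)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp so rounding error cannot push the value outside acos's domain.
    cos_t = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_t)

def transformed_angle(a, b, scales):
    """Angle after dividing each dimension by a scale (assumed diagonal transform)."""
    ta = [x / s for x, s in zip(a, scales)]
    tb = [y / s for y, s in zip(b, scales)]
    return angle_between(ta, tb)
```

With scales all equal to 1 the two functions agree; unequal scales change the angle, which is exactly the "it depends" part: the transform has to be chosen for the task.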

Hmmm..... I never seem to get away with not describing my application
on this group.

History: In the late 19th century and the first half of the 20th
century people working in literary analysis sometimes tried to come up
with a single numerical measure that described the style of an author.
Nobody seriously attempts to do this nowadays, but I was just *playing
around* with the idea. Please note my emphasis on playing around. I'm
not too serious on this one.

For explanatory purposes, assume that we have two measured properties
of texts (e.g. Herdan's Vm for vocabulary richness and, say, average
word length; in reality the vectors would contain a *lot* of
measures). Assume also that the data is transformed so that the
average of each of the two dimensions is 0. If we have 5 books by
author A and 5 books by author B then we might get the graph:

            ^
        B   |       B
            | A
         A  |         B
 <----------+----------->
      A     | A
         A  |        B
            |
    B       V

In this case author B is distinguishable from A in that the distance
of B's books from the origin is typically larger than for A's. But,
again drawing a rather optimistic graph, it might look something like
this:

             ^
        A    |
     A       |
         A   |
<------------+------------>
             |
          B  |
     B B     |
             V

In which case, if we want to find a single number describing the
styles of authors A and B, the angle from the origin should certainly
be considered as a single number defining style. I.e. the angle (say)
45 degrees might describe A, and 315 degrees might describe B.
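
A rough sketch of what "the angle from the origin" could mean in code, assuming two centred measures per book and angles measured counterclockwise from the positive x axis. The book coordinates below are made up purely for illustration, and summarising an author by the circular mean of the books' angles is my own assumption, not an established stylometric method.

```python
import math

def angle_from_origin(x, y):
    """Angle of the point (x, y) in degrees, in [0, 360)."""
    return math.degrees(math.atan2(y, x)) % 360.0

def mean_direction(points):
    """Circular mean of the points' angles, in degrees in [0, 360)."""
    # Average the unit vectors rather than the raw angles, so that
    # angles either side of 0/360 do not average out to about 180.
    sx = sum(math.cos(math.atan2(y, x)) for x, y in points)
    sy = sum(math.sin(math.atan2(y, x)) for x, y in points)
    return math.degrees(math.atan2(sy, sx)) % 360.0

# Made-up coordinates for three books by each author.
books_a = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0)]
books_b = [(1.0, -1.0), (2.0, -1.5), (1.5, -2.0)]

style_a = mean_direction(books_a)
style_b = mean_direction(books_b)
```

One caveat with an angle as the single number: it wraps around, so 359 degrees and 1 degree are close in style but far apart numerically.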

So, while I'm thinking of classification, I'm thinking about
classification after the vectors describing the books have been
reduced (by some method) to a single dimension. And no cheating like
making that a complex number or using bit patterns to include several
numbers. There are a lot of ways that vectors can be reduced to a
single dimension (distances from the origin, angle from the origin,
principal component, etc.), and as a very vague conjecture, I thought
that angles from some origin might be the best method. Not that it's
likely to work, but I thought it would be interesting to think about
it.
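
To make the three reductions listed above concrete, here is a small sketch for the two-measure case. All function names are hypothetical, and the principal-component part uses the closed-form angle of the major axis of a 2x2 covariance matrix, so it only applies to two dimensions; real data with many measures would need a general eigenvector routine.

```python
import math

def distance_from_origin(p):
    """Euclidean distance of a 2-D point from the origin."""
    return math.hypot(p[0], p[1])

def angle_from_origin(p):
    """Angle of a 2-D point in degrees, in [0, 360)."""
    return math.degrees(math.atan2(p[1], p[0])) % 360.0

def first_principal_axis(points):
    """Return (unit vector of greatest variance, mean) for 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Orientation of the major axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta)), (mx, my)

def principal_component_scores(points):
    """Project each centred point onto the first principal axis."""
    (ux, uy), (mx, my) = first_principal_axis(points)
    return [(x - mx) * ux + (y - my) * uy for x, y in points]
```

Each function maps a book (or a set of books) to one number per book, so the three candidates can be compared on the same data by asking how well each single number separates the authors.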

Also, I'm curious if anyone has comments on the following. Let's say
that we have a distribution, such as the normal, defined by two
parameters. If I define two functions f1(x) and f2(x), then I can
define a subset of the set of possible normal distributions by N(
mu=f1(x), sigma=f2(x) ). So, even though the distributions are normal,
I can define my distribution with a single number x. If I have a lot
of complex data to describe then I could have more than one
distribution still described by a single parameter x. E.g. if I have
two properties to describe, and one (Y1) is best described by a normal
distribution and the other (Y2) by a Poisson distribution, then I
could create three functions so that a single argument x parameterises
both these distributions, e.g.:

Y1=N(mu=f1(x), sigma=f2(x)), Y2=P(mu=f3(x))

If this was possible, then I could have a single number x, which
describes many features of the text of a book by distributions. The
problem then of finding the single number x that describes the style
of the book would be to simultaneously search on the values of x and
the forms of the functions f1(x), f2(x), and f3(x). Possibly I could
use Expectation-Maximisation here.
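
A toy sketch of the easy half of this problem: if the forms of f1, f2 and f3 are fixed in advance, the single number x for a book can be found by maximising the joint log-likelihood. The three functions below are arbitrary choices of mine, the data are made up, and the search is a plain grid search rather than Expectation-Maximisation; searching over the forms of the functions as well is not attempted here.

```python
import math

# Hypothetical link functions; their real forms are the open question.
def f1(x):
    return 2.0 * x            # mean of the normal for Y1

def f2(x):
    return 1.0 + 0.5 * x      # std. dev. of the normal (positive for x > -2)

def f3(x):
    return math.exp(x)        # mean of the Poisson for Y2 (always positive)

def normal_logpdf(y, mu, sigma):
    return (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (y - mu) ** 2 / (2.0 * sigma ** 2))

def poisson_logpmf(k, mu):
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def log_likelihood(x, y1, y2):
    """Joint log-likelihood of Y1 ~ N(f1(x), f2(x)) and Y2 ~ P(f3(x))."""
    ll = sum(normal_logpdf(y, f1(x), f2(x)) for y in y1)
    ll += sum(poisson_logpmf(k, f3(x)) for k in y2)
    return ll

def best_x(y1, y2, grid):
    """Grid point with the highest joint log-likelihood."""
    return max(grid, key=lambda x: log_likelihood(x, y1, y2))

# Made-up observations for one book.
y1 = [1.8, 2.3, 2.1]          # continuous measure
y2 = [2, 3, 3]                # count measure
grid = [i / 100.0 for i in range(0, 301)]
x_hat = best_x(y1, y2, grid)
```

Because both measures are tied to the same x, the estimate is a compromise between what the normal part and the Poisson part would each prefer on their own.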

Does anyone have any pointers towards research or techniques that
might be relevant here?

Cheers,

Ross-c


