Re: Multiple choice algorithm problem

From: The Last Danish Pastry (clivet_at_gmail.com)
Date: 01/03/05


Date: Mon, 3 Jan 2005 13:00:19 -0000


"Peer K" <pk@webnetmail.dk> wrote in message
news:41d8079e$0$228$edfadb0f@dread11.news.tele.dk...

> The Last Danish Pastry wrote:
>
> >>Think of the candidates' and participants' answers as the
> >>co-ordinates of points in 30-dimensional space. The distance
> >>between two points/(answer sequences) is the square root of
> >>the sum of the differences of the -corresponding- individual
> >
> >
> > ... the sum of the squares of the differences ...
>
> Sorry to be a bit dense here but I don't get it. Can you elaborate a bit
> on this?
>
> I mean.. I'm looking to find a single number for each candidate that
> expresses the distance/difference to that of the useranswers. If I on
> each loop say 'distance = distance +
> sqrt(difference-between-user-and-candidate-answer). Won't I then end up
> with the same problem as in my first post which to summarize was that if
> candidate A answer 1 and 5, and the user answers 5 and 1 they will end
> up appearing to agree totally when in fact they disagree totally.
>
> (Sorry for any language-errors.. English is not my first language)

Some elaboration...

Suppose that, instead of there being 30 questions, there are just three.

Suppose that my answers to those questions are (1,2,3).

Suppose that, instead of there being 1000 candidates, there are just two -
called Abel and Baker.

Suppose Abel's answers are (4,2,3) and Baker's answers are (2,1,2).

We wish to find the candidate with views closest to mine.

By |x| we mean the absolute value of x.
  For example: |7|=7 |-3|=3 |0|=0.
By x^2 we mean x squared.
  For example: 7^2=49 (-3)^2=9 0^2=0.
By sqrt(x) we mean the square root of x.
  For example: sqrt(49)=7 sqrt(9)=3 sqrt(0)=0.

Method 1:
For each candidate we sum the absolute values of the differences between my
answers and the candidate's answers. We then choose the candidate with the
smallest sum.

Abel's sum = |1-4|+|2-2|+|3-3| = 3+0+0 = 3

Baker's sum = |1-2|+|2-1|+|3-2| = 1+1+1 = 3

So, it seems that I agree equally with Abel and Baker.
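
If it helps to see Method 1 as code, here is a minimal Python sketch. The
function name manhattan and the variable names are just my own labels for
this illustration; the answers are the ones from the example above.

```python
def manhattan(a, b):
    """Sum of the absolute differences between two equal-length answer lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

mine = (1, 2, 3)
abel = (4, 2, 3)
baker = (2, 1, 2)

print(manhattan(mine, abel))   # 3
print(manhattan(mine, baker))  # 3
```

Both candidates come out at 3, which is the tie described above.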

Method 2:
For each candidate we sum the squares of the differences between my answers
and the candidate's answers. We then take the square root of that sum. We
then choose the candidate with the smallest square root.

Abel's distance
= sqrt((1-4)^2+(2-2)^2+(3-3)^2)
= sqrt((-3)^2+0^2+0^2)
= sqrt(9+0+0)
= sqrt(9)
= 3

Baker's distance
= sqrt((1-2)^2+(2-1)^2+(3-2)^2)
= sqrt((-1)^2+1^2+1^2)
= sqrt(1+1+1)
= sqrt(3)
= 1.732...

So, it seems that I agree more with Baker than I do with Abel.
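
The same calculation as a minimal Python sketch. The function name
euclidean and the variable names are only my labels for this illustration;
math.sqrt is the standard library square root.

```python
import math

def euclidean(a, b):
    """Square root of the sum of the squared differences between two answer lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

mine = (1, 2, 3)
abel = (4, 2, 3)
baker = (2, 1, 2)

print(euclidean(mine, abel))   # 3.0
print(euclidean(mine, baker))  # about 1.732
```

Baker's figure is the smaller of the two, so Baker is the closer candidate.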

========

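Method 2 seems better. I have three small disagreements with Baker but I
have one big disagreement with Abel. Squaring each difference makes one big
disagreement count for more than several small ones, so Method 2 separates
the two candidates where Method 1 gives a tie. I think I am closer to Baker
than I am to Abel.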

In this method each set of three choices becomes a single point in a
three-dimensional space and we use Pythagoras's Theorem to find the
candidate whose point is closest to mine.

With 30 questions, rather than three, each set of 30 answers becomes a
single point in a 30-dimensional space, and Pythagoras's Theorem can still
be used.
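
As a rough sketch of how this scales up, the Python below picks the
closest candidate from any number of candidates and any number of
questions. The names euclidean and closest, and the dictionary layout, are
assumptions of mine for illustration and not anything from your program.
The last line checks the case from your question, where the user answers
(5,1) and the candidate answers (1,5).

```python
import math

def euclidean(a, b):
    """Distance between two answer lists of the same length (any number of questions)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest(user, candidates):
    """Return the name of the candidate whose answers are nearest to the user's."""
    return min(candidates, key=lambda name: euclidean(user, candidates[name]))

candidates = {"Abel": (4, 2, 3), "Baker": (2, 1, 2)}
print(closest((1, 2, 3), candidates))  # Baker

# Opposite answers do not cancel, because each difference is squared
# before it is added: (5-1)^2 + (1-5)^2 = 16 + 16 = 32.
print(euclidean((5, 1), (1, 5)))  # sqrt(32), about 5.657
```

Because every squared difference is zero or positive, two disagreements
can never cancel each other, so the distance is 0 only when every answer
matches.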

-- 
Clive Tooth
http://www.clivetooth.dk

