Re: Best formula
- From: Virgil <vmhjr2@xxxxxxxxxxx>
- Date: Fri, 23 Jun 2006 13:26:34 -0600
In article <4g2i64F1jtrohU1@xxxxxxxxxxxxxx>,
José Carlos Santos <jcsantos@xxxxxxxx> wrote:
Hi all:
My guess is that this is a standard problem. What I'd like is to
have good references for its solution.
Suppose that there's a tournament of some sport and suppose that
you have certain data about the players that may give you some clues
about their performance. Suppose for instance that, for each player,
you have:
1) his or her position at the previous tournament;
2) whether he or she is a junior or a senior player;
3) a classification concerning his or her performance at tests just
before the tournament starts.
So, for the n-th player you have three numerical values, x_n, y_n, and
z_n. You want to have a formula of the type
v_n = a_n*x_n + b_n*y_n + c_n*z_n
so that the classification of the players at the end of the tournament
gets as close as possible to their ordering using the values v_n, in the
sense that if the k-th player ends the tournament in the first place
then v_k is the highest v_n, if he ends in the second place, then v_k is
the second highest v_n, and so on.
So my problem is: how to get the coefficients a_n, b_n and c_n?
I suppose that this is a classical statistical problem. References,
anyone?
Best regards,
Jose Carlos Santos
The usual methods involve minimizing some measure of the total error of
prediction. The most common measures involve taking all the differences
between predicted values and actual values and either squaring them or
taking their absolute values (to keep positive errors and negative
errors from cancelling out when you add them) and then adding them up.
The result is called the "sum of squared errors" or the "sum of absolute
errors" as appropriate, and is to be minimized.
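To make the two measures concrete, here is a minimal sketch in Python. The
function names and the numbers are invented for illustration only; they are
not from any real tournament.

```python
def sum_squared_errors(predicted, actual):
    """Add up the squares of the differences (predicted - actual)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


def sum_absolute_errors(predicted, actual):
    """Add up the absolute values of the differences (predicted - actual)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual))


# Made-up predictions and outcomes for three players.
predicted = [1.0, 2.0, 3.0]
actual = [2.0, 2.0, 2.0]

# Adding the raw differences lets -1 and +1 cancel, hiding both errors.
raw_sum = sum(p - a for p, a in zip(predicted, actual))

sse = sum_squared_errors(predicted, actual)
sae = sum_absolute_errors(predicted, actual)
print(raw_sum, sse, sae)  # prints: 0.0 2.0 2.0
```

The raw sum comes out as zero even though two of the three predictions are
wrong, which is exactly the cancellation that squaring or taking absolute
values avoids.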
There are standard and fairly simple techniques for minimizing the sum
of squared errors for lots of situations. These processes are often
collectively called the method of "least squares", and can be searched
for under "least squares".
E.g.:
http://mathworld.wolfram.com/LeastSquaresFitting.html
http://en.wikipedia.org/wiki/Least_squares
http://www.efunda.com/math/leastsquares/leastsquares.cfm
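As a rough sketch of what such a least squares fit does for a three-term
formula like the one in the question, the Python below solves the normal
equations directly. It assumes a single set of coefficients a, b, c shared
by all players, and it assumes the "actual value" being predicted is some
numeric outcome such as points scored. The data and the function names are
made up for illustration; in practice one would normally call a library
routine such as numpy.linalg.lstsq instead of hand-rolling the solver.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left unmodified.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def least_squares_fit(rows, targets):
    """Coefficients minimizing sum((a*x + b*y + c*z - t)**2) over all players."""
    k = len(rows[0])
    # Normal equations: (X^T X) coef = X^T t.
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xtt = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(XtX, Xtt)


# Hypothetical data, one row per player:
# (previous position, 0 = junior / 1 = senior, pre-tournament test rating).
rows = [(1, 0, 3.0), (2, 1, 5.0), (3, 0, 4.0), (4, 1, 8.0), (5, 1, 6.0)]
# Hypothetical outcome to predict, e.g. points scored in the tournament.
targets = [10.0, 7.0, 6.0, 9.0, 3.0]

a, b, c = least_squares_fit(rows, targets)
predictions = [a * x + b * y + c * z for x, y, z in rows]
print(a, b, c)
```

Because the formula has no constant term, no column of ones is added to the
data; adding one would fit an intercept as well.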
The method of least absolute errors, while often more difficult to
implement, is considerably less sensitive to the effects of outlier
values in your data set.
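The difference in outlier sensitivity is easiest to see in the simplest
special case, fitting a single constant to a list of numbers: the constant
that minimizes the sum of squared errors is the mean, while a constant that
minimizes the sum of absolute errors is the median. The sketch below uses
invented scores with one deliberate outlier.

```python
from statistics import mean, median


def sum_squared_errors(c, data):
    """Sum of squared differences between the constant c and each value."""
    return sum((c - d) ** 2 for d in data)


def sum_absolute_errors(c, data):
    """Sum of absolute differences between the constant c and each value."""
    return sum(abs(c - d) for d in data)


# Made-up scores; the last value is a deliberate outlier.
scores = [10, 11, 12, 13, 100]

least_squares_constant = mean(scores)     # dragged toward the outlier
least_absolute_constant = median(scores)  # stays with the bulk of the data

print(least_squares_constant, least_absolute_constant)  # prints: 29.2 12
```

A single extreme value moves the least squares answer from about 12 to 29.2,
while the least absolute errors answer stays at 12.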
There are many variations on both of the above methods as well as lots
of others, if your results for the above prove unsatisfactory.
Since your prediction formula, v_n = a_n*x_n + b_n*y_n + c_n*z_n,
is homogeneous and linear, least squares is a good place to start.
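Since the original question is about getting the order of the players right
rather than matching particular values, it may also help to check a fitted
formula by comparing the ordering it produces with the final standings. The
sketch below is one simple way to do that, by counting pairs of players
whose relative order is predicted wrongly. The function names and the
numbers are hypothetical.

```python
def predicted_places(v):
    """Place 1 goes to the highest v, place 2 to the second highest, etc."""
    order = sorted(range(len(v)), key=lambda i: v[i], reverse=True)
    places = [0] * len(v)
    for place, player in enumerate(order, start=1):
        places[player] = place
    return places


def discordant_pairs(predicted, actual):
    """Count pairs of players ranked in opposite order by the two lists."""
    n = len(predicted)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if (predicted[i] - predicted[j]) * (actual[i] - actual[j]) < 0
    )


# Made-up values v_n for four players and their made-up final places.
v = [7.5, 9.1, 3.2, 8.0]
actual_places = [2, 1, 4, 3]

places = predicted_places(v)
mistakes = discordant_pairs(places, actual_places)
print(places, mistakes)  # prints: [3, 1, 4, 2] 1
```

A count of zero means the ordering by v_n reproduces the final standings
exactly; here the first and fourth players are swapped, giving one
misordered pair.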