Re: Calculating Standard Deviation



Thank you, Gordon and Robert. I understand this better now. I will take
another look at it when I become more familiar with the subject of
least squares as a whole.

I'll check out the Google searches that you mentioned.

The Sunburned Surveyor

Gordon Sande wrote:
On 2006-03-20 17:30:59 -0400, sunburned.surveyor@xxxxxxxxx said:

Here is the first of my statistics/least squares questions...

When calculating the standard deviation for a set of values you use the
sum of all the residuals squared.

The usual technical definition. More general notions of spread are not
called standard deviations.

Is this done to remove the tendency for residuals that result from
random errors to sum to zero?

Usually fitting is designed to make the sum of the residuals zero.
Not just tend to but be exactly zero!
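A minimal sketch of that point, assuming the "fit" is simply the arithmetic mean of a handful of made-up observations (the values and function names here are hypothetical, not from the thread). The residuals about the mean sum to zero apart from floating-point rounding, so the plain sum says nothing about spread, while the sum of squares does. The n - 1 divisor is the usual sample convention and is an assumption here.

```python
import math

def residuals_about_mean(values):
    """Residuals of each value from the arithmetic mean of the set."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def sample_std_dev(values):
    """Square root of (sum of squared residuals / (n - 1))."""
    res = residuals_about_mean(values)
    return math.sqrt(sum(r * r for r in res) / (len(values) - 1))

# Hypothetical repeated measurements of one distance.
observations = [100.02, 99.98, 100.05, 99.97, 99.98]

res = residuals_about_mean(observations)
print("sum of residuals:         ", sum(res))                  # zero up to rounding
print("sum of squared residuals: ", sum(r * r for r in res))
print("sample standard deviation:", sample_std_dev(observations))
```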

If this is correct, why is the sum of the absolute value of the
residuals not used?

It often is. Ask google about L1 norm fitting. Or Laplacian errors
instead of Gaussian errors. (Laplace is also called double exponential.)
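For the simplest case of fitting a single value to a data set, a small sketch may help show the difference. The data below are invented and the helper names are hypothetical. The mean is the value that minimizes the sum of squared residuals, and the median is a value that minimizes the sum of absolute residuals, which is why L1 fitting is less disturbed by one wild observation.

```python
import statistics

def sum_abs_residuals(values, center):
    """L1 criterion: sum of absolute residuals about a candidate value."""
    return sum(abs(v - center) for v in values)

def sum_sq_residuals(values, center):
    """L2 criterion: sum of squared residuals about a candidate value."""
    return sum((v - center) ** 2 for v in values)

# Hypothetical data with one gross blunder.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

mean = statistics.mean(data)
median = statistics.median(data)

for name, center in (("mean", mean), ("median", median)):
    print(name, center,
          "sum |r| =", sum_abs_residuals(data, center),
          "sum r^2 =", sum_sq_residuals(data, center))
```

The median stays near the bulk of the data while the mean is dragged toward the blunder.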

I should note that the square root of the sum of the residuals squared
is not the same as the sum of the absolute value of the residuals. (So
the two different methods result in different values for the standard
deviation.)

The L2 (square root of sum of squares) and L1 (sum of absolute values)
norms are close but not the same. If you make the guess that there are a whole
lot more norms of the L(whatever) sort you have been watching closely
and will go far. They often are called Lp norms as p goes from 1 to
infinity. Ask google about Minkowski norms but expect an awful lot of
very abstract functional analysis to show up.
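As a rough illustration of the Lp family, here is a sketch assuming the common definition: the p-th root of the sum of the absolute values raised to the power p, with the p = infinity case taken as the largest absolute value. The function name `lp_norm` and the sample residuals are hypothetical.

```python
def lp_norm(residuals, p):
    """Lp norm of a list of residuals, for p >= 1 or p = float('inf')."""
    if p == float("inf"):
        # Limiting case: the largest absolute residual.
        return max(abs(r) for r in residuals)
    return sum(abs(r) ** p for r in residuals) ** (1.0 / p)

# Hypothetical residuals.
residuals = [3.0, -4.0]

print("L1  :", lp_norm(residuals, 1))             # sum of absolute values
print("L2  :", lp_norm(residuals, 2))             # square root of sum of squares
print("Linf:", lp_norm(residuals, float("inf")))  # largest absolute value
```

For the same residuals the value shrinks as p grows, so the L1 and L2 figures differ.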

Is there another reason why the sum of the residuals squared is used
when computing the standard deviation, instead of the sum of the
absolute value of the residuals?

Mostly history. And ease of calculation and theory. Ask google about
robust statistics. And resistant statistics. Be prepared to see the
name of John W. Tukey come up repeatedly. L1 based fitting is becoming
more available but still lags in ease of theory and common acceptance.
Ask google for "The Gaussian Hare and the Laplacian Tortoise" for one
view.

Thanks,

The Sunburned Surveyor



