Re: standard deviation, but without the mean
- From: "David A. Heiser" <daheiser@xxxxxxx>
- Date: Tue, 7 Mar 2006 19:05:36 -0800
"Ray Koopman" <koopman@xxxxxx> wrote in message
news:1141669275.577944.112830@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
richardstartz@xxxxxxxxxxx wrote:
The standard deviation is the square root of the variance (of course).
There's a standard formula for computing the variance from a running
sum. Suppose Xsum is the sum of the first n numbers and that X2 is
the sum of their squares. Then
var = X2/n - (Xsum/n)^2
Just keep track of X2 and Xsum as you go.
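As a minimal sketch of the running-sum formula just quoted (the function name running_variance is hypothetical, and dividing by n gives the population variance, as in the formula):

```python
def running_variance(values):
    """Population variance from running sums: X2/n - (Xsum/n)^2."""
    n = 0
    xsum = 0.0  # running sum of the values (Xsum)
    x2 = 0.0    # running sum of the squared values (X2)
    for x in values:
        n += 1
        xsum += x
        x2 += x * x
    if n == 0:
        raise ValueError("need at least one value")
    # Subtracting two nearly equal quantities here can lose digits
    # when the mean is large relative to the spread of the data.
    return x2 / n - (xsum / n) ** 2
```

Only n, Xsum and X2 need to be stored, so the data can be processed in a single pass without knowing the mean in advance.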
Using running totals can sometimes lead to cancellation errors.
See http://tinyurl.com/d6ax2 for a stable algorithm.
Actually this is not a correct algorithm, just an approximation.
I've just gone through Welford's algorithm (a one pass calculation) using
xnumbers on the NIST data sets and have found Welford to obtain correct
values. The errors come in due to the general problem of summation of lists
of numbers, which no algorithm can fix. The solution of course is to do it
with as many digits as possible, so that the summation errors are not
important.
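For readers who have not seen it, the one-pass update usually attributed to Welford can be sketched as follows. This is an illustration in ordinary double precision, not the extended-precision xnumbers computation described above; the function name welford_variance is hypothetical, and it returns the population variance (divide by n - 1 instead for the sample variance).

```python
def welford_variance(values):
    """One-pass population variance using Welford-style updates."""
    n = 0
    mean = 0.0  # running mean of the values seen so far
    m2 = 0.0    # running sum of squared deviations from the mean
    for x in values:
        n += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / n         # update the running mean
        m2 += delta * (x - mean)  # old deviation times new deviation
    if n == 0:
        raise ValueError("need at least one value")
    return m2 / n
```

Because the update works with deviations from the current mean rather than with raw squares, it avoids subtracting two large, nearly equal totals at the end.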
For example, running Welford's on the NIST NumAcc4 data set (1001 values)
using a 30 digit exact computation comes out to the theoretical value, with
the error in the least significant 9 digits (21 accurate digits). Using Kahan's
method of summation reduces the error to about 7 digits.
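Kahan's method carries a small correction term alongside the running total. A minimal sketch of compensated summation on its own, in double precision, is below; how it was combined with Welford's update in the xnumbers run is not shown here, and the function name kahan_sum is hypothetical.

```python
def kahan_sum(values):
    """Compensated (Kahan) summation of a sequence of floats."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - comp            # apply the correction to the next term
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost (negated)
        total = t
    return total
```

The correction term picks up small addends that would otherwise be rounded away each time they are added to a much larger total.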
If people would make the effort to read Knuth, a lot of the misconceptions
would disappear. Welford published in Technometrics, 1962, pg 419-420. All
this is in Knuth. Apparently everybody else missed it.
David Heiser