Re: Any tricks to get variance of continuous data?
- From: Lynn Kurtz <kurtzDELETE-THIS@xxxxxxx>
- Date: Thu, 26 May 2005 00:26:27 GMT
On Thu, 26 May 2005 00:48:38 +0100, Dave <nospam@xxxxxxxxxxx> wrote:
>I have an instrument measuring the frequency of an oscillator. The exact
>frequency is printed every second. Ideally it should be 10MHz, but in
>practice it is not exactly. I want to find the mean and variance - not
>of a set of data that has been collected in the past, but to update this
>every second as a result of the data collected in the last second.
>
>
>The first few data points (in Hz) are like this, and you can see I have
>computed the mean in the second column.
>
>
>9999999.934230 mean=9999999.934230
>9999999.933640 mean=9999999.933935
>9999999.934420 mean=9999999.934097
>9999999.936180 mean=9999999.934617
>9999999.936770 mean=9999999.935048
>9999999.936770 mean=9999999.935335
>9999999.935010 mean=9999999.935289
>9999999.934620 mean=9999999.935205
>
>Computing the mean is easy, with no need to store past results. I just
>keep a total of all previous results, and divide by the number of
>results. (In fact, I modify that a bit, so I am not having to find the
>mean of a lot of very large, but almost identical numbers. I actually
>subtract 10,000,000 first, to avoid handling large numbers.)
>
>So for numerical accuracy I have something like the following in a loop,
>where a function 'new_data_point' computes the latest data.
>
>frequency = new_data_point();
>n = n + 1;
>sum = sum + (frequency - 10000000);
>mean = 10000000 + sum / n;
>
>Is there any way I can find the variance, without storing all past
>results? If I do this for about a week, there will be 600,000 data
>points. Whilst storing the data is not too much of a hassle, computing
>
>(x_n - mean)^2
>
>for all 600,000 values of n will be computationally expensive. Soon the
>data will be arriving faster than I can compute the SD.
>
>Are there any ways around this?
With u denoting the mean (1/n) sum[i=1..n] x_i, the variance (using the
1/n convention) is
s^2 = (1/n) sum[i=1..n] (x_i - u)^2
    = (1/n) sum[i=1..n] (x_i^2 - 2 x_i u + u^2)
    = (1/n) sum[i=1..n] x_i^2 - 2u (1/n) sum[i=1..n] x_i + u^2
    = (1/n) sum[i=1..n] x_i^2 - 2 u^2 + u^2
    = (1/n) sum[i=1..n] x_i^2 - u^2
So you can just keep a running sum of the squares x_i^2, like you do
for the x_i, and get the variance at any time from the two sums and n.
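
As a rough sketch of how that might look in code (Python here, since the
original loop does not name a language): the class name RunningStats and
its offset parameter are made up for illustration, and the offset of
10,000,000 simply mirrors the subtraction already described in the
question. Subtracting a constant does not change the variance, and it
keeps the squared terms small, which should help with cancellation when
the readings are large and nearly identical. This uses the 1/n
convention from the derivation; multiplying the result by n/(n-1) would
give the 1/(n-1) version instead.

```python
class RunningStats:
    """Running mean and variance from two running sums (1/n convention).

    Hypothetical helper for illustration only; 'offset' is an assumed
    nominal value subtracted from every reading before accumulating.
    """

    def __init__(self, offset=10_000_000.0):
        self.offset = offset
        self.n = 0           # number of readings seen so far
        self.total = 0.0     # running sum of (x - offset)
        self.total_sq = 0.0  # running sum of (x - offset)^2

    def update(self, x):
        d = x - self.offset
        self.n += 1
        self.total += d
        self.total_sq += d * d

    @property
    def mean(self):
        return self.offset + self.total / self.n

    @property
    def variance(self):
        # (1/n) sum d_i^2 - (mean of d)^2; the shift by 'offset' cancels out
        m = self.total / self.n
        return self.total_sq / self.n - m * m


# The eight readings quoted in the question, in Hz
readings = [
    9999999.934230, 9999999.933640, 9999999.934420, 9999999.936180,
    9999999.936770, 9999999.936770, 9999999.935010, 9999999.934620,
]

stats = RunningStats()
for x in readings:
    stats.update(x)
    print(f"{x:.6f} mean={stats.mean:.6f} var={stats.variance:.3e}")
```

Each update costs a couple of additions and one multiplication, and
nothing has to be stored beyond n and the two sums.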
--Lynn