averaging noisy data (was: Re: Spacecraft earth-flyby data reveals dynamical preferred frame)
- From: "Jonathan Thornburg [remove -animal to reply]" <jthorn@xxxxxxxxxxxxxxxxxxxxxxx>
- Date: Tue, 25 Aug 2009 13:32:39 +0000 (UTC)
Surfer <no@xxxxxxxx> wrote:
since the
filtered data contains much less noise than the raw data, the error
bars are now very much smaller.
In a moderator's note commenting on this, I wrote
[[Mod. note -- That's only true if the errors in different data points
were statistically independent. (There are other requirements, too,
but independence is the most-often-violated one.) In contrast, if
(to take an extreme example) the errors were perfectly correlated,
i.e., every point were in error by the same amount, then the filtering
would have no effect on that error.
-- jt]]
I'd like to expand on this a bit, because this issue comes up a
lot in scientific research.
[I should add that nothing I say in this post is in any
way original. In fact, everything I say here is well-known
to large subsets of the scientific community. But I have
also seen prominent scientists apparently in ignorance of
some of what I say here, so I think it bears repeating.]
To explain this in a general context, let's consider a simple gedanken
problem: Suppose we're trying to measure an unknown real number x ,
and we have N measurements x_1, x_2, x_3, ..., x_N for some large N,
all made in the same way (presumably with the same error properties),
i.e., let's suppose that each x_i is randomly sampled from the same
(unknown) probability distribution. How can we combine these N
measurements to better estimate x ?
The obvious thing to do is to average our N measurements by defining
xbar := (x_1 + x_2 + x_3 + ... + x_N)/N
What can we say about the probability distribution of xbar?
We'd like to say that the variance of xbar's distribution is about
1/N times the variance of each x_i's distribution, i.e., we'd like
to say that by averaging N measurements, we've reduced our noise
(standard deviation) by about a factor of sqrt(N).
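For reference, here is the short calculation behind that hoped-for
factor. It is only a sketch under two assumptions: that the errors in
different x_i are uncorrelated, and that each x_i has a finite
variance, written sigma^2 here (sigma is notation introduced just for
this calculation):

```latex
\operatorname{Var}(\bar{x})
  = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \operatorname{Cov}(x_i, x_j)
  = \frac{1}{N^2} \sum_{i=1}^{N} \operatorname{Var}(x_i)
  = \frac{\sigma^2}{N},
\qquad
\operatorname{std}(\bar{x}) = \frac{\sigma}{\sqrt{N}} .
```

The second equality is where the "uncorrelated" assumption is used
(every covariance term with i != j is set to zero), and the whole
calculation presupposes that sigma^2 exists at all.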
The problem is, sometimes that's not true. :(
In fact, not only do we not necessarily gain a factor of sqrt(N),
in general we may not gain anything: in general the variance of
xbar's distribution need not converge to zero in the limit
N --> infinity. :(
There are (at least) two quite different ways in which this can happen:
Type A: Each x_i is independently random-sampled from a "nasty"
probability distribution. For example, suppose each x_i is sampled
from a Cauchy distribution (a.k.a. a Lorentzian distribution), with
probability distribution P(y) proportional to 1/(1+y^2). Then it
turns out that the probability distribution of xbar is the *same*
Cauchy distribution! That is, if I average N Cauchy-distributed
quantities, the "noise level" in the average is the *same* as the
"noise level" in the individual quantities.
(If you've never seen this before, it's instructive to try it
numerically: write a quick-n-dirty computer program/spreadsheet to
compute N Cauchy-distributed numbers (you can get them by taking
the tangent of numbers uniformly distributed between -pi/2 and
+pi/2), and take their average. Then repeat this many times to
determine the empirical sampling distribution of the average.
Finally, plot a graph of the width of this empirical sampling
distribution (say, its interquartile range; the sample variance is
erratic here, because the true variance is infinite) as a function
of N.)
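A minimal sketch of that numerical experiment, in Python using only
the standard library. The function names, seed, and trial counts are
arbitrary choices of mine. Cauchy deviates are generated as
tan(pi*(U - 1/2)) with U uniform on (0,1), and the width of the
sampling distribution is measured by its interquartile range rather
than its variance, since the sample variance of Cauchy data never
settles down.

```python
import math
import random
import statistics


def cauchy_sample(rng):
    """Draw one standard Cauchy deviate by the inverse-CDF method."""
    # If U is uniform on (0, 1), then tan(pi * (U - 1/2)) is standard Cauchy.
    return math.tan(math.pi * (rng.random() - 0.5))


def iqr_of_mean(n, trials, rng):
    """Interquartile range of the empirical sampling distribution of xbar."""
    means = [sum(cauchy_sample(rng) for _ in range(n)) / n
             for _ in range(trials)]
    q1, _median, q3 = statistics.quantiles(means, n=4)
    return q3 - q1


rng = random.Random(12345)  # fixed (arbitrary) seed so reruns agree
for n in (1, 10, 100, 1000):
    # Columns: N, interquartile range of the average of N Cauchy deviates
    print(n, round(iqr_of_mean(n, 2000, rng), 2))
```

If the claim above is right, the printed widths should all hover
around 2 (the interquartile range of a single standard Cauchy
deviate) instead of shrinking like 1/sqrt(N).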
Type B: Each x_i may be random-sampled from a "nice" probability
distribution, but the different x_i are correlated, i.e., they're
not *independently* random-sampled. For example, suppose that for
some fixed K < N, x_1 through x_K are each randomly sampled from a
Gaussian distribution, but then all future x_i are computed via
x_i = x_(i modulo K) (reading x_0 as x_K). Then (since we only have
K independent observations) the variance of xbar is just 1/K times
the variance of an individual x_i (exactly so when N is a multiple
of K), i.e., no matter how large N gets, averaging N measurements
only gains us a factor of sqrt(K) in noise reduction.
This case often comes up when an experiment is run for many days:
the (unwanted) effects of diurnal temperature variations often show
up in this way, with K being the number of measurements per day.
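Here is a rough Python sketch of the Type B example, again standard
library only. The names, K = 4, unit-variance Gaussian errors, and
the trial count are all illustrative assumptions of mine. The code
indexes measurements from 0, so measurement i simply repeats
base[i % K].

```python
import math
import random
import statistics


def mean_of_repeated(n, k, rng, sigma=1.0):
    """Average n measurements of which only the first k are independent."""
    base = [rng.gauss(0.0, sigma) for _ in range(k)]
    # Measurement i (0-based) repeats base[i % k]: the errors recur with period k.
    return sum(base[i % k] for i in range(n)) / n


def std_of_mean(n, k, trials, rng):
    """Empirical standard deviation of xbar over many repeated experiments."""
    means = [mean_of_repeated(n, k, rng) for _ in range(trials)]
    return statistics.pstdev(means)


K = 4
rng = random.Random(2009)  # fixed (arbitrary) seed
for n in (4, 40, 400):
    observed = std_of_mean(n, K, 2000, rng)
    # Columns: N, observed std of xbar, 1/sqrt(K), naive 1/sqrt(N)
    print(n, round(observed, 3),
          round(1 / math.sqrt(K), 3), round(1 / math.sqrt(n), 3))
```

The observed column should stay near 1/sqrt(K) = 0.5 for every N,
while the naive 1/sqrt(N) column keeps dropping.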
[Neither of these failures-to-reduce-noise contradicts the central
limit theorem (CLT), because in each of them we've violated one or
more of the CLT's assumptions. Notably, the CLT requires that each
x_i be *independently* randomly-sampled from a distribution with
*finite* variance (the Cauchy distribution's variance is infinite).]
The key lesson to learn from this simple gedanken problem is that
averaging experimental data, while a valuable part of the scientific
toolkit, is not a panacea. [In this context most filtering,
least-squares fitting, and (among others) Fourier-transform
operations behave the same way as averaging.]
In particular, some sorts of errors *don't* go away when we average.
Actual real-world experiments usually have a mixture of "nice" errors
(ones which do indeed go down by sqrt(N) when we average N measurements)
and not-so-nice ones (which don't). We often call the nice ones
"statistical" and the not-so-nice ones "systematic", but whatever
names they go by, real-world data usually has a mixture of both.
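To make the mixture concrete, here is a toy Python sketch (standard
library only). It assumes the simplest possible "systematic" error,
namely one offset shared by every measurement in a run (the extreme
case from the moderator's note), plus independent Gaussian
"statistical" noise. The names and the values of sigma_stat and
sigma_sys are made up for illustration. Under those assumptions the
standard deviation of xbar should follow
sqrt(sigma_sys^2 + sigma_stat^2/N).

```python
import math
import random
import statistics


def run_mean(n, sigma_stat, sigma_sys, rng):
    """Error in the average of one run of n measurements (true value 0)."""
    offset = rng.gauss(0.0, sigma_sys)  # same for every measurement in the run
    noise = sum(rng.gauss(0.0, sigma_stat) for _ in range(n)) / n
    return offset + noise


def std_of_run_mean(n, sigma_stat, sigma_sys, trials, rng):
    """Empirical standard deviation of xbar over many independent runs."""
    means = [run_mean(n, sigma_stat, sigma_sys, rng) for _ in range(trials)]
    return statistics.pstdev(means)


SIGMA_STAT, SIGMA_SYS = 1.0, 0.2  # illustrative values only
rng = random.Random(42)           # fixed (arbitrary) seed
for n in (1, 10, 100, 1000):
    observed = std_of_run_mean(n, SIGMA_STAT, SIGMA_SYS, 1000, rng)
    predicted = math.sqrt(SIGMA_SYS**2 + SIGMA_STAT**2 / n)
    # Columns: N, observed std of xbar, predicted std of xbar
    print(n, round(observed, 3), round(predicted, 3))
```

The statistical part averages down with N, but the result levels off
near sigma_sys: past that point, more averaging buys essentially
nothing.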
--
-- "Jonathan Thornburg [remove -animal to reply]" <jthorn@xxxxxxxxxxxxxxxxxxxxxxx>
Dept of Astronomy, Indiana University, Bloomington, Indiana, USA
"Washing one's hands of the conflict between the powerful and the
powerless means to side with the powerful, not to be neutral."
-- quote by Freire / poster by Oxfam