Re: Calculate the entropy using mu and sigma?
- From: "Michael" <mchlgibs@xxxxxxx>
- Date: 22 Mar 2007 11:13:44 -0700
No, you're right. The result I alluded to only holds for continuous
distributions (possibly only for differentiable ones). In other
words, a Gaussian is the maximum entropy *continuous* distribution
subject to a given mean and variance. My fault for being sloppy.
Thanks for pointing it out, Daniel.
Michael
Wikipedia has the same mistake here:
http://en.wikipedia.org/wiki/Maximum_entropy_probability_distribution
It's not just a "sin of omission," as it specifically states that it
is maximal for "all distributions on the real line." Plus, it comes
immediately after a definition of discrete distributions! I'd fix it
myself, but as I don't know what the proper class of functions is I
can't replace it with a correct statement. Anyone out there up to the
task?
It's shown using the calculus of variations together with Lagrange
multipliers. I'm by no means an expert, but the calculus of
variations article states (at the bottom of the section "The Euler-
Lagrange equation") that the function f (in this case the probability
density function) "is required to have two continuous derivatives." I
think that's the correct statement, i.e., that among all distributions
with two continuous derivatives, the normal distribution is a maximal
entropy distribution, subject to given mean and variance.
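
As a quick sanity check of that claim (not a proof), here is a minimal Python sketch comparing the closed-form differential entropies of three densities that share the same variance. The function names are mine, and the uniform and Laplace cases are just two convenient comparison points, not anything from the thread.

```python
import math

def gaussian_entropy(sigma):
    """Differential entropy (nats) of a normal density with std dev sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

def uniform_entropy(sigma):
    """Uniform on an interval of width w has variance w^2/12 and entropy log(w)."""
    width = sigma * math.sqrt(12.0)
    return math.log(width)

def laplace_entropy(sigma):
    """Laplace with scale b has variance 2*b^2 and entropy 1 + log(2*b)."""
    b = sigma / math.sqrt(2.0)
    return 1.0 + math.log(2.0 * b)

if __name__ == "__main__":
    sigma = 1.0  # any positive value; the ordering does not depend on it
    for name, fn in [("normal", gaussian_entropy),
                     ("laplace", laplace_entropy),
                     ("uniform", uniform_entropy)]:
        print(f"{name:8s} {fn(sigma):.4f}")
```

The mean does not appear because shifting a density leaves its differential entropy unchanged, so only the variance matters for the comparison.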
In case you're interested, the derivation goes something like this:
Using the calculus of variations (link) and Lagrange multipliers
(link), maximize the entropy (integral(-inf, +inf) -f(x) log f(x) dx)
subject to the constraints
integral(-inf,inf) f(x) dx = 1 (i.e., probability distribution is
well-formed)
integral(-inf,inf) xf(x) dx = A (mean)
integral(-inf,inf) x^2 f(x) dx = B (second moment; the variance
follows from it as B - A^2)
Using Lagrange multipliers, this is equivalent to maximizing
integral(-inf, +inf) [ -f(x) log f(x) + l1 f(x) + l2 x f(x)
+ l3 x^2 f(x) ] dx
= integral(-inf, +inf) L(x, f, f') dx
subject to the constraints, where l1, l2, l3 (or lambda1, etc.) are
unknown constants.
The calculus of variations proceeds by treating x, f and f' as
independent arguments of L (so f is formally independent of x, and f'
of both f and x), and using the Euler-Lagrange equation
- d/dx pd L/pd f' + pd L/pd f = 0
(where pd = partial derivative; sorry for the typography)
Working this out:
[ - d/dx 0 ] + [ - (f * 1/f + log f) + l1 + l2*x + l3*x^2 ] = 0
-log f + (l1 - 1) + l2*x + l3*x^2 = 0
f = exp((l1 - 1) + l2*x + l3*x^2)
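The coefficient of x^2 in the exponent must be negative for f to be
integrable. Completing the square in the exponent and working out l1,
l2 and l3 from the constraints then gives a Normal distribution with
mean A and variance B - A^2. QED.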
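
For anyone who wants to check that last step numerically, here is a rough Python sketch. It writes the stationary density as f(x) = exp(a + b*x + c*x^2), where a, b, c are my own shorthand for the constant, linear and quadratic coefficients in the exponent (whatever sign convention is used for the multipliers). It picks them by completing the square and then confirms the three constraints with a simple midpoint rule. The function names and the sample values of A and B are illustrative assumptions.

```python
import math

def gaussian_coefficients(A, B):
    """Return (a, b, c) so that exp(a + b*x + c*x^2) has mean A and second moment B."""
    var = B - A * A
    if var <= 0.0:
        raise ValueError("need B > A^2 for a positive variance")
    c = -1.0 / (2.0 * var)          # quadratic coefficient, must be negative
    b = A / var                     # linear coefficient
    a = -A * A / (2.0 * var) - 0.5 * math.log(2.0 * math.pi * var)  # normalization
    return a, b, c

def moments(a, b, c, center, sd, n=20000, span=12.0):
    """Midpoint-rule estimates of the integrals of f, x*f and x^2*f."""
    lo = center - span * sd
    h = 2.0 * span * sd / n
    m0 = m1 = m2 = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        fx = math.exp(a + b * x + c * x * x)
        m0 += fx
        m1 += x * fx
        m2 += x * x * fx
    return m0 * h, m1 * h, m2 * h

if __name__ == "__main__":
    A, B = 1.5, 6.25                # sample mean and second moment (variance 4)
    a, b, c = gaussian_coefficients(A, B)
    print(moments(a, b, c, A, math.sqrt(B - A * A)))
```

The integration range is cut off at twelve standard deviations either side of the mean, which is an arbitrary but generous choice; the neglected tails are far below floating-point precision.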
Michael
P.S. The derivations for uniform distribution and for exponential
follow the same logic, just different constraints and different limits
of integration.
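
To illustrate the exponential case: on [0, +inf) with only the mean fixed, the exponential density is the maximum entropy one. The small Python sketch below compares it, using closed-form entropies, with two other densities on the half-line that have the same mean. The two alternatives (uniform on [0, 2*mean] and half-normal) and the function names are my own choices for illustration.

```python
import math

def exponential_entropy(mean):
    """Exponential density with the given mean has entropy 1 + log(mean)."""
    return 1.0 + math.log(mean)

def uniform_half_line_entropy(mean):
    """Uniform on [0, 2*mean] has the required mean and entropy log(2*mean)."""
    return math.log(2.0 * mean)

def half_normal_entropy(mean):
    """Half-normal with scale s has mean s*sqrt(2/pi) and entropy 0.5*log(pi*e*s^2/2)."""
    s = mean * math.sqrt(math.pi / 2.0)
    return 0.5 * math.log(math.pi * math.e * s * s / 2.0)

if __name__ == "__main__":
    mean = 1.0  # any positive value; the ordering does not depend on it
    for name, fn in [("exponential", exponential_entropy),
                     ("half-normal", half_normal_entropy),
                     ("uniform", uniform_half_line_entropy)]:
        print(f"{name:12s} {fn(mean):.4f}")
```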
P.P.S. Daniel, please feel free to add the appropriate caveat to the
Wikipedia article. Unlike Weird Al, I have never edited Wikipedia,
and actually don't know how.