Re: Explanation of Maximum Entropy

From: Aleks Jakulin (a_jakulin_at_@hotmail.com)
Date: 08/29/04


Date: Sun, 29 Aug 2004 19:44:13 +0200


> Aleks Jakulin <a_jakulin@@hotmail.com> wrote:
> >
> >With the constraint of having two outcomes, MaxEnt would yield you
> >PMF of [0.5,0.5].

Radford Neal responded:
> Why? The data is 6 heads out of 10, not 5 out of 10. Shouldn't the
> best distribution pay at least a little attention to the data?

MaxEnt isn't very useful for the coin toss example. What non-trivial
constraint could one derive from the data that would not already
determine the PMF? All I can imagine MaxEnt doing in this case is:

a) (maximum entropy prior) if you apply it to determining the
parameters of a Beta prior, it would prescribe (any) _symmetric_
Beta prior (should your constraint be a Beta distribution with a
certain count). I've had rather bad results with Bernardo's reference
priors. Has anyone seen any practitioner seriously using those?

b) (maximum entropy model) [0.5,0.5] (in the absence of sensible
constraints) and

c) (maximum entropy posterior) of all the posterior PMFs, select the
one with the maximum entropy. So, under c) one wouldn't integrate over
the prior, and would not use the bold MAP

h' = argmax_h P(h|d)

but instead use the timid MEP as in

h' = argmax_h H(P(D|h)), where h ranges over the hypotheses with a
non-zero posterior: P(h|d) > 0

Here H(P(D|h)) = -Sum[ P(x|h) log P(x|h); x \in D ] (D is the input
space, x is a point in D, d is the given input sample {x1,x2,...,xn})

For this to be non-trivial, the prior should be zero for some
parameter settings.
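
To make the MAP vs. MEP contrast in c) concrete, here is a minimal
Python sketch for the coin example (6 heads out of 10). It is only an
illustration under assumptions that are not in the post: the grid of
hypotheses, the prior that is zero below P(heads) = 0.55, and the
function names are all made up for the example.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a coin with P(heads) = p."""
    return -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0.0)

def posterior(hypotheses, prior, heads, tails):
    """Posterior P(h|d) on a finite grid of hypotheses h = P(heads)."""
    weights = [pr * h**heads * (1.0 - h)**tails
               for h, pr in zip(hypotheses, prior)]
    total = sum(weights)
    return [w / total for w in weights]

def map_estimate(hypotheses, post):
    """Bold choice: the hypothesis with the highest posterior."""
    return max(zip(hypotheses, post), key=lambda pair: pair[1])[0]

def mep_estimate(hypotheses, post):
    """Timid choice: among hypotheses with P(h|d) > 0, the one whose
    PMF has the highest entropy."""
    supported = [h for h, p in zip(hypotheses, post) if p > 0.0]
    return max(supported, key=entropy)

# Hypothetical grid of hypotheses: P(heads) = 0.05, 0.10, ..., 0.95.
grid = [i / 20 for i in range(1, 20)]

# Hypothetical prior that rules out P(heads) < 0.55 (index 11 is 0.55).
restricted_prior = [1.0 if i >= 11 else 0.0 for i in range(1, 20)]
# A prior that is non-zero everywhere, for comparison.
flat_prior = [1.0] * len(grid)

post_restricted = posterior(grid, restricted_prior, heads=6, tails=4)
post_flat = posterior(grid, flat_prior, heads=6, tails=4)

print("restricted prior: MAP =", map_estimate(grid, post_restricted),
      " MEP =", mep_estimate(grid, post_restricted))
print("flat prior:       MAP =", map_estimate(grid, post_flat),
      " MEP =", mep_estimate(grid, post_flat))
```

With the restricted prior, MAP picks 0.6 while MEP picks 0.55, the
most uniform PMF the prior still allows. With the flat prior, MEP
falls back to 0.5 regardless of the data, which is the trivial case.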

> I suspect you have in mind some sort of frequentist test that fails
> to reject the null hypothesis that the distribution is [0.5,0.5], and
> since this distribution has maximum entropy, you go for it.

Not this time. :)

Aleks

-- 
mag. Aleks Jakulin
http://www.ailab.si/aleks/
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science,
University of Ljubljana,
Slovenia.

