Re: Highest Posterior Density



OK Reefer. I am in a good mood, so rather than be
confrontational and enter into tedious and unenlightening
bickering about who said what, I will take a different
and more agreeable tack, and try to find a few
(unfortunately trivial) things on which we agree, and a
couple of (non-trivial) things on which we do not agree.

1) We agree that to find a normalized posterior density,
we need to perform an integration. If we did not agree on
this, there would be nothing more to talk about.

2) We agree that in order to maximize the posterior
density, we do not need this normalization constant.
Calculating it is often difficult, and if the OP is only
interested in maximization, it is counter-productive to
suggest that the OP calculate the normalizing constant.
This is what it seemed to me you were doing, based on
your statement 'You need to be able to supply the
Bayesian ingredients of a prior distribution for your
parameter, be able to perform the integration to obtain
your posterior distribution of your PARAMETER'. Perhaps I
misinterpreted the words 'You need to be able' here.
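
To make point 2 concrete, here is a minimal sketch in
Python. The coin-tossing model, the flat prior, the counts
(7 heads, 3 tails) and the grid are all hypothetical choices
made for illustration; nothing here comes from the OP's
problem. It shows only that dividing by the normalizing
constant does not move the maximum.

```python
# Hypothetical example: coin bias theta, flat prior, 7 heads and 3 tails.
HEADS, TAILS = 7, 3

def unnormalized_posterior(theta):
    """Likelihood times a flat prior; no normalizing constant involved."""
    return theta**HEADS * (1.0 - theta)**TAILS

# Grid over the open interval (0, 1).
step = 0.001
grid = [i / 1000 for i in range(1, 1000)]
unnorm = [unnormalized_posterior(t) for t in grid]

# Normalizing requires an integral (here a crude Riemann sum).
Z = sum(unnorm) * step
norm = [u / Z for u in unnorm]

# Locate the maximum of each curve on the grid.
map_unnorm = grid[max(range(len(grid)), key=unnorm.__getitem__)]
map_norm = grid[max(range(len(grid)), key=norm.__getitem__)]

print(map_unnorm, map_norm)  # the two maximizers coincide
```

The integral Z is needed to report probabilities, but the
location of the maximum is unchanged by dividing by a
positive constant.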

3) We agree that many interesting problems exist in high
dimensions, and that in high dimensions, the maximization
problem is far from trivial. Indeed, no general method for
solving it exists at present; it is tractable only in very
specific cases.

4) I do not think we agree about what priors represent. I
think that the construction of priors in high dimensions
is a worthwhile exercise. However, it is difficult, which
perhaps explains your dissatisfaction with the results.
The problems are twofold. First, it is frequently unclear
how to construct a distribution that captures certain
types of prior information. This is why model building is
a subject of research. Second, there often has to be a
compromise between accuracy and tractability. Again, this
is a question of intelligent model building. In any case,
the OP presumably has a prior in mind, so this question
need not really detain us.

5) We seem not to agree on the relevance of
post-high-school mathematics to probability and
statistics. I repeat that I agree with your opinion of
mathematistry, but a Riemannian metric and Lebesgue
measure are scarcely mathematistry. Regardless of the
names you use for
various quantities, the fact remains that to create a
posterior density, you have to divide the posterior
probability distribution by another measure. What people
usually do is simply to drop the symbol 'd\theta'
(assuming \theta is the argument of the posterior), which
amounts to dividing by Lebesgue measure in the \theta
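system of coordinates. My point is that this procedure,
that is, to drop the d'something' symbol from a
probability distribution, is not invariant under a change
of coordinates: the resulting density, and hence the
location of its maximum, depends on the parametrization
chosen. This point has been emphasized since Fisher.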
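
As an illustration of the non-invariance in point 5, here
is a minimal sketch in Python. The Beta(3, 2) posterior and
the logit reparametrization are hypothetical choices made
for the example, as are the grids. The same probability
distribution is written as a density once with respect to
d\theta and once with respect to d\phi, where
\phi = log(\theta / (1 - \theta)), and each density is
maximized.

```python
import math

A, B = 3, 2  # hypothetical Beta(3, 2) posterior for theta in (0, 1)

def density_theta(theta):
    """Unnormalized density with respect to d(theta)."""
    return theta**(A - 1) * (1.0 - theta)**(B - 1)

def density_phi(phi):
    """Same distribution, as a density with respect to d(phi).

    phi = logit(theta), so the change of variables multiplies by the
    Jacobian d(theta)/d(phi) = theta * (1 - theta).
    """
    theta = 1.0 / (1.0 + math.exp(-phi))
    return density_theta(theta) * theta * (1.0 - theta)

# Simple grid searches; the grid ranges and spacings are arbitrary choices.
theta_grid = [i / 10000 for i in range(1, 10000)]
phi_grid = [-10.0 + i / 1000 for i in range(20001)]

theta_mode = max(theta_grid, key=density_theta)
phi_mode = max(phi_grid, key=density_phi)

# Map the maximizer found in phi coordinates back to theta.
theta_from_phi_mode = 1.0 / (1.0 + math.exp(-phi_mode))

print(theta_mode)           # close to (A - 1) / (A + B - 2) = 2/3
print(theta_from_phi_mode)  # close to A / (A + B) = 3/5
```

The distribution is the same in both cases; only the
measure it was divided by has changed, and the 'highest
density' point has moved from about 2/3 to about 3/5.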

Cheerfully yours,

illywhacker;
