Bayesian Predictive Median Model Selection Explained by PI

From: Osher Doctorow (mdoctorow_at_comcast.net)
Date: 07/24/04


Date: Sat, 24 Jul 2004 15:25:18 +0000 (UTC)


Maria Maddalena Barbieri (U. Roma) and James O. Berger (Duke U.), in
"Optimal predictive model selection," The Annals of Statistics 2004,
Vol. 32, No. 3, 870-897 (also arXiv:math.ST/0406464 v1, 23 Jun 2004),
note that in the Bayesian approach the model with the highest
posterior probability is often considered the optimal model for
future prediction. They claim that this isn't necessarily so. For
example, in selecting among normal linear models, they show that the
optimal model is the median probability model, which consists of the
variables that have posterior probability >= 1/2 of being in the
model (these are estimated by choosing the first time the
probability estimate exceeds 1/2, etc.).
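
As a rough illustration of the difference between the two selection
rules, here is a minimal Python sketch. The model posterior
probabilities are made-up toy numbers, and the function names are
hypothetical; nothing here is taken from the Barbieri-Berger paper
beyond the rule "keep the variables with posterior inclusion
probability >= 1/2", where a variable's inclusion probability is the
sum of the posterior probabilities of the models that contain it.

```python
# Toy posterior over candidate models (hypothetical numbers).
# Each model is the set of variables it includes.
model_posterior = {
    frozenset({"x1"}): 0.4,
    frozenset({"x2"}): 0.3,
    frozenset({"x1", "x2"}): 0.3,
}


def inclusion_probabilities(posterior):
    """Sum the posterior mass of every model containing each variable."""
    probs = {}
    for model, p in posterior.items():
        for var in model:
            probs[var] = probs.get(var, 0.0) + p
    return probs


def median_probability_model(posterior):
    """Variables whose posterior inclusion probability is >= 1/2."""
    incl = inclusion_probabilities(posterior)
    return frozenset(v for v, p in incl.items() if p >= 0.5)


def highest_probability_model(posterior):
    """The single model with the largest posterior probability."""
    return max(posterior, key=posterior.get)


median_model = median_probability_model(model_posterior)
top_model = highest_probability_model(model_posterior)
```

In this toy case the highest probability model is {x1}, while both
x1 and x2 have inclusion probability above 1/2, so the median
probability model is {x1, x2}. The two rules need not agree.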

Probable Influence (PI) has a rather clear explanation for this,
although Barbieri and Berger (as usual) are unfamiliar with PI. It
may be difficult to believe, but readers may recall that PI
probabilities are typically < .05 and that Very Frequent Event
probabilities are typically > .95. Fairly Frequent Event
probabilities then lie between .05 and .95, which puts their
center value at .50.
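
The three regions and the center value can be written out as a small
Python sketch. The cutoffs .05 and .95 are the "typical" values
quoted above; the function name, the labels, and the choice to put
the boundary values .05 and .95 in the middle region are assumptions
made only for illustration.

```python
PI_UPPER = 0.05        # below this: Probable Influence (PI) region
FREQUENT_LOWER = 0.95  # above this: Very Frequent Event region


def classify(p):
    """Label a probability by the region it falls in (assumed cutoffs)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if p < PI_UPPER:
        return "PI"
    if p > FREQUENT_LOWER:
        return "very frequent"
    return "fairly frequent"


# Midpoint of the Fairly Frequent Event interval [.05, .95].
center = (PI_UPPER + FREQUENT_LOWER) / 2
```

The midpoint works out to .50, the same threshold that defines the
median probability model.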

To recapitulate briefly from past postings, PI probability is
near 0, and it takes over from (Bayesian) conditional probability
there because P(AB)/P(A) is undefined at P(A) = 0 and blows up
near P(A) = 0 (readers can try taking a one-sided limit as
P(A) --> 0+, i.e., from the right). Theorists familiar with the
Radon-Nikodym derivative occasionally argue against this, but
they're wrong, because the Lebesgue-Radon-Nikodym Theorems of real
analysis hold only up to equivalence classes, outside sets of
measure 0. See Hewitt and Stromberg's classic Real and Abstract
Analysis, Springer-Verlag: Berlin 1965.

So PI belongs to probabilities < .05, and (Bayesian) conditional
probability belongs to probabilities > .05, but with possible
limitations. P. Hajek's Metamathematics of Fuzzy Logic, Kluwer:
Dordrecht 1998, although it completely misses the probability
connections, has the three basic fuzzy multivalued logical
implications (x-->y) = 1 + y - x, (x-->y) = y/x, and (x-->y) = y,
respectively for the Lukasiewicz/Rational Pavelka, Product/Goguen,
and Godel fuzzy multivalued logics (FMLs) in the non-trivial case.
With various inequalities on these and on the probability-statistics
analogs which I discovered, it's rather easy to show that Godel FML
and Independent Probability-Statistics (IPS) have the "natural"
region of probability > .95. Notice that inserting x = 1 into the
other two types of FML yields (x-->y) = y. Likewise, with x = P(A)
and y = P(AB), setting P(A) = 1 in P(A-->B) = 1 + P(AB) - P(A) and
in P(B/A) = P(AB)/P(A) (defined for P(A) not 0) yields P(A-->B) =
P(B/A) = P(AB) = y, and even P'(A-->B) = 1 + P(B) - P(A) yields
P(B) = y for that case. So P(A) = 1 is the natural region for
independent probability-statistics, and just as with P(A) = 0, a
small interval of length .05 on one side of it gives the result
P(A) > .95.
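
The reduction at x = 1 can be checked numerically with a short
Python sketch. The function names are hypothetical. The branch that
returns 1 when x <= y is the standard "trivial case" of each
implication, which the formulas above leave out; it is included here
only so that the functions are defined on all of [0, 1].

```python
def lukasiewicz(x, y):
    """Lukasiewicz implication: 1 + y - x, capped at 1."""
    return min(1.0, 1.0 - x + y)


def product(x, y):
    """Product (Goguen) implication: y / x when x > y, else 1."""
    return 1.0 if x <= y else y / x


def godel(x, y):
    """Godel implication: y when x > y, else 1."""
    return 1.0 if x <= y else y


# At x = 1 all three implications collapse to the same value, y.
ys = [0.0, 0.25, 0.5, 0.75, 1.0]
agree_at_one = all(
    lukasiewicz(1.0, y) == product(1.0, y) == godel(1.0, y) == y
    for y in ys
)
```

Away from x = 1 the three generally differ: at x = 0.8, y = 0.4, for
instance, they give approximately 0.6, 0.5, and 0.4 respectively.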

Osher Doctorow