Re: The danger of classical hypothesis and significance tests [was Re: MADLY AMUSED]
- From: "S. F. Thomas" <thomas7243@xxxxxxxxxxxxx>
- Date: Thu, 29 May 2008 13:26:32 -0400
illywhacker wrote:
On May 29, 4:25 am, "S. F. Thomas" <thomas7...@xxxxxxxxxxxxx> wrote:
illywhacker wrote:
On May 28, 8:52 pm, "S. F. Thomas" <thomas7...@xxxxxxxxxxxxx> wrote:
illywhacker wrote:
(( cuts ))
Again, I am most certainly no defender of the prevailing (classical)
orthodoxy. However, the sense which remains after having read this is
that there seems always to be a way to rectify the failings in the
orthodox approach, whether obviously so, or not.
Sure. But *if* it can be rectified, then once it has been rectified as
much as possible, it turns out to be equivalent to a Bayesian approach
(Jaynes has other examples of this kind of thing, plus see below for
decision theory). The difference is that it is completely ad hoc,
takes much more work, and is enormously less intuitive. (The common
temptation to interpret confidence intervals as if they concern the
probability that the parameter falls in a certain range shows this.)
The real problem, though, is that nothing in the classical approaches
lets you *know* that things are going wrong: you simply get a bad
answer. Now, in cases where the problem is obvious, as deliberately
chosen by Jaynes, the result will merely be that the user abandons the
statistical approach for that example, and perhaps for evermore. But
in cases where it is not obvious, bad things might and probably have
happened. If they haven't, this is a tribute to common sense, not
classical statistics.
Indeed. But as I said, I am no defender of the classical approach. I dislike it for the same practical reasons that you dislike it. However, I have to concede, so far, that some sufficiently clever classicist may rectify matters whenever a Jaynes comes along to point out a clear error.
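To make the confidence-interval remark above concrete, here is a minimal simulation sketch in Python (it is not from either poster). It assumes a normal model with known standard deviation; the function name `coverage` and all the numbers are illustrative assumptions. It shows that the 95% figure describes how often the interval-producing procedure captures a fixed parameter over repeated samples, which is a different statement from a probability about the parameter given one data set.

```python
import random
from statistics import mean

def coverage(true_mean=10.0, sigma=2.0, n=25, trials=2000, seed=0):
    """Fraction of repeated samples whose 95% z-interval contains true_mean."""
    rng = random.Random(seed)
    z = 1.959964  # 97.5th percentile of the standard normal
    half_width = z * sigma / n ** 0.5  # sigma is assumed known here
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        centre = mean(sample)
        # The parameter is fixed; only the interval varies between samples.
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / trials

print(coverage())  # close to 0.95, up to simulation noise
```

The parameter never moves in this sketch; the long-run frequency belongs to the intervals, which is why reading a single interval as a probability statement about the parameter is an extra step that the classical construction does not supply.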
But I'm more concerned with the issue of principle. I thought perhaps
there was a demonstration of an unrectifiable failure of principle in
the orthodox structure, which I don't think there was. Correct me though
if I missed something.
There is a failure of principle, indeed many. One is called Cox's
theorem. It shows that probability theory (i.e. sum and product laws)
is the unique generalization of classical logic to cases of uncertain
knowledge that satisfies a number of incontestable axioms of
rationality.
How does this defeat the classical approach? I see this as a device to rationalize a Bayesian treatment of uncertainty as it applies to characterizing what an unknown (constant) parameter might be. Thus it is at best a shield, not a sword.
If you wish to reason about uncertain propositions, you
have no choice but to use probability theory: it has nothing to do
with 'randomness'.
Er... as a matter of category, there is at least one other kind of uncertainty, namely uncertainty of the fuzzy or possibilistic kind.
Since classical methods ignore this, they ignore a
key point of principle.
That may at best be an argument *for* your point of view, not *against* the classical.
Indeed the very use of the meaningless word
'randomness' in anything other than a strictly mathematical context
('random variable') is another failure of principle.
Again, at best an argument *for* your particular paradigm, not *against* any other. I don't agree that the concept of randomness is meaningless, but I do agree that its definition requires some care if one is not to get ensnared in a trap of circularity.
Another
example concerns decision theory. Wald studied decision theory from a
classical perspective: he was in no sense a Bayesian. After an
enormous amount of work, he arrived at the conclusion that the only
reasonable (in a well-defined sense) decision strategies were Bayesian
ones. Etc.
Again, only an argument *for* your point of view, not against the other. It may well be the case that for the classical approach to be applied to decision theory, it would take someone of infinite cleverness. But to be fair to the other side, as a matter of principle, we have to grant them that possibility. (But of course cleverness is not the same as intelligence, let alone wisdom, and Wald may have come to a certain view as to the wise course. I certainly would not attempt to apply classical methods to decision theory.)
The Bayesian approach clearly has much to recommend it, to judge by
results, but alas, IMO, there remains something wrong, in principle,
with the argumentation, whether old Bayesian (18th century) or
neo-Bayesian.
Of course you can believe what you like, but if you wish anyone else
to take it seriously, you had better have something more to say. What
is wrong? Is there an error in Cox's theorem?
Well, I have written a whole book. I have not per se addressed Cox's theorem, but I see no reason to. Any theorem is only as good as the starting axioms, and I can see right from the get-go that "real numbers" are introduced in a way that precludes fuzziness as a separate category of uncertainty. Thus from my point of view, the core question -- namely how uncertainty in a constant model parameter may be represented -- has been begged.
My interest in the matter stems from my "Fuzziness and Probability" (ACG
Press, 1995), where I believe I have developed the key insight that may
lead to a resolution,
Resolution of what? Nothing needs resolving.
An experiment has been performed, and data are observed. A model is proposed. What do *the data* say about the model parameter? If your answer *requires* you first to posit a prior, and moreover to conceive of the parameter as a random variable in its own right, covarying with the random variable whose various outcomes have been observed, I think the classicists have a point that there is a problem. That was true in the 18th and 19th centuries, and it is still true after the best neo-Bayesian efforts. But Bayesian approaches nevertheless give good results (more or less), and the classical and Bayesian approaches may be made to agree (more or less) in a broad range of problems. Therefore there is prima facie an in-principle problem to resolve, namely a reconciliation of the competing views.
namely the idea that a possibilistic calculus for
the likelihood function may be developed that allows for elimination of
nuisance parameters (marginalization), for changes of variable, etc.
that allow for all the Bayesian ground to be covered, but without the
need of priors,
I can never understand what people mean when they imagine that they
can get by without priors.
You turn the question on its head. The problem of statistical inference is to deduce, *from the data*, estimates of the unknown parameters of interest and of functions thereof, including a characterization of the associated uncertainty. The operative phrase is "from the data". In that regard, clearly the fundamental object from which inference should proceed is the likelihood function, as this encapsulates all that the data say about the unknown parameters of interest, under the assumed model. I do not disagree that "prior" information is sometimes relevant to a contemplated decision problem. In that case, such prior information could in principle be included with the data, and we are back to the problem of characterizing, alone, what the data say about the parameters of interest. The in-principle problem of inference should require no necessary reference to priors, and it is those who argue for the necessity of such who must justify it.
Btw, if Cox is used as justification, why do you need Bayes? Just rescale the likelihood function so it conforms to Cox, and off you go.
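As a hedged illustration of what "rescaling the likelihood" can mean, the Python sketch below takes a binomial likelihood on a grid and rescales it two ways: dividing by its maximum, so that the peak is 1 (the usual possibilistic normalization), and dividing by its area, so that it integrates to 1. The binomial model, the data (7 successes in 10 trials) and the names `binomial_likelihood` and `rescale` are assumptions chosen for the example, not anything taken from the thread.

```python
from math import comb

def binomial_likelihood(theta, k, n):
    """Likelihood of success probability theta given k successes in n trials."""
    return comb(n, k) * theta ** k * (1 - theta) ** (n - k)

def rescale(k, n, points=1001):
    """Return the theta grid plus the sup-scaled and area-scaled likelihoods."""
    step = 1 / (points - 1)
    thetas = [i * step for i in range(points)]
    like = [binomial_likelihood(t, k, n) for t in thetas]
    peak = max(like)
    area = step * (sum(like) - 0.5 * (like[0] + like[-1]))  # trapezoid rule
    sup_scaled = [v / peak for v in like]    # maximum value is 1
    area_scaled = [v / area for v in like]   # integrates to 1 over [0, 1]
    return thetas, sup_scaled, area_scaled

thetas, sup_scaled, area_scaled = rescale(k=7, n=10)
```

Both curves have the same shape and peak at the same theta. The area-scaled curve is numerically what Bayes' theorem gives under a uniform prior on theta, so the two rescalings differ only in which calculus is then applied to the result.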
Suppose you want to estimate a function. If
you do not place constraints on this function, the task is impossible,
since a finite number of data points can never determine a function.
So you place constraints. But this is a prior!
No, this is a model. Let's not confuse the matter with the conceptual sleight of hand of using words to mean different things at different times.
Every example of linear
regression is a prior: no one told you the function was linear. That
is all a prior is; it is not some metaphysical monster.
Again, no.
And there are
well-developed methods for generating priors from certain types of
knowledge. More need to be developed, but this is a research topic.
I have never been persuaded that a prior should be any different, as a matter of existential category, from a likelihood, namely the one based on prior data.
and without the need to hold two contradictory concepts
in the head at the same time, namely the idea that the model parameters
are random variables, and constant, both.
'Random variable' is a mathematical notion; it has no physical
meaning. You are confused because you think that the word 'random'
means something when applied to the world (as opposed to mathematics),
and that probability theory concerns 'random' happenings. With the
possible exception of quantum mechanics, which does not concern us
here, nothing is random. It is merely unknown, to a greater or lesser
extent.
Actually, as the title of my book indicates, I have delved rather deeply into the modeling of uncertainty. In the standard formulation of the problem of statistical inference, there are at least two kinds present, namely (i) uncertainty of the "random" kind, as exemplified by the variation present in the data (random sample), and (ii) uncertainty of the fuzzy kind, as exemplified by the lack of absolute precision in our estimate of the model parameters consistent with the data. The classical school is at least clear that uncertainty in the value of a constant parameter is not of a random kind. Truth be told, our uncertainty in the model parameters is of the fuzzy kind. The Bayesian school would like all uncertainty to yield to a probability calculus, hence the necessity of introducing the prior, conceptually, as a way of transmuting the likelihood point function into a posterior probability density. And hence your mini-lecture above on randomness. The orthodox(?) fuzzicists are IMO wrong on some things. However, what they are right about is that not all uncertainty is probabilistic. At any rate, I've looked deeply into the line of demarcation between the two kinds of uncertainty, and in particular within the problem set-up of statistical inference. Where I come out is that there is an enormous simplification to be gained when it is recognized that there is -- along with the obvious randomness -- fuzziness at the heart of this inferential set-up, for one could then apply a possibilistic/fuzzy/likelihood calculus directly to the likelihood function, thereby achieving all the Bayesian benefits, but without the Bayesian conceptual inconsistency in its treatment of constants. That is the key to the "resolution" I spoke of earlier. You don't have to agree, of course.
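For readers who want to see what eliminating a nuisance parameter without a prior might look like, here is a rough Python sketch. It is not the calculus of the book, whose details are not given in this post; it uses the standard possibility-theory marginal (the supremum over the nuisance parameter, which coincides with the profile likelihood) and sets it beside integration over the nuisance parameter with a flat weighting. The normal model, the data values, the grids and the names `normal_likelihood` and `eliminate_sigma` are all assumptions made for illustration.

```python
from math import exp

def normal_likelihood(data, mu, sigma):
    """Likelihood of (mu, sigma) for i.i.d. normal data, constants dropped."""
    s = sum((x - mu) ** 2 for x in data)
    return sigma ** (-len(data)) * exp(-s / (2 * sigma ** 2))

def eliminate_sigma(data, mus, sigmas):
    """Return (profile, integrated) curves for mu, each rescaled to peak 1."""
    profile, integrated = [], []
    for mu in mus:
        row = [normal_likelihood(data, mu, s) for s in sigmas]
        profile.append(max(row))      # sup over the nuisance parameter
        integrated.append(sum(row))   # Riemann sum, i.e. flat weight on sigma
    p_peak, i_peak = max(profile), max(integrated)
    return [p / p_peak for p in profile], [v / i_peak for v in integrated]

data = [4.1, 5.3, 4.8, 5.9, 5.0]                # made-up sample, mean 5.02
mus = [3 + 0.02 * i for i in range(201)]        # grid for the mean
sigmas = [0.05 + 0.01 * j for j in range(496)]  # grid for the nuisance sigma
profile, integrated = eliminate_sigma(data, mus, sigmas)
```

In this example both curves peak at the sample mean, and the integrated curve is somewhat wider than the sup-based one, so the two routes agree on location and differ modestly on spread. That is in keeping with the "more or less" agreement mentioned earlier in the thread, though one small example proves nothing general.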
Probabilities summarize our degree of knowledge. If the
probability of a proposition is 0 or 1, we are certain that the
proposition is true or false. Otherwise we are not certain. So a
parameter is merely unknown and constant. We can reason about
propositions concerning this parameter (e.g. a = 1) as we can about
any other proposition, using probability theory.
But there is more that we can do when we adduce the insights of fuzzy set theory.
illywhacker;
Regards,
S. F. Thomas