Re: Cluster analysis for beginners
- From: Jerry Dallal <gdallal@xxxxxxxxxxxxxxxxxxxx>
- Date: Fri, 30 Mar 2007 07:52:17 -0400
illywhacker wrote:
On Mar 30, 1:36 am, Jerry Dallal <gdal...@xxxxxxxxxxxxxxxxxxxx> wrote:
illywhacker wrote:
On Mar 29, 4:38 pm, David Winsemius <doe_s...@xxxxxxxxxxx> wrote:
Sidney <milan_y...@xxxxxx> wrote in news:24466740.1175159875339.JavaMail.jakarta@xxxxxxxxxxxxxxxxxxxxxx:
1) Classical hypothesis testing is fatally flawed. No well-defined
alternative is specified, and the probability of the data is not
calculated. Rather the probability of a set of unobserved data points
is calculated. As Jeffreys famously put it: "A hypothesis that may be
true may be rejected because it has not predicted observable results
that have not occurred". There is a mass of literature on this.
This is a "joke", of course, that results from thinking of P values as
posterior probabilities. If P values are thought of in terms of fixed
level tests, Jeffreys' comment makes no sense.
As I believe someone has replied to you before now: calling it a
'joke' may save you the trouble of bothering to think too hard about
its implications for your practice, but it does not, alas, remove the
force of the remark.
In fixed level testing, one picks a level at which to perform a test. This implies a test statistic and a basis for choosing a critical region, a set of outcomes that have a probability under the null equal to the level of the test. The selection of a critical region implies alternative hypotheses.
Then, the data are collected and the analyst merely looks to see whether the outcome falls into the critical region. It is in this sense that Jeffreys' comment has no meaning. Of course the frequentist is concerned about outcomes that haven't happened. That's what the critical region is about. Even the Bayesian understands it, despite disagreeing with the approach.
A P value can be defined as the smallest level of significance for which the result of the data collection will fall into a "similarly constructed" critical region (for example, X>k for some k determined by the level of the test). Stick with this definition, and Jeffreys' comment is blunted.
If, on the other hand, one defines a P value as "the probability of events as or more extreme", one plays straight man to Jeffreys' jokester: "Why should I care about the probability of events I haven't seen?!"
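To make the two definitions concrete, here is a minimal sketch in Python. It assumes a test statistic that is standard normal under the null and a one-sided critical region X > k; both are illustrative choices, not anything from the original question. The function names are made up for this example. The P value is computed twice: once by searching for the smallest level whose critical region contains the observed value, and once as the upper-tail probability.

```python
from statistics import NormalDist

# Assumption for illustration: the test statistic is N(0, 1) under the null.
NULL = NormalDist(mu=0.0, sigma=1.0)

def critical_value(alpha):
    """Return k such that P(X > k) = alpha under the null."""
    return NULL.inv_cdf(1.0 - alpha)

def rejects(x, alpha):
    """Fixed-level test: does the observed x fall in the critical region X > k?"""
    return x > critical_value(alpha)

def p_value_by_search(x, tol=1e-10):
    """Smallest level whose critical region contains x, located by bisection.

    Assumes the test does not reject at level tol and does reject at level 1 - tol.
    """
    lo, hi = tol, 1.0 - tol
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if rejects(x, mid):
            hi = mid   # x is in the region at this level, so try a smaller level
        else:
            lo = mid   # x is outside the region, so the level must be larger
    return hi

def p_value_tail(x):
    """Upper-tail probability of the observed x under the null."""
    return 1.0 - NULL.cdf(x)

x_obs = 2.0
print(rejects(x_obs, 0.05), rejects(x_obs, 0.01))
print(p_value_by_search(x_obs), p_value_tail(x_obs))
```

The two numbers agree up to the search tolerance, so the definitions differ in how the quantity is interpreted, not in its value.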
Hypothesis testing without an alternative will always be flawed,
because there is always at least one model that predicts the data (or
the sets of unobserved data that classical hypothesis testing likes to
calculate with) with certainty, and which will therefore always be
better than any other hypothesis. Why should we discard this model?
Prior knowledge of course. And if prior knowledge about this model,
why not others? And now the whole thing is up in the air.
Asking whether the clusters are "significant" is too vague to answer. I
suspect what the OP meant was whether the clusters are "remarkable".
This is a joke too right? You are replacing one undefined word with
another. This is indeed both remarkable and significant.
"Significant" is a word that carries baggage. "Remarkable", "extremee", "unexpected" and a host of others get the point across better. My take is that the OP is asking whether the observed clusters can be shown to be "remarkable", "extreme", "unexpected". In order to do that, one must be prepared to say what they are "remarkable", "extreme", or "unexpected" with respect to. One could stick with "significant", but that is more likely to raise the issues of "statistical significance" and "practical significance", which are best set aside for the moment.
illywhacker