Re: assumption of Classification




Richard Ulrich wrote:
> On 28 Apr 2005 21:05:34 -0700, "Reef Fish"
> <Large_Nassau_Grouper@xxxxxxxxx> wrote:
>
> >
> > Richard Ulrich wrote:
> > > On 26 Apr 2005 22:21:23 -0700, "Data Matter" <fungile@xxxxxxxxx>
> > > wrote:
> "Data Matter"
> - don't be concerned with Bob's rant about me.
>
> He did not contradict anything I said, and he added some detail.

Really?

Exhibit 1.

RU> A classification tree that tries to break at every value
RU> will not care whether the distance between 1 and 10 is
RU> the same as the distance between 10 and 100 (or not).
RU> (It is going to have a lot of opportunity to over-capitalize
RU> on chance, so the N needs to be large.)

RF> Here, you're talking through your hat again. There are non-metric
RF> clustering methods and there are metric clustering methods, and

that contradicted your "distance between 1 and 10 is the same as
the distance between 10 and 100."

RF> there are HUNDREDS of clustering algorithms each having its own
RF> properties, requirements, and peculiarities. Your paragraph
RF> is nonsense.

If categorically calling your paragraph "nonsense" is not a
contradiction of what you said, I don't know what would be.
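
To make the metric vs. non-metric distinction concrete, here is a minimal
sketch (not part of the original exchange). The dissimilarity values and the
helper name `agglomerate` are made up for illustration. It assumes a naive
agglomerative procedure and compares single linkage, whose merge order depends
only on the rank order of the dissimilarities, with average linkage, whose
merge order can change when the dissimilarities are rescaled monotonically
(here, squared).

```python
from itertools import combinations


def agglomerate(d, n, linkage):
    """Naive agglomerative clustering on n items.

    d maps frozenset({i, j}) to a dissimilarity; linkage reduces the list of
    between-cluster dissimilarities to one number (min = single linkage,
    mean = average linkage). Returns the clusters in the order they form.
    """
    clusters = [frozenset([i]) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        def dist(pair):
            a, b = pair
            return linkage([d[frozenset((i, j))] for i in a for j in b])

        # Merge the closest pair of clusters under the chosen linkage.
        a, b = min(combinations(clusters, 2), key=dist)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append(a | b)
    return merges


def mean(values):
    return sum(values) / len(values)


# Hypothetical dissimilarities among four items (illustration only).
raw = {
    frozenset(pair): value
    for pair, value in {
        (0, 1): 1, (0, 2): 2, (1, 2): 10,
        (2, 3): 7, (0, 3): 20, (1, 3): 20,
    }.items()
}
# A monotone (order-preserving) rescaling of the same dissimilarities.
squared = {pair: value * value for pair, value in raw.items()}

single_raw = agglomerate(raw, 4, min)
single_sq = agglomerate(squared, 4, min)
average_raw = agglomerate(raw, 4, mean)
average_sq = agglomerate(squared, 4, mean)

print("single linkage unchanged: ", single_raw == single_sq)
print("average linkage unchanged:", average_raw == average_sq)
```

With these made-up numbers, single linkage gives the same merges before and
after squaring, while average linkage joins items 2 and 3 second on the
squared scale instead of adding item 2 to {0, 1}. Whether "distance" matters
therefore depends on which method is meant.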


Exhibit 2.

RU> normality is not the only assumption to be checked.
RU> It's always good to check.

RF> WHY? If the clustering model assumes nothing about normality?

Most of the clustering methods have NO distributional assumptions.
Doesn't my one-liner tell you something about you being wrong?

Normality should be checked vs. normality NOT to be checked if
a clustering method doesn't assume it. That's a contradiction.



Exhibit 3.

Then Ulrich went off on his characteristic tangent, which had NOTHING
to do with clustering, by referring to checking OLS regression assumptions:

RU> For methods of ordinary least squares, normality is not
RU> as important as having decently behaved residuals

RF> The iid N(0, sigma^2) assumption about the ERRORS in a typical model
RF> < ...>. This is UNDERGRADUATE stuff, Richard, and you said
RF> "normality is not as important ...".

Normality is NOT vs Normality IS. That's a contradiction.


Exhibit 4.

RU> absence of outliers,

RF> Absence of outliers with respect to WHAT? (Normality of course).

RF> And they DO NOT pertain to "Classification" in the sense of
RF> "Numerical Taxonomy", "Clustering", and various agglomerative
RF> and divisive methods (algorithms) commonly used in CLUSTERING.

Checking for absence of outliers vs. outliers do NOT pertain to
clustering. That's a contradiction.


Richard patted himself on the back in characterizing the above as

> He did not contradict anything I said, and he added some detail.


> Bob, when you read CLOSE enough to write your nastier
> answers, you don't bother to check the text to see
> that I wrote what you claimed. You screwed up again,

I cited you verbatim and myself verbatim above, in the Exhibits.
Where had I not read you correctly?


> "A new scientific truth does not triumph by convincing its
> opponents and making them see the light, but rather
> because its opponents eventually die, and a new generation
> grows up that is familiar with it."

A very nice philosophical tangent. You should have been more
down to earth and quoted my genuine and heartfelt advice to you
(and to others, except those who post to seek advice):


RF> Stick to what you KNOW. And leave the BS to the barn in the
RF> farm. That would be the greatest contribution you can give to
RF> any statistical/mathematical readership, such as this one.

-- Bob.
