Re: assumption of Classification



On 26 Apr 2005 22:21:23 -0700, "Data Matter" <fungile@xxxxxxxxx>
wrote:

> He's asking if these procedures make distributional assumptions.
> Classification trees do not. Most clustering algorithms (k-means,
> single link, average link, etc.) do not. However, there is a class of
> clustering algorithms which assumes that each cluster is multivariate
> normal and then proceeds to find the means and covariances of these
> clusters.

A classification tree that considers a break at every observed
value does not care whether the distance between 1 and 10 is
the same as the distance between 10 and 100. Only the ordering
of the values matters. (Such a tree has many opportunities to
capitalize on chance, so the N needs to be large.)
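
To illustrate the point about ordering, here is a minimal sketch (my own
toy example, not any particular package's tree algorithm). The helper
names and the data are made up for illustration. It tries a break between
every pair of adjacent values, scores each break by misclassification
count, and compares the result on the raw scale and on the log scale.

```python
import math

def misclassified(labels):
    # Errors made by predicting the majority class on one side of a break.
    return len(labels) - max(labels.count(c) for c in set(labels))

def best_split(xs, ys):
    # Try a break between every pair of adjacent distinct values and
    # return (error count, indices sent to the lower side) for the best one.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(1, len(order)):
        if xs[order[k - 1]] == xs[order[k]]:
            continue  # no threshold fits between tied values
        left, right = order[:k], order[k:]
        err = (misclassified([ys[i] for i in left])
               + misclassified([ys[i] for i in right]))
        if best is None or err < best[0]:
            best = (err, frozenset(left))
    return best

# Hypothetical data: unevenly spaced predictor, two classes.
x = [1, 3, 10, 30, 100, 300]
y = [0, 0, 0, 1, 1, 1]

raw = best_split(x, y)
logged = best_split([math.log10(v) for v in x], y)
print(raw == logged)  # same partition on either scale
```

Any strictly increasing transformation of x leaves the sort order, and
therefore the chosen partition, unchanged.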

A classification tree that uses the mean will have some of
the same difficulty that "link" clustering does, if it wrongly
assumes that equal intervals on the measurement scale are
equivalent.
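
By contrast, a rough sketch of a mean-based grouping shows the dependence
on scale. This is an assumption-laden toy: it splits one variable into two
groups by minimizing the within-group sum of squares about the group means
(the k-means criterion in one dimension), and the data and function names
are invented for the example.

```python
import math

def sse(values):
    # Sum of squared deviations from the group mean.
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_two_groups(values):
    # Return how many of the sorted values fall in the lower group when
    # the cut point minimizes the total within-group sum of squares.
    vals = sorted(values)
    return min(range(1, len(vals)),
               key=lambda c: sse(vals[:c]) + sse(vals[c:]))

data = [1, 2, 20, 40, 100]  # hypothetical measurements

raw_cut = best_two_groups(data)
log_cut = best_two_groups([math.log10(v) for v in data])

print(sorted(data)[:raw_cut], sorted(data)[raw_cut:])
print(sorted(data)[:log_cut], sorted(data)[log_cut:])
```

On the raw scale the value 100 is split off by itself; on the log scale
the groups become {1, 2} and {20, 40, 100}. Which answer is "right"
depends on which intervals are meaningfully equal.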

>
> Nonetheless, normality is not the only assumption to be checked. Every
> method has its own list of assumptions and you should make sure that
> your data agree with the method you choose.
>

It's always good to check.

For ordinary least squares methods, normality matters less
than having well-behaved residuals: mainly, no outliers and
no pattern. And that behavior matters for the *tests*, not
for carrying out the fit.
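
As a hedged sketch of what "checking the residuals" might look like for a
simple straight-line fit: the code below fits by least squares, scales the
residuals by their standard deviation, and flags any beyond 2 in absolute
value. The cutoff of 2 is only a rule of thumb that I am assuming here,
and the data are invented with one planted outlier.

```python
import math

def ols_line(xs, ys):
    # Least-squares intercept and slope for y = a + b*x.
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def scaled_residuals(xs, ys):
    # Residuals divided by the residual standard deviation (n - 2 df).
    a, b = ols_line(xs, ys)
    res = [y - (a + b * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(r * r for r in res) / (len(xs) - 2))
    return [r / s for r in res]

# Hypothetical data: an exact line y = 1 + 2x with one planted outlier.
xs = list(range(10))
ys = [1 + 2 * x for x in xs]
ys[5] += 20

intercept, slope = ols_line(xs, ys)  # the fit itself always goes through
flagged = [i for i, r in enumerate(scaled_residuals(xs, ys)) if abs(r) > 2]
print(intercept, slope, flagged)
```

The fit is computed without complaint either way; the outlier shows up
only when the residuals are examined, and it is the tests built on those
residuals that it would distort.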
[ ... ]

--
Rich Ulrich, wpilib@xxxxxxxx
http://www.pitt.edu/~wpilib/index.html


