Re: assumption of Classification



He's asking if these procedures make distributional assumptions.
Classification trees do not. Most clustering algorithms (k-means,
single link, average link, etc.) do not. However, there is a class of
clustering algorithms that assumes each cluster is multivariate
normal and then proceeds to estimate the means and covariances of
those clusters.
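
To make the contrast concrete, here is a minimal sketch in Python.
The function names (kmeans, mean_and_cov), the starting centers and
the toy data are all assumptions made up for illustration. The
k-means part uses nothing but distances. The second part computes the
per-cluster mean and covariance, which are the quantities a
normal-based clustering method would treat as model parameters. It is
not an implementation of such a method, only a picture of what it
estimates.

```python
import math

def kmeans(points, centers, iterations=20):
    """Plain Lloyd-style k-means: distances only, no distribution model."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        # Update step: each center moves to the mean of its group
        # (an empty group keeps its old center).
        new_centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, groups

def mean_and_cov(group):
    """Sample mean and covariance matrix of one cluster."""
    n, dim = len(group), len(group[0])
    mean = [sum(p[i] for p in group) / n for i in range(dim)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in group) / (n - 1)
            for j in range(dim)] for i in range(dim)]
    return mean, cov

# Hypothetical toy data: two well-separated, clearly non-normal groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 0.0),
          (20.0, 20.0), (21.0, 20.0), (20.0, 21.0), (20.0, 30.0)]
centers, groups = kmeans(points, [(0.0, 0.0), (20.0, 20.0)])
mean0, cov0 = mean_and_cov(groups[0])
print(centers)
print(mean0, cov0)
```

Nothing in kmeans refers to a density. A real model-based method
would estimate the means and covariances jointly with the cluster
memberships (typically by EM) rather than after the fact as done
here.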

Nonetheless, normality is not the only assumption to be checked. Every
method has its own list of assumptions and you should make sure that
your data agree with the method you choose.
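
As one example of such a check, here is a rough Python sketch that
computes sample skewness and excess kurtosis for a single feature.
The function name and the two small data sets are hypothetical, and
this is only a descriptive summary, not a formal normality test.
Values far from zero suggest that the feature is not close to normal.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal)."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4 (population form, divisor n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical feature values: one symmetric, one with a long right tail.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 1.0, 2.0, 10.0]

skew_sym, kurt_sym = skewness_and_excess_kurtosis(symmetric)
skew_tail, kurt_tail = skewness_and_excess_kurtosis(right_tailed)
print(skew_sym, kurt_sym)
print(skew_tail, kurt_tail)
```

With real data you would run this per feature and, for the
normal-based methods, per class or cluster, and you would want far
more than a handful of observations before reading much into the
numbers.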


Richard Ulrich wrote:
> On 25 Apr 2005 11:49:26 -0700, wu_cheng2001@xxxxxxxxxxx (apple0811)
> wrote:
>
> > Hi, everyone,
> >
> > I am doing classification of voice signal.
> >
> > Since the voice signal is very virable[*], for sure it is not
> > normal.
>
> Being very "variable"* usually means that a feature has
> a large standard deviation, and that's easily possible
> with Normal data.
>
> Highly skewed? Discrete? Multi-modal? - those are more
> precise descriptions.
>
> >
> > Can I still use clustering analysis, tree method, QDA, RDA method
> > to classify?
> >
> > In fact, I have already applied these method, but I haven't check
> > normality.
> >
> > I am afraid these method are wrong for non-parameter[*] data.
>
> "Non-parametric* data" is not a very useful term.
> In my experience, the user sometimes should have said that
> he expects to use ranks, and sometimes should have said
> that there are discrete categories. - And you seem to be
> saying something otherwise.
>
> --
> Rich Ulrich, wpilib@xxxxxxxx
> http://www.pitt.edu/~wpilib/index.html
