Re: Finding useful functions- part 1

From: Michael Olea (oleaj_at_sbcglobal.net)
Date: 11/01/04


Date: Mon, 01 Nov 2004 19:44:28 GMT

in article Opmcr6AYuJhBFwk5@longley.demon.co.uk, David Longley at
David@longley.demon.co.uk wrote on 10/31/04 12:00 AM:

> In article <BDA99812.C01A%oleaj@sbcglobal.net>, Michael Olea
> <oleaj@sbcglobal.net> writes

[snip]

>>
>> Yesterday I read
>> Natural Kinds for the fourth time - maybe I am just slow, but I keep getting
>> more out of that one essay. Bear in mind that I actually write cluster
>> analysis code, so the ideas he wrestles with in Natural Kinds are not just
>> abstractions, but matters of personal practical concern. Cluster Analysis is
>> after all an attempt to automate a rigorous elucidation of natural kinds.
>>
>
> The one thing one soon learns after using cluster analysis practically
> is that cluster analysis (agglomerative or divisive) *always* produces
> clusters, just like factor analysis always produces factors (the two are
> closely related).

Well, the statement that Factor Analysis always produces factors is another
way of stating that every finite-dimensional vector space has a finite
basis. The issue is whether or not the analysis yields a reduction in the
dimension of the basis (or, as in the case of JPEG, a useful tradeoff
between compression and fidelity), but yes, point taken. Any diligent
application of these tools, in the cluster analysis form of the problem,
therefore begins with an assessment of "clustering tendency", and concludes
with an investigation of "cluster validity".
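
To make "clustering tendency" concrete, here is a minimal sketch of one
common check, the Hopkins statistic, in plain Python. The function names,
the choice of the bounding box as the sampling window, and the toy data are
my own assumptions for illustration, not anything taken from a particular
package. Values near 0.5 suggest the data look like uniform noise (so any
clusters an algorithm reports deserve suspicion); values near 1 suggest
real clumping.

```python
import math
import random

def nearest_distance(point, data, skip_index=None):
    """Distance from `point` to its nearest neighbour in `data`,
    optionally ignoring the entry at `skip_index` (the point itself)."""
    return min(math.dist(point, q)
               for j, q in enumerate(data) if j != skip_index)

def hopkins(data, m=None, seed=0):
    """Hopkins statistic for a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    m = m or max(1, n // 10)
    # Sampling window: the axis-aligned bounding box of the data (assumption).
    lows = [min(p[k] for p in data) for k in range(d)]
    highs = [max(p[k] for p in data) for k in range(d)]
    # u: nearest-data distances from m uniformly placed probe points.
    u_sum = 0.0
    for _ in range(m):
        probe = [rng.uniform(lows[k], highs[k]) for k in range(d)]
        u_sum += nearest_distance(probe, data) ** d
    # w: nearest-neighbour distances for m randomly chosen real points.
    w_sum = 0.0
    for i in rng.sample(range(n), m):
        w_sum += nearest_distance(data[i], data, skip_index=i) ** d
    return u_sum / (u_sum + w_sum)

# Toy data: uniform noise versus two tight Gaussian blobs.
rng = random.Random(1)
uniform = [(rng.random(), rng.random()) for _ in range(300)]
blobs = [(rng.gauss(c, 0.05), rng.gauss(c, 0.05))
         for c in (0.0, 5.0) for _ in range(150)]
h_uniform = hopkins(uniform, m=100)
h_blobs = hopkins(blobs, m=100)
print(h_uniform, h_blobs)
```

A clustering routine would happily return clusters for both data sets; the
point of a check like this is to ask first whether there is anything there
to find.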

The working assumption is that there is structure in the world, regularities
to be unveiled, not necessarily in tune with our "innate subjective spacing
of qualities". An interesting paper in which clustering (an agglomerative
hierarchical clustering) is based not on a metric, or similarity function,
but on mutual information is:

[A] E Schneidman, W Bialek, & MJ Berry II, An information theoretic
approach to the functional classification of neurons, in Advances in Neural
Information Processing Systems 15, S Becker, S Thrun & K Obermayer, eds,
pp 197-204 (MIT Press, Cambridge, 2003).

Available at:
http://www.princeton.edu/~wbialek/our_papers/schneidman+al_03a.pdf

/Start Abstract/

A population of neurons typically exhibits a broad diversity of responses to
sensory inputs. The intuitive notion of functional classification is that
cells can be clustered so that most of the diversity is captured by the
identity of the clusters rather than by individuals within clusters. We show
how this intuition can be made precise using information theory, without any
need to introduce a metric on the space of stimuli or responses. Applied to
the retinal ganglion cells of the salamander, this approach recovers
classical results, but also provides clear evidence for subclasses beyond
those identified previously. Further, we find that each of the ganglion
cells is functionally unique, and that even within the same subclass only a
few spikes are needed to reliably distinguish between cells.

\End Abstract\
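
For a feel of what clustering without a metric can look like, here is a
small sketch of greedy agglomerative merging driven by information loss, in
the style of the agglomerative information bottleneck. I am not claiming
this is the procedure used in [A]; it is a generic illustration. Each "cell"
is assumed to be summarized by a prior weight and a distribution over a few
discrete response symbols, and the numbers below are made up. At each step
the two clusters whose merger loses the least mutual information between
cluster identity and response are joined; that loss is the weighted
Jensen-Shannon divergence of their response distributions.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def merge_cost(wa, pa, wb, pb):
    """Bits of I(cluster; response) lost by merging two clusters
    with weights wa, wb and response distributions pa, pb."""
    w = wa + wb
    mix = [(wa * x + wb * y) / w for x, y in zip(pa, pb)]
    return w * entropy(mix) - wa * entropy(pa) - wb * entropy(pb)

def agglomerate(weights, dists):
    """Greedy merging; returns a list of (merged members, bits lost)."""
    clusters = [([i], w, list(p))
                for i, (w, p) in enumerate(zip(weights, dists))]
    history = []
    while len(clusters) > 1:
        # Pick the pair of clusters whose merger costs the least information.
        cost, a, b = min(
            (merge_cost(clusters[a][1], clusters[a][2],
                        clusters[b][1], clusters[b][2]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters)))
        (ma, wa, pa), (mb, wb, pb) = clusters[a], clusters[b]
        w = wa + wb
        merged = (ma + mb, w, [(wa * x + wb * y) / w for x, y in zip(pa, pb)])
        history.append((sorted(ma + mb), cost))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return history

# Hypothetical response distributions for four cells, equally weighted.
weights = [0.25, 0.25, 0.25, 0.25]
dists = [[0.8, 0.1, 0.1],
         [0.7, 0.2, 0.1],
         [0.1, 0.1, 0.8],
         [0.2, 0.2, 0.6]]
history = agglomerate(weights, dists)
for members, cost in history:
    print(members, round(cost, 4))
```

The per-step losses sum to the mutual information between cell identity and
response that the unmerged cells carried, so a sharp jump in loss is one
natural place to stop merging.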

> In many applications (apart from the nice concrete
> biological ones used to illustrate them) there's nothing natural about
> them <g>

This is probably how grue emeralds were first discovered ;-)

> They're descriptive statistical tools. cf. first website
> reference for the beginning of a long series which took just this line
> and then think "Fragments".

Back to the question I think Modlin was contemplating: are there processes
in primate cortex that are analogous in their operation to such tools? Or,
to put it another way:

Atick JJ, Redlich AN (1992) What does the retina know about natural scenes?
Neural Computation, 4, 196-210.

Barlow HB (2001) Redundancy reduction revisited. Network: Computation in
Neural Systems, 12, 241-253.

Olshausen BA, Field DJ (1996a). Emergence of simple-cell receptive field
properties by learning a sparse code for natural images. Nature, 381,
607-609.

(and many more)

http://redwood.ucdavis.edu/bruno/
http://www-2.cs.cmu.edu/~lewicki/
http://www.cnbc.cmu.edu/~tai/
http://www.jneurosci.org/cgi/content/abstract/13/11/4700


