Re: Clustering Software
- From: "Reef Fish" <Large_Nassau_Grouper@xxxxxxxxx>
- Date: 10 Jun 2006 19:01:21 -0700
Richard Wright wrote:
On 10 Jun 2006 14:42:32 -0700, "Reef Fish"
<Large_Nassau_Grouper@xxxxxxxxx> wrote:
Richard Wright wrote:
Are you using 'primitive' as a description or a condemnation?
It was certainly a concise and accurate description.
You can imagine the rest any way you like.
In the evolutionary sense things can be primitive yet satisfactory.
just as there are things primitive that are no longer satisfactory.
The human hand is about as primitive a forelimb as you can find
you can still rub a stick to make a fire (when you have no other way)
or use a piece of rock for a cutting tool.
Good one, Bob. You snip part of what I said after that quote, and then
make the very same point in an attempt to rebut my truncated
statement!
The truncated statement added nothing to your point that something
primitive CAN be very useful, and my point was the parallel that
something primitive CAN be pretty useless too. What did YOU miss?
What I said was: "The human hand is about as primitive a forelimb as
you can find among mammals - primitive in the sense that it is close
to the ancestral vertebrate form and can perform diverse tasks -
picking up food, flying aircraft, and carrying out unmentionable
acts."
By contrast the horse's hoof is about as advanced/sophisticated as you
can get, but it can do almost nothing except run and kick.
What's so advanced/sophisticated about a horse's hoof?
That wasn't meant to be a question but the lead in to my NEXT line
about the horse's mouth.
You're rather humor impaired, aren't you?
It is advanced because it has evolved far away from the ancestral
form. It is sophisticated because it is highly adapted for its
(limited) function. The hands of members of this newsgroup are
primitive, but their brains are advanced.
It may or may not be desirable for a property to be primitive. It may
or may not be desirable for a property to be advanced. That evaluation
depends on adaptedness under constantly changing circumstances. BTW,
the customary evolutionary terms are 'primitive' and 'derived', but I
used the more intuitive word advanced because I suspected these
matters were out of your field.
I thought a horse's mouth is much more advanced and sophisticated
because EVERYONE would rather hear it from the horse's mouth than
from some other source.
Whoooosh -- that escaped you completely didn't it? Could have saved
your two long paragraphs of pointless elaboration about the hoofs.
I am out of my field here,
You could've fooled me had you not been so honest to confess.
but am wondering this. Should I assume that
primitive equates with uselessness in clustering algorithms? Has the
average linkage algorithm been shown to be unsatisfactory in
applications?
Sorry you missed my previous posts on the subject, in which I had
challenged some poster(s) to name ONE single discovery in the
past decades that could be attributed to the use of the clustering
algorithms, of which the min, ave, and max are the most commonly
used ones and the CRUDEST ones.
Back to statistics. Sorry, I did miss the earlier posts. But I am
still none the wiser in relation to my particular question. Is there
a consensus that average linkage clustering produces worse results
(empirically) than more advanced methods?
Let me give you my answer this way: the question of consensus is
irrelevant. Whether the average linkage is BETTER or WORSE
than more advanced methods (whatever that means), they
accomplished the same "nothing". I am sure there is NO consensus
on that one.
But why should I need any one to agree with me?
I've put in DECADES of serious effort in the subject, and have been
Program Chairman of National and International Meetings on
Classification and Clustering, referee for the Journal of
Classification for many years, while most of the other folks are just
dabbling beginners, thinking they know something, but really haven't
even scratched the surface of the subject.
Please don't come back with a new thesis that CRUDE oil is such
an essential commodity that it's ruling the present world because
the US economy is going down the drain because of rising CRUDE
oil prices ... that "crude" is a desirable property just as "primitive"
is.
That's your crazy idea, not mine.
You really don't have any sense of humor, do you?
-- Bob.
Maybe I have misunderstood things and you are just
having a go at the ironical self-effacement that I saw in John's claim
about one-time sophistication.
On 10 Jun 2006 11:28:17 -0700, "Reef Fish"
<Large_Nassau_Grouper@xxxxxxxxx> wrote:
John Uebersax wrote:
My program, CLUSBAS, is now uploaded and can be retrieved here:
http://ourworld.compuserve.com/homepages/jsuebersax/clusbas.zip
Details:
1. Average-linkage, hierarchical cluster analysis
2. Interactive, very easy to use (two commands!)
3. User supplies similarity/dissimilarity matrix
4. Runs in DOS window (should be no problem)
5. Limited to 100 objects (designed for variable/item clustering),
but this can be increased.
6. Source code included (two versions: fortran and QuickBasic)
Note: this was once a very sophisticated mainframe program,
You're joking of course.
The "average" linkage has the same algorithm as the "minimum"
and the "maximum" linkage (the only difference is updating the
new cluster value by "average" rather than "min" or "max"), and
is the most primitive of ALL clustering algorithms.
   1    2    3    4         (1,3)     2     4                (1,2,3)   4
1.00  .50  .76  .50   (1,3)     1  .585    .5        (1,2,3)       1  .5
 .50 1.00  .67  .50       2           1    .5  --->        4           1
 .76  .67 1.00  .50       4                 1
 .50  .50  .50 1.00
which was what was done by your program on a similarity matrix of
correlations (since the self similarity is 1.00).
If the algorithm had been the MIN, the value corresponding to
.585 which is AVE(.5, .67) would have been MIN(.5, .67) = .5
and the MAX algorithm would have yielded MAX(.5, .67) = .67
and the similarity matrix reduces to a size one less, and the
identical algorithm continues until two groups are merged to 1.
Can't find a LESS sophisticated, or more simple-minded algorithm
than those three!
Even Afonso should be able to program ALL THREE algorithms
in one (by offering a choice 1 for min, 2 for ave and 3 for max)
in about 15 minutes or less. :-)
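
For illustration only, here is a minimal Python sketch of the three-in-one
program described above, run on the sample similarity matrix from this
thread. It is not the CLUSBAS source. The function name linkage_merge, the
method keywords, and the tie-breaking rule are assumptions made for the
example. The "ave" option uses the simple average of the two merged
clusters' similarities, as in the description above; a size-weighted group
average is a different update, and the thread does not say which one
CLUSBAS uses.

```python
def linkage_merge(sim, method="ave"):
    """Agglomerative clustering on a symmetric similarity matrix.

    Returns a list of (members_a, members_b, similarity) merges.
    Hypothetical helper for illustration; not taken from CLUSBAS.
    """
    # The three algorithms differ only in this update rule.
    combine = {
        "min": min,
        "ave": lambda a, b: (a + b) / 2.0,  # simple (unweighted) average
        "max": max,
    }[method]

    # Active clusters: label -> tuple of original 1-based object numbers.
    clusters = {i: (i + 1,) for i in range(len(sim))}
    # Similarities between active clusters, keyed by (low, high) label.
    s = {(i, j): sim[i][j] for i in clusters for j in clusters if i < j}
    merges = []

    while len(clusters) > 1:
        # Most similar pair; ties go to the lowest labels (an assumption).
        (i, j), best = max(
            s.items(), key=lambda kv: (kv[1], -kv[0][0], -kv[0][1])
        )
        merges.append((clusters[i], clusters[j], best))
        # Fold cluster j into cluster i and update i's similarities.
        for k in clusters:
            if k in (i, j):
                continue
            key_i = (min(i, k), max(i, k))
            key_j = (min(j, k), max(j, k))
            s[key_i] = combine(s[key_i], s[key_j])
            del s[key_j]
        del s[(i, j)]
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


if __name__ == "__main__":
    sample = [
        [1.00, 0.50, 0.76, 0.50],
        [0.50, 1.00, 0.67, 0.50],
        [0.76, 0.67, 1.00, 0.50],
        [0.50, 0.50, 0.50, 1.00],
    ]
    for method in ("min", "ave", "max"):
        print(method)
        for a, b, value in linkage_merge(sample, method):
            print(f"  {a} joined by {b} at {value:.3f}")
```

With the "ave" option the merges on the sample matrix occur at .760, .585
and .500, in the same order as the reduction shown above.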
-- Bob.
with fancy
CalComp plots and the whole works. I haven't fully ported it to PCs
because I thought something better would come along--but it really
hasn't.
However, if there's interest I can restore more of the original program
features in the PC version.
Sample input:
1.00 .50 .76 .50
.50 1.00 .67 .50
.76 .67 1.00 .50
.50 .50 .50 1.00
Sample output:
GROUP 1 IS JOINED BY GROUP 3. N IS 2 ITER = 1 SIM = 0.760
GROUP 1 IS JOINED BY GROUP 2. N IS 3 ITER = 2 SIM = 0.585
GROUP 1 IS JOINED BY GROUP 4. N IS 4 ITER = 3 SIM = 0.500
     1   3   2   4
     *   *   *   *
1    *****   *   *
       *     *   *
2      *******   *
          *      *
3         ********
Hope this helps.
John Uebersax PhD