Re: basque and circassian

From: Miguel Carrasquer (mcv_at_wxs.nl)
Date: 02/21/05


Date: Mon, 21 Feb 2005 19:16:05 GMT

On Mon, 21 Feb 2005 18:13:43 +0000, Richard Herring <junk@[127.0.0.1]> wrote:

>In message <421807BF.D67@alphalink.com.au>, Jacques Guy
><jguy@alphalink.com.au> writes
>>
>Lost for words?

I thought so too, but yesterday I was experimenting with different
Usenet servers, and I saw the explanation.
I never got to see the binary posting.

For the benefit of others who may have had the same problem,
I'm reposting Jacques' message.

===========================================================

I give up!

For some obscure reason, when I paste the text
of my post, all is well on my screen, but nothing
shows up on the newsgroup. I suspect it's because
some lines (branches of the two opposing trees)
are too long.

Well, no, not yet. I know posting binaries is
a no-no, but it's only 11k. So I am attaching
the zip file of my intended post. Take it as
a sort of test--and I'm getting sick of it all.

<attachment: SFORZA.ZIP, containing the following:>

P.T. Daniels had written:

The alleged correlation between genes and language is fraudulent (as I
have said in print and as Jacques went to a great deal of trouble to
spell out explicitly in an early number of LINGUIST List (I don't know
if this link will still get you there, but it's LINGUIST List 3.81 if it
doesn't)): http://linguistlist.org/issues/3/3-81.html#1

I checked the above link, and the article is there, but the trees
are mangled beyond understanding.

So let me try posting this, which I retrieve from an old, old
backup CD.

One of these days I shall have to put it up somewhere on
the Net, on some Web page.

Here it is:

These are two articles which I wrote last year -- or was it
two years ago -- and sent to the moderated group "LINGUIST"
(linguist@tamvm1.tamu.edu). In a nutshell, Cavalli-Sforza and the late Wilson
have ignored well-known data that did not conform to their
hypotheses, and Sforza's conclusions about language and genetics
contradict the very data he presents. Here are those articles.

  IS THERE A CORRELATION BETWEEN GENES, PEOPLES AND LANGUAGES?

             (Cavalli-Sforza's hypotheses examined)

In November 1991 Scientific American published an article by Prof.
Cavalli-Sforza entitled "Genes, Peoples and Languages" in which
was claimed, to summarize the author's findings in his own words:

  Our [genetic] reconstruction finds striking parallels in a
  recent classification of languages. Genes, people and
  languages have thus diverged in tandem.

The author's thesis seems to have caused a bit of a ruckus, to say the
least, in linguistic circles at the time, a ruckus which has not
completely abated yet.

The core of the evidence is illustrated in two facing trees, on pages
76 and 77 of the Scientific American article, reproduced in Figure 1,
and headed by this caption: "Correlation of Peoples and Languages".

This dual tree lists 38 populations, classified genetically; in its
linguistic part proper 19 language families are listed [note 1].

There is not much agreement among linguists about the
classification of world languages, especially at the very large family
level used by Cavalli-Sforza, so that it would be rather futile
to argue over the inclusion or omission of this or that language
group in this or that part of the tree presented.

Let us, then, grant the author's linguistic and genetic trees, and
taking them as correct and true, examine whether and how strongly the
evidence presented therein shows a correlation between peoples and
languages.

Correlation of Peoples and Languages

            Genetics Populations Linguistic Families

              ______.---------- Mbuti Pygmy ----(Original language unknown)
             | |____.----- W.African ---.__ Niger-Khordofanian
  ___________| |__.-- Bantu ---'
 | | `-- Nilotic ------ Nilo-Saharian
 | |__.-------------- San(Bushman) ------ Khoisan
 | `-------------- Ethiopian ---.
 | .------ Berber ---|-- Afro-Asiatic---.
 | .--| _.-- SW Asian ---' |
 | __| `-| `-- Iranian ---. |
 | | | `---- European ---|__ Indo-Europ.____|
 | .--| `--------- Sardinian ---| ====|============.
 | ____| |_____.------ Indian ---' | |
 | | | `------ SE Indian ------ Dravidian -----| |
 | | `--------------- Lapp ---.__ Uralic ________| |
 | | _____.-- Samoyed ---' ========|============|
 | | __| `-- Mongol ---. |-Nostratic |
 | | | | .----- Tibetan ** | ** Sino-Tibetan | |==Eurasiatic
 | _| .--| `--|__.-- Korean ---| | |
 | | | | | `-- Japanese ---|__ Altaic ________| |
 | | | .--| `----------- Ainu ---| ========:============|
 | | | | | .----------- Siberian ---' : |
 | | | | `--|__.-------- Eskimo ------ Eskimo-Aleut ==:============|
 | | `--| `-------- Chukchi ------ Chukchi-Kam. ==:============'
 | | | __.-------- S.Amerind ---. :
 | | | .--| `-------- C.Amerind ---|-- Amerind .......:
 | | `--| `----------- N.Amerind ---'
 |______| `-------------- NW Amerind ------ Na-Dene -------
        | .--------- S.Chinese ****** Sino-Tibetan
        | |______.-- Mon-Khmer ------ Austro-Asiatic-.
        | _____| `-- Thai ------ Daic-----------|
        | | |--------- Indonesian ---. |-Austric
        | _____| |--------- Malaysian ---| |
        || | `--------- Philippine ---|-- Austronesian --'
        || |.-------------- Polynesian ---|
        || `|___.---------- Micronesian ---'
         | `---------- Melanesian ---.__ Indo-Pacific --
         |___.----------------- New Guinean ---'
             `----------------- Australian ------ Australian ----

(Linguistic classification from Merritt Ruhlen, A GUIDE TO THE WORLD'S LANGUAGES)

                             Figure 1

Let us examine the linguistic tree first.

From the outset, it is necessary to clarify the caption "Linguistic
Families" which heads the linguistic tree proper. The leftmost
branches of the linguistic tree do not correspond to any linguistic
classification, but they are merely lines used to connect a language
family (e.g. Indo-European) to the populations that speak it (e.g.
Iranians, Europeans, Sardinians, Indians). There is, of course, and
all linguists will agree on this for once, no such thing as a
Sardinian or a European branch of Indo-European, and as the caption
above them clearly says, those terms refer to populations.
The final branchings of the linguistic tree, then, do not represent
linguistic, but demographic data, and this fact must be kept clearly
in mind when seeking correlations between language families and
genetic populations.

Next, two branches marked "Sino-Tibetan" occur in widely separated
places in the linguistic tree. There should be only one such branch
that should link directly the Tibetan and South Chinese populations,
just like, for instance, the Indo-European branch is linked to the
Iranian, European, Sardinian, and Indian populations, who are speakers
of Indo-European. However, since Tibetans and Southern Chinese are so
wide apart on the genetic tree, they could not be linked directly
without crossing branches of the linguistic tree and making it rather
difficult to peruse.

Finally, we must clarify the status of the linguistic tree near its
origin. There are two conflicting trees superimposed there,
representing two conflicting linguistic hypotheses. One is the
Nostratic hypothesis, according to which the language families from
Afro-Asiatic to Altaic, and perhaps up to Amerind, are ultimately
related, forming a Nostratic superfamily. According to the other
theory, it is the language families extending from Indo-European to
Chukchi-Kamchatkan that constitute another superfamily: Eurasiatic.
Those two hypotheses are highly speculative and neither is widely
accepted. Likewise, the reality of the Austric superfamily,
shown as encompassing Austro-Asiatic, Daic, and Austronesian, is
disputed. It would be an unfair test that sought to match genetic
groupings against linguistic groupings about which linguists
themselves are at best unsure and at worst bitterly divided, and that
part of the tree dealing with those three superfamilies, accordingly,
ought to be abstracted. Thus the dual tree of Fig. 1 is equivalent to
the more easily understood graph of Fig. 2, consisting of the original
genetic tree, with the families of languages spoken by the various
populations simply added next to the population names.

It is immediately evident that if there is a correlation between
language and genes, it is not a perfect fit. For instance, the first
two closest populations in the genetic tree, the Bantu and the
Nilotic, speak languages belonging to two separate families,
Niger-Kordofanian and Nilo-Saharan, which remain quite unrelated
whether or not we subscribe to the theories of the Nostratic or
Eurasiatic schools (see Fig. 1). Scanning the tree further, we come
across the Afro-Asiatic-speaking Ethiopians, who are genetically
maximally distant from the other two Afro-Asiatic-speaking
populations, the Berbers and Southwestern Asians (i.e. the
populations of the Middle East). The Southwest Asians themselves are
genetically closest to the Iranians who speak not Afro-Asiatic like
them, but Indo-European; while another Indo-European-speaking
population, the Indians, are more closely related to the Dravidian
speakers of Southeast India than to any other Indo-European speakers
(Iranian, European, or Sardinian).

            Genetics Populations Linguistic Families

              ______.---------- Mbuti Pygmy (Original language unknown)
             | |____.----- W.African Niger-Khordofanian
  ___________| |__.-- Bantu Niger-Khordofanian
 | | `-- Nilotic Nilo-Saharian
 | |__.-------------- San(Bushman) Khoisan
 | `-------------- Ethiopian Afro-Asiatic
 | .------ Berber Afro-Asiatic
 | .--| _.-- SW Asian Afro-Asiatic
 | __| `-| `-- Iranian Indo-European
 | | | `---- European Indo-European
 | .--| `--------- Sardinian Indo-European
 | ____| |_____.------ Indian Indo-European
 | | | `------ SE Indian Dravidian
 | | `--------------- Lapp Uralic
 | | _____.-- Samoyed Uralic
 | | __| `-- Mongol Altaic
 | | | | .----- Tibetan Sino-Tibetan
 | _| .--| `--|__.-- Korean Altaic
 | | | | | `-- Japanese Altaic
 | | | .--| `----------- Ainu Altaic
 | | | | | .----------- Siberian Altaic
 | | | | `--|__.-------- Eskimo Eskimo-Aleut
 | | `--| `-------- Chukchi Chukchi-Kamchatkan
 | | | __.-------- S.Amerind Amerind
 | | | .--| `-------- C.Amerind Amerind
 | | `--| `----------- N.Amerind Amerind
 |______| `-------------- NW Amerind Na-Dene
        | .--------- S.Chinese Sino-Tibetan
        | |______.-- Mon-Khmer Austro-Asiatic
        | _____| `-- Thai Daic
        | | |--------- Indonesian Austronesian
        | _____| |--------- Malaysian Austronesian
        || | `--------- Philippine Austronesian
        || |.-------------- Polynesian Austronesian
        || `|___.---------- Micronesian Austronesian
         | `---------- Melanesian Indo-Pacific
         |___.----------------- New Guinean Indo-Pacific
             `----------------- Australian Australian

                         Figure 2

1. Sino-Tibetan.

Sino-Tibetan is shown as spoken by Tibetans and Southern Chinese [note
2]. Tibetans, however, are shown in the genetic tree to be related to
(from closest to remotest relatives): Koreans and Japanese (Altaic);
Samoyeds (Uralic) and Mongols (Altaic); then to Ainus (Altaic); next
to Siberians (Altaic), Eskimos (Eskimo-Aleut) and Chukchis
(Chukchi-Kamchatkan); then to speakers of the great families of
American Indian languages (Na-Dene and the rest, lumped here under
"Amerind"); and finally to the Chinese. In other words, the only
populations more distantly related to the Tibetans than their fellow
Sino-Tibetan speakers are those found in Africa: Mbuti, West African,
Bantu, Nilotic, Bushmen, Ethiopian.

Traversing the genetic tree in the same manner as has just been done,
now visiting the nodes that connect the Southern Chinese to the
Tibetans, one finds that the closest relatives of the
Sino-Tibetan-speaking Southern Chinese are, again in order of
increasing genetic distance: the Mon-Khmer, Thai and Malay
populations, speakers of three distinct language families (Mon-Khmer,
Thai and Austronesian); then the Polynesians, Micronesians and
Melanesians, speakers of Austronesian and of "Indo-Pacific" (more
properly: Non-Austronesian); next the New-Guineans (Non-Austronesian)
and Australian aborigines (Australian); after which only then do we
reach the Tibetans.

The correlation between Sino-Tibetan and genetics is thus strongly
negative if anything.

2. Afro-Asiatic.

This is the language family also called, more transparently,
Hamito-Semitic. Afro-Asiatic is spoken by the Ethiopians, Berbers,
North Africans, and Southwest Asians (by which term Cavalli-Sforza
appears to refer to the populations of the Middle East).

The closest relatives, genetically, of the Ethiopians are the San
Bushmen, sole speakers of Khoisan; then, again in order of decreasing
relatedness: Mbuti Pygmies, speakers of an isolate, West Africans
and Bantus, speakers of Niger-Kordofanian, and Nilotic speakers of
Nilo-Saharan; next, to connect Ethiopians to their fellow Afro-Asiatic
speakers of North Africa and the Middle East, we have to pass through
the origin of the tree. Thus the Ethiopians are maximally distant
genetically from their fellow Afro-Asiatic speakers. The correlation
here between genes and language is maximally negative.

Consider now another Afro-Asiatic-speaking population: the Southwest
Asians. Their closest genetic relatives are the Iranians, speakers, of
course, of Indo-European. Their next closest relatives, the
"Europeans", are again Indo-European speakers. Only then do they meet
with their Berber and North-African fellow Afro-Asiatic speakers.
Thus the genetic evidence presented shows Middle Eastern populations
as closer relatives of Indo-European speakers than of their own. A
negative correlation again.

3. Indo-European.

Four populations only are listed as speaking Indo-European: Iranians,
Europeans, Sardinians, and Indians. The Iranians, we have seen, are
most closely related to the Afro-Asiatic speakers of the Middle East;
the Europeans (presumably Romance, Germanic and Slavic speakers) are
more closely related to the Iranians (I-E), Middle Easterners, Berbers
and North Africans (all three Afro-Asiatic speakers) than they are to the
Romance-speaking Sardinians. The Indo-European-speaking Indians
themselves have for closest relatives the Dravidian speakers of South
India, and are no more closely related to other Indo-European speakers
than they are to Afro-Asiatic speakers. Thus, out of four
Indo-European populations, none has for closest relative another
speaker of Indo-European.

4. Uralic.

Only two member populations here: the Lapps, who are Caucasoids related
to the Hamito-Semitic, Indo-European and Dravidian speakers of
North Africa, the Middle East, Europe and the Indian subcontinent; and the
Samoyeds, who are relatives of Asian and American populations
representing six different great language families: Sino-Tibetan,
Altaic, Eskimo-Aleut, Chukchi-Kamchatkan, Amerind and Na-Dene. The
correlation between language and genetics is here again nonexistent
or negative.

5. Altaic.

Five member populations: Mongols, Koreans, Japanese, Ainus, and
Siberians. As already seen, the Mongols' closest relatives are the
Uralic-speaking Samoyeds. Within these five, the only Altaic speakers
more closely related to each other than to a linguistic outsider are
the Koreans and the Japanese; but they are not more closely related to
the remaining Altaic speakers than they are to the Tibetans
(Sino-Tibetan) and Samoyeds (Uralic). The Siberians are closer
relatives to the Eskimos and Chukchis (Eskimo-Aleut and
Chukchi-Kamchatkan) than to any Altaic speakers; the Ainus are no more
closely related to the Koreans, Japanese and Mongols than they are to
the Tibetans and Samoyeds. Once again, no correlation.
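
The test applied in each of these sections can be sketched in a few lines of Python. What follows is only an illustration: the nested tuples are an assumed encoding of the small part of the genetic tree described in the prose above (Samoyeds with Mongols, Tibetans with Koreans and Japanese, Ainus next), and the function names are made up for the purpose. It finds the smallest branch of the genetic tree holding all speakers of a family and lists the linguistic outsiders inside it.

```python
# Assumed encoding of a fragment of the genetic tree, read off the prose above.
FRAGMENT = ((("Samoyed", "Mongol"), ("Tibetan", ("Korean", "Japanese"))), "Ainu")

FAMILY = {
    "Samoyed": "Uralic", "Mongol": "Altaic", "Tibetan": "Sino-Tibetan",
    "Korean": "Altaic", "Japanese": "Altaic", "Ainu": "Altaic",
}

def leaves(node):
    """All populations found under a node of the tree."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]

def smallest_clade(node, targets):
    """Populations in the smallest subtree that contains every target."""
    if isinstance(node, str):
        return [node]
    for child in node:
        if targets <= set(leaves(child)):
            return smallest_clade(child, targets)
    return leaves(node)

def outsiders(tree, family):
    """Populations inside the family's smallest clade that speak something else."""
    members = {p for p in leaves(tree) if FAMILY[p] == family}
    return sorted(p for p in smallest_clade(tree, members) if p not in members)

altaic_outsiders = outsiders(FRAGMENT, "Altaic")
print(altaic_outsiders)
```

An empty list would mean that the family forms a branch of its own in the genetic tree. For Altaic, in this fragment, the list is not empty.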

6. Amerind.

The three populations listed are indeed all more closely related to
one another than to any linguistic outsider.

7. Austronesian.

We have here five Austronesian-speaking populations: Indonesians,
Malays, Filipinos, Polynesians and Micronesians. Indonesians, Malays
and Filipinos are shown in the chart as equally closely related to
one another as to the Sino-Tibetan speakers of South China, the
Austroasiatic-speaking Mon-Khmer, and the Daic-speaking Thai. The
Austronesian-speaking Micronesians have for closest relatives not the
Polynesians (also Austronesian speakers) but the Melanesians, who are
given as speakers of Indo-Pacific. Again, no correlation.

8. Indo-Pacific.

Only two populations here: New-Guineans, whose closest relatives are
the Australian aborigines, members of an isolate language family
(Australian); then the Southern Chinese (Sino-Tibetan), Mon-Khmer
(Austroasiatic), Thai (Daic), the five Austronesian-speaking
populations listed, and finally the Melanesians (Indo-Pacific).
The other Indo-Pacific population is the Melanesians, whose closest
relatives are the Austronesian-speaking Micronesians, and next
Sino-Tibetan, Austroasiatic, and Daic speakers. The correlation here
between language and genes is again nonexistent, or negative.

9. Niger-Kordofanian.

Two member populations: the West Africans and the Bantu. The Bantu's
closest relatives are not the West Africans, but Nilotic populations,
speakers of Nilo-Saharan, an isolated language family. Once again, no
correlation.

Ten language families still remain to be examined, namely:

The Mbuti Pygmies' unnamed isolate.
Nilo-Saharan.
Khoisan.
Dravidian.
Eskimo-Aleut.
Chukchi-Kamchatkan.
Na-Dene.
Austroasiatic.
Daic.
Australian.

Each of those ten language families being represented by only one
population, there is nothing there to correlate: one cannot correlate
a single observation to anything.

Thus, in 10 language families out of the 19 used by the author, there
is nothing to correlate with the genetic data.
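
The count of 10 families out of 19 can be checked mechanically. The Python sketch below simply transcribes the population and family labels of Figure 2 into a table (the variable names are mine, and the family names are given in their usual spelling) and tallies how many populations represent each family.

```python
from collections import Counter

# Population -> language family, transcribed from Figure 2.
FIGURE_2 = {
    "Mbuti Pygmy": "(unknown)", "W.African": "Niger-Kordofanian",
    "Bantu": "Niger-Kordofanian", "Nilotic": "Nilo-Saharan",
    "San": "Khoisan", "Ethiopian": "Afro-Asiatic", "Berber": "Afro-Asiatic",
    "SW Asian": "Afro-Asiatic", "Iranian": "Indo-European",
    "European": "Indo-European", "Sardinian": "Indo-European",
    "Indian": "Indo-European", "SE Indian": "Dravidian", "Lapp": "Uralic",
    "Samoyed": "Uralic", "Mongol": "Altaic", "Tibetan": "Sino-Tibetan",
    "Korean": "Altaic", "Japanese": "Altaic", "Ainu": "Altaic",
    "Siberian": "Altaic", "Eskimo": "Eskimo-Aleut",
    "Chukchi": "Chukchi-Kamchatkan", "S.Amerind": "Amerind",
    "C.Amerind": "Amerind", "N.Amerind": "Amerind", "NW Amerind": "Na-Dene",
    "S.Chinese": "Sino-Tibetan", "Mon-Khmer": "Austro-Asiatic", "Thai": "Daic",
    "Indonesian": "Austronesian", "Malaysian": "Austronesian",
    "Philippine": "Austronesian", "Polynesian": "Austronesian",
    "Micronesian": "Austronesian", "Melanesian": "Indo-Pacific",
    "New Guinean": "Indo-Pacific", "Australian": "Australian",
}

counts = Counter(FIGURE_2.values())
# Families with a single population offer nothing to correlate.
singletons = sorted(f for f, n in counts.items() if n == 1)
testable = sorted(f for f, n in counts.items() if n > 1)

print(len(FIGURE_2), "populations,", len(counts), "families")
print(len(singletons), "families with one population:", singletons)
print(len(testable), "families with two or more:", testable)
```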

Of the 9 remaining language families we have observed only one case
where language and genetics concord: the American Indians (Na-Dene
speakers excluded). In the other 8 language families we have observed
either a total absence of correlation, or even a strongly negative
correlation in two cases: Afro-Asiatic and Sino-Tibetan.

                              FOOTNOTES

[note 1] The author had to repeat Sino-Tibetan in two different
positions in his tree, because it is spoken by two genetically
widely-divergent populations: Tibetans and Southern Chinese.

[note 2] One may wonder at the absence of the rest of the Chinese
population.

------------------------------------------------------------------------------

               GENES, PEOPLES AND LANGUAGES?
       (Scientific American, November 1991, pp.72-78)

       Cavalli-Sforza's hypotheses examined (Part II)

The reconstruction of genetic trees, like the one in the
article in question, and of the divergence of language
families are but particular instances of a much more
general problem:

Given a tree-shaped transmission network, a message is input at one
node, from which it travels to the terminal nodes without
backtracking. The network is noisy, that is, transmission is subject
to errors. The problem is: from the garbled versions of the original
message collected at the terminal nodes, reconstruct the network and
the amount of errors on each of its arcs (branches, if you prefer).

Replace "error" by "mutation", and "message" by "DNA genes" or
"mitochondrial genes" and you have a genetic model. Replace "error" by
"innovation" or "drift", and "message" by "basic sample vocabulary" or
whatever other data you see as representative of the languages in the
family being reconstructed, and you have a linguistic model.
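
The transmission model just described can be sketched as follows in Python. This is only an illustration under stated assumptions, not the simulation program discussed further on: the three-leaf network, its retention figures, the message size and all the names are invented, and every error is assumed to produce a brand-new item never seen elsewhere.

```python
import random

def transmit(message, retention, rng, counter):
    """Copy a message along one branch. Each item survives with probability
    `retention`; otherwise it is replaced by a brand-new item."""
    received = []
    for item in message:
        if rng.random() < retention:
            received.append(item)
        else:
            counter[0] += 1
            received.append(counter[0])  # an innovation, unique to this branch
    return received

def propagate(node, message, rng, counter, leaves):
    """Send the message down the tree, without backtracking, and collect the
    garbled versions that arrive at the terminal nodes."""
    name, retention, children = node
    received = transmit(message, retention, rng, counter)
    if not children:
        leaves[name] = received
    for child in children:
        propagate(child, received, rng, counter, leaves)
    return leaves

# Hypothetical network: (name, retention on the incoming branch, children).
TREE = ("root", 1.0, [
    ("AB", 0.90, [("A", 0.90, []), ("B", 0.95, [])]),
    ("C", 0.60, []),
])

size = 200
rng = random.Random(1)           # fixed seed so that a run can be repeated
counter = [size - 1]             # last item number handed out so far
leaves = propagate(TREE, list(range(size)), rng, counter, {})

for name, words in sorted(leaves.items()):
    kept = sum(1 for i, w in enumerate(words) if w == i)
    print(name, "retained", kept, "of", size, "original items")
```

Note that nothing in this sketch requires the retention figures to be equal from branch to branch, or constant through time.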

You will have noticed that no mention is made in there of the rate of
change, be it genetic mutations or linguistic innovations, contrary to
Cavalli-Sforza and Wilson, who both posit a mutation rate constant in
time and across the different human populations. Likewise, Swadesh
posited a universal rate of vocabulary replacement also constant
through time and across languages when he proposed his theory of
glottochronology, later re-christened "lexicostatistics". Readers
familiar with these two methods will have been struck by the remarkable
similarity of Sforza's and Wilson's methods with glottochronology and
lexicostatistics, even down to the terminology: "drift", "constant rates",
etc.

Yet, the assumption of a constant universal rate is completely
unnecessary for reconstructing trees, genetic or linguistic. Here is a
tree reconstructed under the assumption that the innovation rate of
linguistic features (be they lexical, grammatical, or whatever) varies
in time and across languages (I have deleted the language names, which
are irrelevant to the point, as you will later see):

          _95.-90----- (1)
  _88.-62| `-94----- (2)
 | | `-64--------- (3)
 | `-61------------- (4)
 | .-42------------- (5)
 | | _96.-78----- (6)
 | |-75| `-76----- (7)
 | | `-71--------- (8)
 | | _93.-87- (9)
 |-84| .-93| `-87- (10)
 | | | `-64----- (11)
 | `-87|-75--------- (12)
 | | _99.-86- (13)
 | `-82| `-78- (15)
_| `-87----- (14)
 |_94.-37------------- (16)
 | `-54------------- (17)
 | .-78--------- (18)
 | .-91|_89.-93----- (19)
 |-64| `-86----- (20)
 | |_86.-75--------- (21)
 | `-80--------- (22)
 | _79.-94--------- (23)
 | | `-80--------- (24)
 | | .-80--------- (25)
 |_70| | .-94----- (26)
     |_82|-85|_99.-96- (27)
         | `-97- (28)
         |-74--------- (29)
         `-66--------- (30)

The figures along each branch (arc, if you prefer)
represent the percentage of the message which has been
correctly transmitted. And here is the true tree:

          _87.-90----- (1)
  _87.-70| `-95----- (2)
 | | `-70--------- (3)
 | `-60------------- (4)
 | .-43------------- (5)
 | | _96.-78----- (6)
 | |-77| `-77----- (7)
 | | `-70--------- (8)
 | | _92.-88- (9)
 |-80| .-92| `-87- (10)
 | | | `-65----- (11)
 | `-84|-75--------- (12)
 | | .-85----- (13)
 | `-79|-89----- (14)
_| `-78----- (15)
 |-34----------------- (16)
 |-51----------------- (17)
 | .-79--------- (18)
 | .-91|_88.-91----- (19)
 |-64| `-88----- (20)
 | |_86.-79--------- (21)
 | `-78--------- (22)
 | _79.-93--------- (23)
 | | `-80--------- (24)
 | | .-83--------- (25)
 |_71| | .-95----- (26)
     |_82|-85|_98.-95- (27)
         | `-98- (28)
         |_94.-76----- (29)
             `-71----- (30)

This last tree is artificial. It is part of the output from a computer
simulation. Another part is the log of the history of the tree which,
for each split, gives its date, the number of items innovated since the
previous split, and lists them. The last part, computed from the
latter, is, to use a lexicostatistical term, a matrix of "percentages
of shared cognates" (it could equally well be "shared genes"), from
which the reconstructed tree was computed.
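
For readers unfamiliar with such a matrix, here is a minimal Python sketch of how one can be computed from the messages collected at the terminal nodes. The three four-item "wordlists" are made up for the illustration, as are the function names; two lists are taken to share an item when they hold the same value in the same slot.

```python
from itertools import combinations

def shared_percentage(a, b):
    """Percentage of slots in which two equal-length lists hold the same item."""
    if len(a) != len(b):
        raise ValueError("wordlists must have the same length")
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * same / len(a)

def shared_matrix(wordlists):
    """Symmetric table of shared percentages for every pair of lists."""
    names = sorted(wordlists)
    matrix = {(n, n): 100.0 for n in names}
    for p, q in combinations(names, 2):
        matrix[(p, q)] = matrix[(q, p)] = shared_percentage(wordlists[p], wordlists[q])
    return matrix

# Made-up data: each number stands for one form filling one slot of the list.
WORDLISTS = {
    "A": [1, 2, 3, 4],
    "B": [1, 2, 3, 9],
    "C": [1, 7, 8, 5],
}

matrix = shared_matrix(WORDLISTS)
for p, q in combinations(sorted(WORDLISTS), 2):
    print(p, q, matrix[(p, q)])
```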

In possession of such data, it is interesting to test various
reconstruction methods (clustering algorithms, if you prefer) by
observing how well their resulting trees fit the true tree. Classical
lexicostatistical methods all yield poor results. One of the worst
amongst the many clustering algorithms I once tested is the
minimal-spanning tree method, which, incidentally, is precisely what
Cavalli-Sforza used for the genetic tree p.76 of the article in
question:

    "Essentially, this concept describes the tree having the
    smallest total branch length"

                               (p.73, col.2)

The minimal-spanning tree method, applied blindly, also tends to
reconstruct trees with only binary splits. Observe the genetic tree,
p.76 of the Scientific American article: all binary splits, except for
the one and only four-way split of the Southern Chinese, Indonesian,
Malaysian, and Philippine populations. You have there a typical
artifact of a clustering algorithm about 30 years old, as the author
himself admits:

    "An example is furnished by a tree linking 15
    populations that Anthony F.W. Edwards, now at Cambridge,
    and I published 27 years ago"

               (p.73, col. 2, immediately above the former
               quotation)

Observe now the reconstructed tree above, and you will see, instead,
two, three, four and five-way splits [Footnote 1].

Cavalli-Sforza posits this model of genetic evolution, which is exactly
similar to that of glottochronology:

    "The evolutionary model we used is the simplest. It
    predicts that the branches will evolve equally fast".

                                  (p.73, col. 1)

    "The mitochondrial clock is based on the number of
    mutations that have accumulated.... Whereas we
    hypothesized that our gene frequencies had drifted at
    constant rates, the Wilson group hypothesized that their
    mitochondrial genes had mutated at constant rates."

                              (p.74, col.2)

Yet, Cavalli-Sforza is aware of the weaknesses of his model:

    "If one assumes that the rate of evolutionary change is
    constant along all branches, one can equate their
    lengths to the time elapsed since they diverged. Such
    rooted trees may also be subject to biases, however, if
    some branches have undergone more rapid evolutionary
    change than others".

                            (p.74, bottom of col.2)

He is also aware that there exist methods unaffected by
unequal and varying evolutionary rates:

    "Mathematical techniques of population genetics can
    minimize biases by accurately predicting rates of
    evolution".

(It is precisely such techniques which I used in the
reconstructed tree at the beginning of this article. The
word "predicting" in the quotation is a misnomer. The
correct word is "estimating").

Why, then, does he not use those robust methods? He does not
say [Footnote 2]. Extraordinary indeed, for the "gene map"
on p.74 of the article in question shows a glaring piece of
evidence that evolutionary rates fluctuate wildly. Consider
Iceland on that map. It has been colored dark green, showing
that from 0% to 1% of its population is Rh-negative. The
population of Iceland is about two-thirds of Scandinavian and
one-third of Irish descent. On that same map, Scandinavia,
Ireland, and the British Isles show from 16% to 25% and
above Rh-negative. The other populations with a proportion
of Rh-negative individuals similar to Iceland occupy
Madagascar, Australia and New Zealand, and the eastern half of Asia.
I may perhaps be forgiven for having believed, upon seeing
that map, that Iceland had been mistaken for Eskimo-
populated Greenland. Not at all. I went to the considerable
trouble of verifying the Icelandic data. Considerable,
because Cavalli-Sforza does not give his sources. And it appeared
that the aberrant case of Iceland is not only well-known
among geneticists, but even more aberrant than the "gene
map" shows.

     "Finally, tests were done on some 2000 Icelanders,
     mostly of precisely known birth-places within Iceland,
     for some twenty [blood-classification] systems. The
     results of the tests were then compared with the
     results of similar tests on the populations of the
     separate countries of the British Isles and
     Scandinavia, and of several European countries. A large
     quantity of data was fed into a computer, using a
     highly sophisticated programme, and it was anticipated
     that the result would be a clear-cut indication of
     either a Scandinavian or British origin, or perhaps a
     precise estimate of the proportion of genes derived
     from each of the two sources. Neither of these was
     found to be the case. The Icelanders showed a very
     marked difference from the populations of all other
     European countries, British, Scandinavian, and other,
     and even wide differences between the regions within
     Iceland itself."

                          (Mourant, 1983:79)

Before quoting further, I ask you to stop and ponder the
import of that last sentence: "The Icelanders showed a very
marked difference from the populations of all other European
countries, British, Scandinavian, and other, and even wide
differences between the regions within Iceland itself."

First, it is prime evidence of an exceedingly fast rate of
genetic drift.

Second, it is prime evidence that the "gene map" showing
Iceland uniformly dark green, at 0% to 1% Rh-negative, is
the artifact of having averaged the Rh-negative scores of
extremely divergent local populations.

Now to quote Mourant further:

    "Since there is no doubt that the original colonists of
    Iceland came almost exclusively from Scandinavia and the
    British Isles, there must have been great changes in the
    island gene frequencies since the colonization."

Indeed, and what might the reasons be?

    "Natural selection may have played a part, but there can
    be little doubt that we are witnessing what are mainly
    the effects of genetic drift due to severe epidemics,
    volcanic eruptions, and volcanically initiated floods.
    These have at various times over the centuries reduced
    the populations of different regions, and of Iceland as
    a whole, to levels where great accidental fluctuations
    of gene frequencies were possible, and such fluctuations
    seem indeed to have occurred so that, as we have seen,
    the frequencies observed at present bear little
    relationship to those of the original colonists."

Thus, in the words of a geneticist, we have there prime
evidence that a reduced gene pool, here due to natural
catastrophes, has translated itself into greatly increased
genetic drift, so great that "the [gene] frequencies
observed at present bear little relationship to those of the
original colonists". Natural catastrophes are not the only things that
reduce the gene pool. So can endogamy and migrations. Think how
narrow the gene pool carried by the settlers of the
Polynesian Islands might have been. Or of populations once
nearly wiped out by warfare, or disease.
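
The effect of a reduced gene pool is easy to illustrate with a toy simulation. The Python sketch below is not taken from Mourant or from Cavalli-Sforza: it assumes the simplest textbook resampling scheme, in which each generation draws its gene copies at random from the frequencies of the previous one, and the pool sizes, starting frequency and number of generations are arbitrary. It compares how far the frequency of one gene wanders in a small pool and in a large one.

```python
import random
from statistics import pvariance

def drift(copies, start_freq, generations, rng):
    """Resample a pool of `copies` gene copies once per generation and
    return the gene's frequency at the end."""
    freq = start_freq
    for _ in range(generations):
        hits = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = hits / copies
    return freq

def spread(copies, replicates=100, start_freq=0.2, generations=20, seed=0):
    """Variance of the final frequency over many independent populations."""
    rng = random.Random(seed)
    finals = [drift(copies, start_freq, generations, rng) for _ in range(replicates)]
    return pvariance(finals)

small = spread(copies=40)      # a pool reduced to a few dozen gene copies
large = spread(copies=1000)    # a pool twenty-five times larger

print("variance of final frequency, small pool:", round(small, 4))
print("variance of final frequency, large pool:", round(large, 4))
```

The smaller the pool, the wider the accidental fluctuations, so that the rate of drift cannot be the same for a population that has passed through such a bottleneck and for one that has not.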

How then can Cavalli-Sforza, a geneticist, hold the contrary view
that genetic drift is constant and the same for all? He
candidly admits to having ignored evidence contrary to his
thesis:

    "a judicious selection of populations makes the latter
    [hypothesis, i.e. constant universal rate of drift]
    quite probable."

Upon which, I beg to be excused and I leave you to draw your own
conclusions.

FOOTNOTES

[Footnote 1] The algorithm, however, failed to reconstruct
the original six-way split. The reason is clear: when
"languages" 16 and 17 split again, they had innovated only
6% of the message (or wordlist). They then innovated
respectively 66% and 49% of that inherited message,
obliterating two-thirds and one half of the already scanty
evidence for their earlier split from the "protolanguage".

[Footnote 2] But I may venture this guess: under the true
assumption that evolutionary rates are neither constant nor
universal, it is impossible to tell which node of the
reconstructed tree is the root. Transposed into linguistic
terms: the protolanguage may reside anywhere in the tree,
and, in the absence of dated documentary evidence, it is
absolutely impossible to know where. Consequently, it is
impossible to know where the centre of greatest diversity
lies, and therefore where the centre of diffusion is.

WORKS CONSULTED

Mourant, A.E.

 Blood Relations: Blood Groups and Anthropology. OUP 1983

=======================
Miguel Carrasquer Vidal
mcv@wxs.nl


