
New IE tree from Gray and Atkinson

Bobby D. Bryant

Nov 29, 2003, 5:17:20 PM

Here is the approximate branching structure of IE offered by the authors,
modulo any errors on my part. (If my indentation is not preserved, this
will be useless...)

IE (-8700)
  Hittite
  Other (-7900)
    Tocharian (two listed, split -1700)
    Other (-7300)
      Armenio-Greek (no date)
        Armenian (two listed)
        Greek (several listed, split -800)
      Other (-6900)
        Albanio-Indo-Iranian (no date)
          Albanian (several listed, split -600)
          Indo-Iranian (-4600)
            Indic (several, with some sub-structure, split -2900)
            Iranian (several, with some sub-structure, split -2500)
        Other (-6500)
          Balto-Slavic (-3400)
            Baltic (three listed)
            Slavic (several, with some sub-structure, split -1300)
          Germano-Italo-Celtic (-6100)
            Celtic (several, with some sub-structure, split -2900)
            Germano-Italic (-5500)
              Germanic (-1750)
                West Germanic (several, with some sub-structure)
                North Germanic (several, with some sub-structure)
              Italic (several, with some sub-structure, split -1700)

Comments:

o The labels are mine. "Other" is just a substitute for really awkward
descriptions such as "Balto-Slavo-Germano-Italo-Celtic".

o The dates indicate their calculation for when the forks took place.
E.g. the -8700 for "IE" isn't really a date for PIE, but their date for
the fork that split "Hittite" from "Other" at the top level of
the tree.

o Notice that they have some very substantive splits on 400-year
intervals, -7300, -6900, -6500, -6100. Presumably an artifact of
interpolation?

o It appears that they mostly work with living languages, though there
are several old ones sprinkled in (Hittite, Tocharian A/B, etc.)

o Hittite is represented by a single language in their study, though we
know there was a family of closely related languages that we call
the Anatolian family.

o We know there was also an East Germanic family, but no exemplars were
used in their study. (Gothic is the most familiar.)

o Linguists will lament the missing dead languages as evidence that
could be used for an important sanity check on the method. (For
example, the Greek dialects listed split at 800 YBP, with no other
date on that branch of the tree all the way back to 7300 YBP. If
Linear B and the Classical dialects had been included, where would
their splits have fallen in that range? Similarly, the Romance
languages are shown to split up at 1700 YBP, and the Italic from the
Germanic at 5500 YBP, but where would Latin and the other known
Italic languages have fallen? Similarly for other obvious languages
such as Sanskrit, Old Persian, and Gothic.)

o Still not having read the article, I don't know what the heck some of
their "languages" refer to. E.g. "Armenian Mod" is surely Modern
Armenian, but it shares a branch with "Armenian List". A number of
other leaves on the tree are "List" items too. Greek is listed with
"Mod", "MD", "ML", "D", and "K" variants, and some of the other
languages also have two or three forms listed.

o Some linguists have claimed the existence of an Italo-Celtic group,
but others have demurred. This study rejects the Italo-Celtic
hypothesis, claiming that Germanic is more closely related to Italic
instead. (But it does claim that Celtic is the closest relation to
their Germano-Italic.)
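
The 400-year spacing noted in the comments above can be checked with a few lines of arithmetic. This is only a sketch: the dates are the ones transcribed in the tree in this post (which may contain transcription errors), read as years before present the way the comments read them, and the names `backbone` and `gaps` are made up for illustration.

```python
# Dates of the successive forks along the main trunk of the tree,
# as transcribed in this post (negative = years before present).
# The variable names are illustrative only.
backbone = [-8700, -7900, -7300, -6900, -6500, -6100, -5500]

# Gap in years between each fork and the next one down the trunk.
gaps = [later - earlier for earlier, later in zip(backbone, backbone[1:])]

for earlier, later, gap in zip(backbone, backbone[1:], gaps):
    print(f"{earlier} -> {later}: {gap} years")
```

Three consecutive gaps come out at exactly 400 years, which is the regularity the comment asks about; the sketch says nothing about what causes it.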

I apologize in advance for any typos or thinkos...

--
Bobby Bryant
Austin, Texas

Bobby D. Bryant

Nov 29, 2003, 5:38:06 PM
On Sat, 29 Nov 2003 16:17:20 -0600, Bobby D. Bryant wrote:

> Albanio-Indo-Iranian (no date)
> Albanian (several listed, split -600) Indo-Iranian (-4600)
> Indic (several, with some sub-structure, split -2900)
> Iranian (several, with some sub-structure, split -2500)

If your newsreader displays that like mine does, notice the missing LF/CR
between Albanian and Indo-Iranian:

Albanio-Indo-Iranian (no date)
  Albanian (several listed, split -600)
  Indo-Iranian (-4600)
    Indic (several, with some sub-structure, split -2900)
    Iranian (several, with some sub-structure, split -2500)

Jacques Guy

Nov 30, 2003, 1:04:37 PM
Bobby D. Bryant wrote:
>
> Here is the approximate branching structure of IE offered by the authors

I am just back from looking for a certain e-mail on my old computer.
Alas, I did not find it.

Several months ago I received an e-mail from someone in New Zealand,
mentioning a co-authored article in Nature that he'd had published.
He was asking for the source code of GLOTSIM, a program I wrote as
part of the GLOTTO package (you'll find it at garbo.uwasa.fi),
which lets you grow a language family in vitro. I sent him the
stuff, including an article of mine (also part of that package)
demonstrating that the basic tenets of glottochronology are
rubbish, and that it is impossible to reconstruct the root of
a phylogenetic tree, whatever the method used for its reconstruction.

I am certain that that Gray fellow is that person: there aren't many
articles on comparative linguistics published in Nature, and I did
a search of their archives.

I imagine that the article says nothing of the impossibility
of reconstructing a rooted tree, and nothing again of how
the relative retention rates of languages can be computed.
Of course. It's sooooo much more uninteresting to admit:
"can't do it. No evidence."

What a pity I deleted that e-mail. Rubbing the miserable cunt's
nose in it would have afforded me great pleasure.

The article the little cunt in question did mention was about
using an exceedingly computationally expensive best-fit method to
find the shortest tree. The method, apparently, is used by
biologists on the DNA sequence. It is mindless brute force and
requires hours of computation since, given N terminal nodes, it
examines all possible trees. Even if you arbitrarily
restrict the possibilities to binary trees, like another cretin (*)
recently came trumpeting about on this newsgroup, that involves
examining un sacré foutu number of trees, pardon my French.

* Can't find him in DejaNews, though, only some comments (do
a search on "polytomy")
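
To put a number on how many trees that is: the count of distinct binary trees over N labeled terminal nodes is a standard combinatorial result, (2N-5)!! for unrooted trees and (2N-3)!! for rooted ones. The sketch below only evaluates those formulas; the function names are made up for illustration, and nothing here describes how the authors of the Nature article actually searched the space of trees.

```python
def double_factorial(k: int) -> int:
    """Return k * (k-2) * (k-4) * ... down to 1 or 2; returns 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def unrooted_binary_trees(n: int) -> int:
    """Number of unrooted binary trees on n >= 3 labeled leaves: (2n-5)!!"""
    if n < 3:
        raise ValueError("need at least 3 leaves")
    return double_factorial(2 * n - 5)


def rooted_binary_trees(n: int) -> int:
    """Number of rooted binary trees on n >= 2 labeled leaves: (2n-3)!!"""
    if n < 2:
        raise ValueError("need at least 2 leaves")
    return double_factorial(2 * n - 3)


# 87 is the number of languages quoted later in this thread.
for n in (4, 10, 20, 87):
    count = rooted_binary_trees(n)
    print(f"{n} leaves: a {len(str(count))}-digit number of rooted binary trees")
```

Even at 10 terminal nodes the rooted count is already in the tens of millions, so enumerating every tree stops being practical very quickly.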

Peter T. Daniels

Nov 29, 2003, 7:01:03 PM

Tocharian as the very first (post-Hittite) defector? No way!
--
Peter T. Daniels gram...@att.net

Brian M. Scott

Nov 29, 2003, 9:46:15 PM
On Sun, 30 Nov 2003 10:04:37 -0800, Jacques Guy
<jg...@alphalink.com.au> wrote:

[...]

>Even if you arbitrarily
>restrict the possibilities to binary trees, like another cretin (*)
>recently came trumpeting about on this newsgroup, that involves
>examining un sacré foutu number of trees, pardon my French.

>* Can't find him in DejaNews, though, only some comments (do
>a search on "polytomy")

Stephen C. Carlson.

Brian

Acke Ackspett

Nov 30, 2003, 2:14:15 AM

Why not?

Acke

Bobby D. Bryant

Nov 30, 2003, 2:57:35 AM
On Sun, 30 Nov 2003 08:14:15 +0100, Acke Ackspett wrote:

> Peter T. Daniels wrote:
>
>> Tocharian as the very first (post-Hittite) defector? No way!
>
> Why not?

It actually conforms to my pet theory, namely that the Anatolian and
Tocharian languages are descended from the periphery of an archaic IE
sprachbund, whereas the other languages spread from coreward at a later
date, bearing innovations such as the more elaborate case system
traditionally ascribed to PIE. (Notice that I am implicitly adopting the
wave model rather than the tree model, at least at the base layer of PIE
and a hypothetical post-PIE IE sprachbund. The well documented peripheral
movements of e.g. Celtic and Indo-Iranian in late prehistory would likely
have generated genuine cladistic splits, if such did not already exist by
then.)

I'm rather skeptical of the authors' Germano-Italic, but I don't even balk
at that as much as I do at the dates. The authors claim that their model
supports the "Out of Anatolia" theory of the IE homeland, and I just can't
see Anatolia as having an IE affiliation before ~2000 BCE. "We"
(excluding the authors, of course) just know too much about the language
situation in Anatolia at ~2000 BCE to allow such a thing. With a
sprinkling of IE names showing up in the Hattic lands around that time (as
shown by the correspondence of the Assyrian merchant colonies), you would
have to believe that PIE was 'invented' in a Hattic context. And if you
try to date everything back to ~7000 BCE you have to pretend you believe
that these proto-Anatolian speakers hung around as a linguistic minority
for 5000+ years before coming into their own. I suppose you could spin a
yarn, but you'd have a hard time spinning one that convinces me.

As I mentioned elsewhere in the thread, it would be nice if someone would
put in some more dead languages of known dates and re-run the authors'
algorithms, to see whether the result simultaneously (a) retains the
structure and dates of the published tree, and (b) gives plausible dates
for the added languages. I would particularly like to see Linear B and
the Classical Greek dialects added, to make sure the authors' methodology
doesn't ascribe recent dates to them. (Frankly, I'm somewhat troubled
that the authors didn't provide dates for some of the very major splits.
If I had been a reviewer, I would have asked for a correction on that
before publishing. Leaving them out accidentally smacks of carelessness;
leaving them out non-accidentally raises even bigger questions.)

On the plus side, we like hypotheses that make falsifiable predictions.
Their hypotheses (that the methodology and the resulting tree are both
'correct') predict that you _can_ add in the missing languages and get
results that are consistent with my (a) and (b) above. Has anyone got the
time/resources/interest to put the authors' hypotheses to the test? (And
fill in the dates missing from the published report?)

Acke Ackspett

Nov 30, 2003, 4:48:07 AM

Have you read the article itself? It would be interesting to know how
much they have made adjustments for loan-words and Sprachbunds, which
probably could "pollute" the method. Looking at English and French
vocabulary, you could get the impression that these languages are very
closely related, even though a brief look at the history of the
languages shows that their roots are quite far apart. This might explain
the "Germano-Italic" group. It could perhaps also explain the dates.

Acke

Peter T. Daniels

Nov 30, 2003, 8:46:27 AM
Acke Ackspett wrote:
>
> Bobby D. Bryant wrote:
> > On Sun, 30 Nov 2003 08:14:15 +0100, Acke Ackspett wrote:
> >
> >
> >>Peter T. Daniels wrote:
> >>
> >>
> >>>Tocharian as the very first (post-Hittite) defector? No way!
> >>
> >>Why not?

See e.g. Hamp in the Pergamon Encyclopedia of Lang & Ling (ed. Asher &
Simpson), or in Markey & Greppin, When Worlds Collide, or my review of
Mair's Bronze Age & Early Iron Age Peoples of Eastern Central Asia, in
Mair's "journal" Sino-Platonic Papers 98 (Jan 2000): 4-46 (my review of
Ryan & Pitman, incidentally, is on pp. 1-3).

It's only 5 pages!

More relevant would be a comparison with Dyen, Kruskal, & Black, TAPS
82/5 (1992), which was designed to test lexicostatistical (not
glottochronological) technique by applying it to modern IE (so no
Anatolian or Tocharian), so that it would be useful in evaluating the
technique as applied to most of the families for which it's used.

It's unlikely that we have much of the 200-word list for Hittite and
maybe for Tocharian anyway, so how could they be included at all?

Acke Ackspett

Nov 30, 2003, 10:48:16 AM
Peter T. Daniels wrote:
> Acke Ackspett wrote:
>
>>Bobby D. Bryant wrote:
>>
>>>On Sun, 30 Nov 2003 08:14:15 +0100, Acke Ackspett wrote:
>>>
>>>
>>>
>>>>Peter T. Daniels wrote:
>>>>
>>>>
>>>>
>>>>>Tocharian as the very first (post-Hittite) defector? No way!
>>>>
>>>>Why not?
>
>
> See e.g. Hamp in the Pergamon Encyclopedia of Lang & Ling (ed. Asher &
> Simpson), or in Markey & Greppin, When Worlds Collide, or my review of
> Mair's Bronze Age & Early Iron Age Peoples or Eastern Central Asia, in
> Mair's "journal" Sino-Platonic Papers 98 (Jan 2000): 4-46 (my review of
> Ryan & Pitman, incidentally, is on pp. 1-3).
>

Will do next time I come to a town, where they have a library that might
have them. Until then, I still have no clue why not.

Acke

Brian M. Scott

Nov 30, 2003, 12:35:46 PM
On Sun, 30 Nov 2003 10:48:07 +0100, Acke Ackspett <ac...@free.fr>
wrote:

[...]

>Have you read the article itself? It would be interesting to know how
>much they have made adjustments for loan-words and Sprachbunds, which
>probably could "pollute" the method.

According to the write-up in Der Spiegel, they tried to weed out
of the data anything that might be a borrowing rather than a true
cognate:

Gray and Atkinson, however, believe they have solved
these problems with their methods. In addition, they
safeguarded their analysis by removing doubtful
cognates for which it was not certain whether they
went back to a common origin or had been borrowed
directly from a foreign language. This, they write,
has probably led to an underestimate of the age of
the Indo-European language family.

The article in the International Herald Tribune gave a little
information on the technique itself:

Gray, who was joined by Quentin Atkinson, another
researcher in his department, fed a database of
information on cognates into a computer, along with
14 dates for language splits that are known from the
historical record. The computer then generated a large
series of possible family trees for the languages, as
well as timings for the various splits.

I had not previously seen the bit about using known dates.

[...]

Brian

Carmen L. Abruzzi

Nov 30, 2003, 5:38:28 PM
On 11/30/03 5:46 AM, in article 3FC9F4...@worldnet.att.net, "Peter T.
Daniels" <gram...@worldnet.att.net> wrote:

More like 3 pages. The first page has only the summary/introduction, the
last page is just the acknowledgements, and the references take up 1/3 of
the 4th page.


>
> More relevant would be a comparison with Dyen, Kruskal, & Black, TAPS
> 82/5 (1992), which was designed to test lexicostatistical (not
> glottochronological) technique by applying it to modern IE (so no
> Anatolian or Tocharian), so that it would be useful in evaluating the
> technique as applied to most of the families for which it's used.
>
> It's unlikely that we have much of the 200-word list for Hittite and
> maybe for Tocharian anyway, so how could they be included at all?

They say:


To facilitate reconstruction of some of the oldest language
relationships, we added three extinct Indo-European languages, thought
to fit near the base of the tree (Hittite, Tocharian A and Tocharian
B). Word form and cognacy judgements for all three languages were made
on the basis of multiple sources to ensure reliability.

No indication what these "multiple sources" are.

Peter T. Daniels

Nov 30, 2003, 11:13:09 PM
Acke Ackspett wrote:
>
> Peter T. Daniels wrote:
> > Acke Ackspett wrote:
> >
> >>Bobby D. Bryant wrote:
> >>
> >>>On Sun, 30 Nov 2003 08:14:15 +0100, Acke Ackspett wrote:
> >>>
> >>>
> >>>
> >>>>Peter T. Daniels wrote:
> >>>>
> >>>>
> >>>>
> >>>>>Tocharian as the very first (post-Hittite) defector? No way!
> >>>>
> >>>>Why not?
> >
> >
> > See e.g. Hamp in the Pergamon Encyclopedia of Lang & Ling (ed. Asher &
> > Simpson), or in Markey & Greppin, When Worlds Collide, or my review of
> > Mair's Bronze Age & Early Iron Age Peoples or Eastern Central Asia, in
> > Mair's "journal" Sino-Platonic Papers 98 (Jan 2000): 4-46 (my review of
> > Ryan & Pitman, incidentally, is on pp. 1-3).
> >
>
> Will do next time I come to a town, where they have a library that might
> have them. Until then, I still have no clue why not.

Nu, how much do you already know about Tocharian?

I've just received a .pdf of the Nature article in question, and a
companion piece by a science writer. I'll read them soon.

Jacques Guy

Dec 1, 2003, 6:40:06 PM
Carmen L. Abruzzi wrote:

> They say:

> To facilitate reconstruction of some of the oldest language
> relationships, we added three extinct Indo-European languages, thought
> to fit near the base of the tree (Hittite, Tocharian A and Tocharian
> B). Word form and cognacy judgements for all three languages were made
> on the basis of multiple sources to ensure reliability.

> No indication what these "multiple sources" are.

Well, here is something I found on aus.history (yes, it has
percolated to the antipodes!)

"In striking agreement with the Anatolian hypothesis, our analysis
of a matrix of 87 languages with 2,449 lexical items produced an
estimated age range for the initial Indo-European divergence
of between 7,800 and 9,800 years BP."

2,449 lexical items in 87 languages? Two thousand four hundred
and forty-nine? Pull the other one. And who did the cognate
recognition?

"between 7,800 and 9,800 years BP" is typical of
the bogus dates churned out by good ol' time
glottochronology, a.k.a. lexicostatistics, à la
Sarah Gudschinsky, Isidore Dyen, et al.

The source was:

"http://www.nature.com/cgi-taf/DynaPage.taf?file=/nature/journal/v426/n6965/abs/nature02029_fs.html&dynoptions=doi1070197857"

(I _thought_ I had already posted this link here, but
I can find no trace of it. Oh, well, bis repetita placent)

Peter T. Daniels

Dec 1, 2003, 12:53:17 AM
Jacques Guy wrote:
>
> Carmen L. Abruzzi wrote:
>
> > They say:
>
> > To facilitate reconstruction of some of the oldest language
> > relationships, we added three extinct Indo-European languages, thought
> > to fit near the base of the tree (Hittite, Tocharian A and Tocharian
> > B). Word form and cognacy judgements for all three languages were made
> > on the basis of multiple sources to ensure reliability.
>
> > No indication what these "multiple sources" are.
>
> Well, here is something I found on aus.history (yes, it has
> percolated to the antipodes!)
>
> "In striking agreement with the Anatolian hypothesis, our analysis
> of a matrix of 87 languages with 2,449 lexical items produced an
> estimated age range for the initial Indo-European divergence
> of between 7,800 and 9,800 years BP."
>
> 2,449 lexical items in 87 languages? Two thousand four hundred
> and forty-nine? Pull the other one. And who did the cognate
> recognition?

That's a bit over 28 words per language. Did ANYONE referee the
article??
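
The division behind that figure is easy to check. A minimal sketch, assuming (as the remark above does) that the 2,449 items are spread evenly over the 87 languages; the quoted abstract does not say how the items are distributed, and the variable names are illustrative only.

```python
# Figures as quoted from the abstract earlier in this thread.
lexical_items = 2449
languages = 87

# Average number of items per language under the even-spread assumption.
per_language = lexical_items / languages
print(f"{per_language:.2f} items per language")
```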

> "between 7,800 and 9,800 years BP" is typical of
> the bogus dates churned out by good ol' time
> glottochronology, a.k.a. lexicostatistics, à la
> Sarah Gudschinsky, Isidore Dyen, et al.
>
> The source was:
>
> "http://www.nature.com/cgi-taf/DynaPage.taf?file=/nature/journal/v426/n6965/abs/nature02029_fs.html&dynoptions=doi1070197857"
>
> (I _thought_ I had already posted this link here, but
> I can find no trace of it. Oh, well, bis repetita placent)

Acke Ackspett

Dec 1, 2003, 1:21:15 AM
Peter T. Daniels wrote:
> Acke Ackspett wrote:
>
>>Peter T. Daniels wrote:
>>
>>>Acke Ackspett wrote:
>>>
>>>
>>>>Bobby D. Bryant wrote:
>>>>
>>>>
>>>>>On Sun, 30 Nov 2003 08:14:15 +0100, Acke Ackspett wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>Peter T. Daniels wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>Tocharian as the very first (post-Hittite) defector? No way!
>>>>>>
>>>>>>Why not?
>>>
>>>
>>>See e.g. Hamp in the Pergamon Encyclopedia of Lang & Ling (ed. Asher &
>>>Simpson), or in Markey & Greppin, When Worlds Collide, or my review of
>>>Mair's Bronze Age & Early Iron Age Peoples or Eastern Central Asia, in
>>>Mair's "journal" Sino-Platonic Papers 98 (Jan 2000): 4-46 (my review of
>>>Ryan & Pitman, incidentally, is on pp. 1-3).
>>>
>>
>>Will do next time I come to a town, where they have a library that might
>>have them. Until then, I still have no clue why not.
>
>
> Nu, how much do you already know about Tocharian?

Not much. Like most people nowadays, I don't speak it fluently. There
are numerous problems, one of which is that many of the texts we have, both
of Tocharian A and Tocharian B, seem to be translations from Sanskrit -
and we don't know how idiomatic they are. It's a centum language that is
geographically in the wrong place. People have claimed to find
similarities between Tocharian and several groups of European languages,
but none so convincing that it can be safely said to relate to any of
them directly.

> I've just received a .pdf of the Nature article in question, and a
> companion piece by a science writer. I'll read them soon.

Looking forward to your feedback.

Acke

Bobby D. Bryant

Dec 1, 2003, 3:19:59 AM
On Mon, 01 Dec 2003 15:40:06 -0800, Jacques Guy wrote:

> 2,449 lexical items in 87 languages? Two thousand four hundred and
> forty-nine? Pull the other one. And who did the cognate recognition?
>
> "between 7,800 and 9,800 years BP" is typical of the bogus dates churned
> out by good ol' time glottochronology, a.k.a. lexicostatistics, à la
> Sarah Gudschinsky, Isidore Dyen, et al.

It happens that their primary source of data is "Dyen et al.", a data file
that's listed as being on line at
<http://www.ntu.edu.au/education/langs/ielex/IE-DATA1>, with the three
dead languages added from unspecified sources.

Jacques Guy

Dec 2, 2003, 12:31:54 AM
Bobby D. Bryant wrote:

> It happens that their primary source of data is "Dyen et al.", a data file
> that's listed as being on line at
> <http://www.ntu.edu.au/education/langs/ielex/IE-DATA1>, with the three
> dead languages added from unspecified sources.


Like André Haudricourt told me 20 years and some ago: "C'est curieux,
Dyen est juif, et en général les juifs sont des gens intelligents,
mais Dyen est un parfait imbécile". Which tallied with what
I knew of the chap. Challenged on the statistical assumptions
of glottochronology, he countered: "Don't talk to me about
statistics. Talk to my co-authors. _I_ know nothing of statistics,
it does not concern _me_". Pers. com. Grand Bali Beach Hotel,
Sanur, January 1981 (you bet I remember! it's not everyday
you hear such an asinine reply).

Bobby D. Bryant

Dec 1, 2003, 5:21:57 AM

Unfortunately, I don't know any of the principals.

BTW, is there a standard work on "why not glottochronology" / "why not
lexicostatistics" ?

Bobby D. Bryant

Dec 1, 2003, 5:17:32 AM
On Mon, 01 Dec 2003 07:21:15 +0100, Acke Ackspett wrote:

> Peter T. Daniels wrote:

>> Nu, how much do you already know about Tocharian?
>
> Not much. Like most people nowadays, I don't speak it fluently. There
> are numerous problems, one of which that many of the texts we have, both
> of Tocharian A and Tocharian B, seem to be translations from Sanskrit -
> and we don't know how idiomatic they are. It's a centum language that is
> geographically in the wrong place. People have claimed to find
> similarities between Tocharian and several groups of European languages,
> but none so convincing that it can be safely said to relate to any of
> them direclty.

Interestingly, the two Tocharian languages supposedly group with the
Anatolian language on lots of isoglosses. (Including the famous
centum/satem isogloss, though I don't see any reason to give it more
weight than any other.)

Jacques Guy

Dec 2, 2003, 1:50:11 AM
Bobby D. Bryant wrote:

> BTW, is there a standard work on "why not glottochronology" / "why not
> lexicostatistics" ?

Bergsland and Vogt, 1962 (in Current Anthropology). Actually,
Lees' demonstration of the constant rate of change of languages
is also a very nice demonstration that the rate is not
constant at all, and of how to make figures mean the
contrary of what they mean, and of how linguists fell for
it, Dell Hymes first and foremost. As for Dyen, I don't
think he ever fell dans la potion magique: he was born in
it. Ah, found them!

1953 Lees, Robert B. The Basis of Glottochronology. Language
29/2:113-127 (But I don't think Lees did it on purpose, he
was just your common ignoramus).

1962 Bergsland, Knut and Hans Vogt. On the Validity of
Glottochronology. Current Anthropology 3/2:115-153.
(The only serious article on the subject for many years).

And, of course, the classic:

1956 Gudschinsky, Sarah C. The ABC's of Lexicostatistics
(Glottochronology). Word 12/2:175-210.

Almost on par with Zecharia Sitchin's ramblings, but
sorry, Sarah, still no cigar, you'll have to do better,
good ol' Zecharia still beats you to it.

Bobby D. Bryant

Dec 1, 2003, 5:53:32 AM
On Mon, 01 Dec 2003 04:17:32 -0600, Bobby D. Bryant wrote:

> Interestingly, the two Tocharian languages supposedly group with the
> Anatolian language on lots of isoglosses.

BTW, "on" = "with respect to". I.e., on the same side of the line.

Peter T. Daniels

Dec 1, 2003, 7:26:34 AM

Name some of them, and explain why Douglas Q. Adams doesn't agree with
you?

Miguel Carrasquer

Dec 1, 2003, 10:02:12 AM

Adams himself suggests 1sg. *-wi (Tocharian + Anatolian). A phonological
isogloss (shared archaism or innovation?) may be the lack of voiced stops
in both Anatolian and Tocharian.

Adams on the whole thinks that Tocharian is closest to Germanic, although
"[t]he relative lack of common isoglosses suggests that the pre-Tocharian
dialect(s) of Proto-Indo-European may have occupied a somewhat isolated
position vis-à-vis the other groups."

This is consistent with the view that Tocharian was the second IE group to
branch off (after Anatolian, before Germanic).

=======================
Miguel Carrasquer Vidal
m...@wxs.nl

Jacques Guy

Dec 2, 2003, 10:58:30 AM
Miguel Carrasquer wrote:

> Adams himself suggests 1sg. *-wi (Tocharian + Anatolian). A phonological
> isogloss (shared archaism or innovation?) may be the lack of voiced stops
> in both Anatolian and Tocharian.

Could it be an artifact of the writing system? Like there being
only five vowels in Italian and none in Phoenician?

Miguel Carrasquer

Dec 1, 2003, 5:36:37 PM
On Tue, 02 Dec 2003 07:58:30 -0800, Jacques Guy <jg...@alphalink.com.au>
wrote:

>Miguel Carrasquer wrote:

Tocharian was written in a derivative of the Brahmi script, which was able
to distinguish between D, T, DH and TH (the Brahmi letter <dh(a)> was not
needed and used to write Tocharian /tä/, with a vowel absent from
Sanskrit).

Hittite, Palaic and CLuwian were written in cuneiform (CLuwian = cuneiform
Luwian), a writing system that also could distinguish D from T in most
cases. The cuneiform Anatolian scripts do not take advantage of this, and
use D- and T-signs interchangeably. The difference between PIE *t (etc.)
vs. *d and *dh (etc.) is instead reflected in medial position by doubled
spelling of the consonant (-DD- or -TT-, interchangeably). We know these
were real geminates, because they block lengthening of those stressed
vowels (í, ú) which otherwise lengthen in an open syllable only. In initial
position, the spelling cannot reflect the difference, but it's likely that
*t-, *d- and *dh- had merged in initial position anyway (to /t/).

Danny Loss

Dec 1, 2003, 9:14:42 PM
b.s...@csuohio.edu (Brian M. Scott) wrote in message news:<3fca2823...@enews.newsguy.com>...

>
> The article in the International Herald Tribune gave a little
> information on the technique itself:
>
> Gray, who was joined by Quentin Atkinson, another
> researcher in his department, fed a database of
> information on cognates into a computer, along with
> 14 dates for language splits that are known from the
> historical record. The computer then generated a large
> series of possible family trees for the languages, as
> well as timings for the various splits.
>
> I had not previously seen the bit about using known dates.
>
> [...]
>
> Brian

One good test of their technique would be to plug in the cognate data
without providing any known dates and see if their algorithms generate
the correct dates for splits evidenced in the historical record. To
anyone who's actually read the article: did they do this to show the
validity of their technique?

Danny

Peter T. Daniels

Dec 2, 2003, 6:27:06 PM
Acke Ackspett wrote:
>
> Peter T. Daniels wrote:

> >>>>>>>Tocharian as the very first (post-Hittite) defector? No way!
> >>>>>>
> >>>>>>Why not?
> >>>
> >>>See e.g. Hamp in the Pergamon Encyclopedia of Lang & Ling (ed. Asher &
> >>>Simpson), or in Markey & Greppin, When Worlds Collide, or my review of
> >>>Mair's Bronze Age & Early Iron Age Peoples or Eastern Central Asia, in
> >>>Mair's "journal" Sino-Platonic Papers 98 (Jan 2000): 4-46 (my review of
> >>>Ryan & Pitman, incidentally, is on pp. 1-3).
> >>
> >>Will do next time I come to a town, where they have a library that might
> >>have them. Until then, I still have no clue why not.
> >
> > Nu, how much do you already know about Tocharian?
>
> Not much. Like most people nowadays, I don't speak it fluently. There
> are numerous problems, one of which that many of the texts we have, both
> of Tocharian A and Tocharian B, seem to be translations from Sanskrit -
> and we don't know how idiomatic they are. It's a centum language that is
> geographically in the wrong place. People have claimed to find
> similarities between Tocharian and several groups of European languages,
> but none so convincing that it can be safely said to relate to any of
> them direclty.

Hamp puts Toch. in his "NWIE," which is a group that excludes Anatolian,
Armenian-Greek, Indo-Iranian, and Italo-Celtic (and some of the minor
epigraphic remains), viz. mainly Gmc and Balto-Slavic.

> > I've just received a .pdf of the Nature article in question, and a
> > companion piece by a science writer. I'll read them soon.
>
> Looking forward to your feedback.

Here it is. I'm Attaching a Text Only With Line Breaks file, which
should simply show up as text. If it doesn't, I'll do it again by
cut-and-pasting.

Nature IE tree article

Peter T. Daniels

Dec 2, 2003, 6:28:02 PM
It showed up readably in my system; did anyone not get the text?

Bobby D. Bryant

Dec 2, 2003, 6:32:45 PM

I don't have primary sources to hand, but I'm sure you'll remember that
Mallory reproduces a diagram from Anttila (1972) showing 24 isoglosses, of
which only four separate Hittite and Tocharian. (Compare vs. 2 separating
Baltic and Slavic, 2 separating E and W Germanic, 9 separating Hittite and
Greek, etc.) Unfortunately Mallory doesn't supply the key to the numbered
isoglosses, so I suppose we'll have to let Adams and Anttila duke it out.

Peter T. Daniels

Dec 2, 2003, 7:28:07 PM

I've seen many an Indo-Europeanist scratch their head over that chart of
Anttila's. Unfortunately that was the textbook Gene Gragg chose for
Historical Linguistics in 1973! (Do you want a transcription of the
isogloss list?)

Bobby D. Bryant

Dec 2, 2003, 9:12:35 PM

If it's convenient, yes please.

Peter T. Daniels

Dec 3, 2003, 12:24:42 AM

1. centum | satem [right]
2. -ss- | -st-, -tt- [right]
3. ao@ | a, âô | ô [inside]
4. eao | a [inside]
5. s | h [inside]
6. CVRC | CRVC [inside]
7. k^W | p [inside]
8. e- | 0 'past' [left, outside]
9. -osyo 'genitive' [right, inside]
10. -r | -i 'present' [right, outside]
11. -m- | -bh- 'case marker' [below]
12. -to- | -mo- 'ordinal' [below]
13. -u 'imperative' [inside]
14. proti | poti 'preposition' [inside]
15. secondary endings (without no. 10 -i) [below]
16. feminine nouns with masculine endings [inside]
17. -ad 'ablative' | 'genitive' [inside]
18. new tense system from perfect [inside]
19. umlaut [inside]
20. -ww-, -jj- | stop + w, j [outside]
21. -ggj- | -ddj- [right] (no. 20)
22. laryngeals as h's [inside]
23. uncontracted reflexes of sequence *yH [inside]
24. unit pronouns | particles + enclitic pronouns [inside]

Don't ask me to explain any of it!

Acke Ackspett

unread,
Dec 3, 2003, 12:57:24 AM12/3/03
to
Peter T. Daniels wrote:

> Acke Ackspett wrote:

>>Looking forward to your feedback.
>
>
> Here it is. I'm Attaching a Text Only With Line Breaks file, which
> should simply show up as text. If it doesn't, I'll do it again by
> cut-and-pasting.

Got it. Thanks!

> ------------------------------------------------------------------------
>
> What's wrong with Gray & Atkinson, "Language-tree divergence
> times support the Anatolian theory of Indo-European origin"
> Nature 426 (11/27/03): 435-39?
>
In the following I can only comment on your comments. I still have not seen
the article in its entirety.

> From the point of view of the linguist, a great deal. For
> starters, the article seems to assume what it is apparently
> trying to prove: that statistical methods developed for
> biological cladistics have something to tell us about
> language classification. (At least, that's what the companion
> essay, "Trees of life and of language," pp. 391-92, seems to
> claim; it's by David B. Searls, a "bioinformatics" researcher
> who works at a drug company.)

I think you are too harsh here, even though you're mainly right. There
is a big difference in how statistics can be applied to biology vs.
linguistics. However, development in one area can often (perhaps not in
this case) be usefully modified for the other one.

> Secondly, the article is not about language. It's impossible
> to tell what data the authors used. In the abstract they
> refer to "a matrix of 87 languages with 2,449 lexical items."
> At the foot of col. 436a it's "a matrix of 87 Indo-European
> languages with 2,449 cognate sets coded as discrete binary
> characters." Under "Methods: Data and coding," col. 438a,
> it's "The presence or absence of words from each cognate set
> was coded as '1' or '0', respectively, to produce a binary
> matrix of 2,449 cognates in 87 languages."
>
> Now maybe to a pair of psychologists a "lexical item" is a
> "cognate set" is a "cognate," but to linguists those are
> three completely different things.

Absolutely right!

> If, as they claim, they used the 200-word Swadesh list, there
> should have been exactly 200 cognate sets, with exactly
> 17,400 lexical items (minus a few, since the source of their
> data, Dyen, Kruskal, and Black, had to leave a few slots
> blank).

... if the test should claim to be complete. However, even with a
limited set, the results can have some value. Which value it actually
has, is, as you correctly point out elsewhere, impossible to tell.

> Since DK&B compiled their data set to test lexicostatistical
> techniques in a way that would be useful to linguists
> investigating families with no written data from ages past,
> they used only modern spoken languages (G & A don't explain
> how their method identified Indo-Iranian but DK&B's method
> didn't). G & A were unhappy with this, so they "added three
> extinct IE languages, thought to fit near the base of the
> tree (Hittite, Tocharian A and Tocharian B). Word form and
> cognacy judgements for all three languages were made on the
> basis of multiple sources to ensure reliability" (438a). No
> hint is given of what these "multiple sources" were, let
> alone what the data and "cognacy judgements" were.
>
> They claim they dealt with the absence of sufficient data
> from these three languages by changing "no match" to
> "uncertain," and this change had no effect on the result
> (437a).

Did they really say "no effect"? Be honest now, Peter.

> They claim that "there is considerable support for Hittite
> ... as the most appropriate root for IE" (437b). This is
> absurd, unless they are using "root" in some idiosyncratic
> way.

Well, I'm afraid I'm back to my previous "why" question. To me it is a
very, very strong move to refute a "there is considerable support"-statement
with a "this is absurd"-statement. Would "there is some support" have
been absurd as well? Would "there seems to be some support" have been
absurd as well? And why? (I still haven't come to that pesky library town.)

> The rest of the text of the article concerns computational
> statistical methodology, which means nothing to me.
>
> The prettily colored diagram on p. 437 agrees in large part
> with standard charts of IE, which is not surprising, since
> the standard chart of IE seems to have been incorporated into
> their raw data -- the "Supplementary information" available
> only at the Nature website (and thus only to subscribers)
> comprises a list of 14 nodes with "age constraints." These
> are: Iberian-French, Italic-Romanian, Germanic, Welsh/Breton,
> Irish/Welsh, Indic, Iranian, Indo-Iranian, Slavic, Balto-
> Slavic, Greek split, Tocharic, Tocharian A & B, Hittite. Some
> of the "age constraints" are reasonable, some are absurdly
> broad. There's no indication of what hyphen vs. slash vs.
> ampersand means.
>
> I don't have the _slightest_ idea what figs. 1b-1e mean; they
> are all bell curve-shaped bar graphs plotting "age BP"
> against "Frequency" (it doesn't say frequency of what), and
> the four of them have different scales on the y-axis.
>
> G & A's conclusion is that their tree somehow supports what
> used to be Renfrew's suggestion about the spread of
> agriculture into Europe. G & A provide no hypothesis about
> why we should suppose that the first farmers of Europe spoke
> IE languages. Or why we should suppose that PIE is nearly
> 9000 years old -- if it were, the earliest attestations of
> it, all from the mid 2nd millennium BCE, would be far less
> similar to each other than they are, and it would be
> virtually impossible to assemble 17,000 -- or even 2500 --
> "cognates," "cognate sets," or "lexical items."

Yet again, you are somewhat harsh. G & A don't have to provide any
explanation why the first farmers spoke IE. Their ambition was clearly
to apply statistics to IE languages, not to come up with a full picture. If
their results (surprisingly) turn out to be reliable, others can fill
that in later. We cannot all wait to publish until everything has been
solved. If only that bird sang that sings best, the forest would be very
quiet.

Anyhow, thanks for a nice and clear summary. I appreciate both your
references, your comments and your admissions when you did not quite
follow what they described - or tried to describe.

Acke

Jacques Guy

unread,
Dec 3, 2003, 10:52:59 PM12/3/03
to
Peter T. Daniels wrote:

> They claim that "there is considerable support for Hittite
> ... as the most appropriate root for IE" (437b). This is
> absurd, unless they are using "root" in some idiosyncratic
> way.

If they are using it as we do in Australia, then the meaning
is clear: phoque.

The rest of your description of the article is familiar.
I have often seen such articles when lexicostatistics
was popular. Statistical tricks of the trade applied
uncomprehendingly to data for which they were never
meant; conclusions unsupported by the evidence, even
in direct conflict with the evidence, just like
the Scientific American article by Cavalli-Sforza
and Ruhlen. Nothing new there, just a revival of
the old saw.

However, I think I have worked one mystery out:

> In the abstract they
> refer to "a matrix of 87 languages with 2,449 lexical items."
> At the foot of col. 436a it's "a matrix of 87 Indo-European
> languages with 2,449 cognate sets coded as discrete binary
> characters." Under "Methods: Data and coding," col. 438a,
> it's "The presence or absence of words from each cognate set
> was coded as '1' or '0', respectively, to produce a binary
> matrix of 2,449 cognates in 87 languages."

Better give an example. Take the word for dog. We have

English dog
German Hund
Italian cane
French chien
Spanish perro

Five languages, four cognate sets:

1. dog
2. Hund
3. cane, chien
4. perro

Take "water" now: two more cognate sets:

5. water, Wasser
6. acqua, eau, agua.

So, six cognate sets so far, and that is
how, I suspect, you can get 2,449 cognate
sets out of 200 lexical items in 87
languages.

And this, I guess, is the "binary matrix"

                  Languages
Cognate sets:     English  German  Italian  French  Spanish

1. dog               1        0       0        0       0
2. Hund              0        1       0        0       0
3. cane etc.         0        0       1        1       0
4. perro             0        0       0        0       1
5. water etc.        1        1       0        0       0
6. acqua etc.        0        0       1        1       1


I can see no other obvious way of accounting for that
strange figure of 2,449. And yes, that is how it was
done 30~40 years ago when you had to wring out
every bit's worth out of a byte: you can fit your
2,449 cognate sets in 87 languages in... 2,449*87/8 =
26,633 bytes (rounded up). Then you need 87*86/2 = 3,741 bytes to
hold the cognate counts. Would fit on an old Commodore
with a cassette drive I think.
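
The coding guessed at above, and the storage arithmetic that goes with it, can be sketched in a few lines of Python. This is only an illustration of the scheme as described in this post, not a reconstruction of what Gray and Atkinson actually did; the names `cognate_sets`, `packed_bytes` and `pair_count` are invented here, and the 2,449 figure is the one quoted from the article.

```python
# Rows are cognate sets, columns are languages, as in the table above.
languages = ["English", "German", "Italian", "French", "Spanish"]

# Each cognate set is listed with the languages that have a member in it
# (groupings copied from the dog/water example in this post).
cognate_sets = [
    ("dog", {"English"}),
    ("Hund", {"German"}),
    ("cane etc.", {"Italian", "French"}),
    ("perro", {"Spanish"}),
    ("water etc.", {"English", "German"}),
    ("acqua etc.", {"Italian", "French", "Spanish"}),
]

# 1 = the language has a word in that cognate set, 0 = it does not.
matrix = [
    [1 if lang in members else 0 for lang in languages]
    for _, members in cognate_sets
]


def packed_bytes(n_sets, n_langs):
    """Bytes needed to store the matrix at one bit per cell, rounded up."""
    return -(-(n_sets * n_langs) // 8)


def pair_count(n_langs):
    """Number of distinct language pairs (one cognate count per pair)."""
    return n_langs * (n_langs - 1) // 2


for (label, _), row in zip(cognate_sets, matrix):
    print(f"{label:<12}", *row)

print(packed_bytes(2449, 87), "bytes for the bit-packed matrix")
print(pair_count(87), "language pairs")
```

Storing one byte per language pair assumes that no pair shares more than 255 cognates, which holds if the counts are taken over a 200-item list.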

Peter T. Daniels

unread,
Dec 3, 2003, 8:08:15 AM12/3/03
to

Nothing in the entire page of the article (the abstract is at the end of
435, 437 is almost entirely taken up by Fig. 1, and most of 438 is "Method"
and the references) suggests that they "modified," usefully or
otherwise, any of the statistical packages they applied.

> > Secondly, the article is not about language. It's impossible
> > to tell what data the authors used. In the abstract they
> > refer to "a matrix of 87 languages with 2,449 lexical items."
> > At the foot of col. 436a it's "a matrix of 87 Indo-European
> > languages with 2,449 cognate sets coded as discrete binary
> > characters." Under "Methods: Data and coding," col. 438a,
> > it's "The presence or absence of words from each cognate set
> > was coded as '1' or '0', respectively, to produce a binary
> > matrix of 2,449 cognates in 87 languages."
> >
> > Now maybe to a pair of psychologists a "lexical item" is a
> > "cognate set" is a "cognate," but to linguists those are
> > three completely different things.
>
> Absolutely right!
>
> > If, as they claim, they used the 200-word Swadesh list, there
> > should have been exactly 200 cognate sets, with exactly
> > 17,400 lexical items (minus a few, since the source of their
> > data, Dyen, Kruskal, and Black, had to leave a few slots
> > blank).
>
> ... if the test should claim to be complete. However, even with a
> limited set, the results can have some value. Which value it actually
> has, is, as you correctly point out elsewhere, impossible to tell.

Not according to Swadesh's glottochronology, which they claim to apply.
The absurd time depth would seem to come from all those blanks in
Hittite and the Tocharians (since the standard dates for all the other
splits were included among the data in the first place).

> > Since DK&B compiled their data set to test lexicostatistical
> > techniques in a way that would be useful to linguists
> > investigating families with no written data from ages past,
> > they used only modern spoken languages (G & A don't explain
> > how their method identified Indo-Iranian but DK&B's method
> > didn't). G & A were unhappy with this, so they "added three
> > extinct IE languages, thought to fit near the base of the
> > tree (Hittite, Tocharian A and Tocharian B). Word form and
> > cognacy judgements for all three languages were made on the
> > basis of multiple sources to ensure reliability" (438a). No
> > hint is given of what these "multiple sources" were, let
> > alone what the data and "cognacy judgements" were.
> >
> > They claim they dealt with the absence of sufficient data
> > from these three languages by changing "no match" to
> > "uncertain," and this change had no effect on the result
> > (437a).
>
> Did they really say "no effect"? Be honest now, Peter.

"We tested this possibility by recoding apparently absent cognates as
uncertainties (absent or present) and re-running the analyses. Although
divergence-time estimates decreased slightly, the effect was only
small." Since when is "only small" a technical term in statistics? Isn't
there a regulation that either you report the significance or you don't
mention the effect?

> > They claim that "there is considerable support for Hittite
> > ... as the most appropriate root for IE" (437b). This is
> > absurd, unless they are using "root" in some idiosyncratic
> > way.
>
> Well, I'm afraid I'm back to my previous "why" question. To me it is a
> very, very strong move to refute a "there is considerable support"-statement
> with a "this is absurd"-statement. Would "there is some support" have
> been absurd as well? Would "there seems to be some support" have been
> absurd as well? And why? (I still haven't come to that pesky library town.)

What the hell does "Hittite is the most appropriate root for IE" mean?

Their "conclusion" has ABSOLUTELY NOTHING to do with what they
investigated. Even if their "conclusion," that PIE dates to 8700 BP,
were correct, how would it connect to the former Renfrew proposal any
better than to any other proposal for the location of a PIE-speaking
community? Of all the multitudes of proposals for said location, why do
they feel they should choose between exactly two of them, viz.,
Gimbutas's and Renfrew's?

BTW, did I mention anywhere that there is no such thing as "Kurgan
people," but that a kurgan is a type of burial mound? You've seen the
abstract, which refers to "Kurgan horsemen," as if they think "Kurgan"
is an adjective for an ethnonymic noun *Kurg.

> Anyhow, thanks for a nice and clear summary. I appreciate both your
> references, your comments and your admissions when you did not quite
> follow what they described - or tried to describe.

"Not quite follow"??

Look at the title of the paper. Explain to me how the title fits the
paper.

Peter T. Daniels

unread,
Dec 3, 2003, 8:12:33 AM12/3/03
to

Yet "the program was run ten times using four concurrent Markov chains.
Each run generated 1,300,000 trees from a random starting phylogeny. On
the basis of an autocorrelation analysis, only every 10,000th tree was
sampled to ensure that consecutive samples were independent. A 'burn-in'
period of 300,000 trees for each run was used to avoid sampling trees
before the run had reached convergence. ... For each analysis a total of
1,000 trees were sampled and rooted with Hittite."

Could you do _that_ on your old Commodore with a cassette drive?

Whyever would you want to?

Jacques Guy

unread,
Dec 4, 2003, 3:52:23 AM12/4/03
to
Peter T. Daniels wrote:

> Yet "the program was run ten times using four concurrent Markov chains.
> Each run generated 1,300,000 trees from a random starting phylogeny. On
> the basis of an autocorrelation analysis, only every 10,000th tree was
> sampled to ensure that consecutive samples were independent. A 'burn-in'
> period of 300,000 trees for each run was used to avoid...

Oh shit! You generate 13,000,000 trees, then discard 3,000,000 and
finally throw out 99.99% of the rest without even looking.
The brute-force approach indeed. That is exactly what
that fellow who had e-mailed me had described, except that he
had not mentioned "Markov chains" and discarding... well...
99.9923077% of the stuff.
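
For anyone who wants to check those percentages, the bookkeeping implied by the quoted passage (ten runs, 1,300,000 trees each, a 300,000-tree burn-in, every 10,000th tree kept) works out as below. The variable names are mine, and this only counts trees; it says nothing about how the sampler itself works.

```python
runs = 10
trees_per_run = 1_300_000
burn_in = 300_000      # trees dropped from the start of each run
thin = 10_000          # keep only every 10,000th tree after burn-in

generated = runs * trees_per_run                  # all trees ever generated
after_burn_in = runs * (trees_per_run - burn_in)  # trees left once burn-in is dropped
kept = after_burn_in // thin                      # trees actually sampled

thinned_out_pct = 100 * (1 - kept / after_burn_in)  # share of post-burn-in trees discarded
discarded_pct = 100 * (1 - kept / generated)        # share of all generated trees discarded

print(generated, after_burn_in, kept)
print(f"{thinned_out_pct:.2f}% of the post-burn-in trees discarded")
print(f"{discarded_pct:.7f}% of all trees discarded")
```

The count of kept trees agrees with the "total of 1,000 trees" mentioned in the quoted passage.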


[snip]

> Could you do _that_ on your old Commodore with a cassette drive?

No(*). In those days, computers lacked brute force. People had
to use their brains.


(*) Well, yes. It would take time, but it would be possible. I have
seen sillier things done. On a DEC-KL10 at ANU. A simulation that
ran for _months_ when IBM Research had published an algorithm
that would have done it in 30 minutes. But everybody was happy:

1. It must be important since it required so much computer
resources.
2. "Look at what we have to run! We need more money to upgrade
our computing facilities!"

Bobby D. Bryant

unread,
Dec 3, 2003, 3:55:45 PM12/3/03
to
On Wed, 03 Dec 2003 05:24:42 +0000, Peter T. Daniels wrote:

> Don't ask me to explain any of it!

Thanks, Peter.

Peter T. Daniels

unread,
Dec 3, 2003, 4:56:50 PM12/3/03
to
Jacques Guy wrote:
>
> Peter T. Daniels wrote:

I realized that if this is what it means, then there are a little over a
dozen cognate sets, on average, for each of the 200 Swadesh items. Does
that seem reasonable?

Peter T. Daniels

unread,
Dec 3, 2003, 4:58:52 PM12/3/03
to
Bobby D. Bryant wrote:
>
> On Wed, 03 Dec 2003 05:24:42 +0000, Peter T. Daniels wrote:
>
> > Don't ask me to explain any of it!
>
> Thanks, Peter.

That could be taken the wrong way ...

Jacques Guy

unread,
Dec 4, 2003, 12:41:53 PM12/4/03
to
Peter T. Daniels wrote:

> I realized that if this is what it means, then there are a little over a
> dozen cognate sets, on average, for each of the 200 Swadesh items. Does
> that seem reasonable?

I have no idea whatsoever. Even if they had used Austronesian, with which
I am much more familiar, I would still have no idea, short of
slapping together a bit of Euphoria to look at existing cognate lists.

Bobby D. Bryant

unread,
Dec 4, 2003, 1:04:28 AM12/4/03
to
On Wed, 03 Dec 2003 21:58:52 +0000, Peter T. Daniels wrote:

> Bobby D. Bryant wrote:
>>
>> On Wed, 03 Dec 2003 05:24:42 +0000, Peter T. Daniels wrote:
>>
>> > Don't ask me to explain any of it!
>>
>> Thanks, Peter.
>
> That could be taken the wrong way ...

Heh. It took me a while to get it.

Herman Rubin

unread,
Dec 5, 2003, 1:28:17 PM12/5/03
to
In article <3FCEAF...@alphalink.com.au>,

Jacques Guy <jg...@alphalink.com.au> wrote:
>Peter T. Daniels wrote:

................

>Better give an example. Take the word for dog. We have

>English dog
>German Hund
>Italian cane
>French chien
>Spanish perro

>Five languages, four cognate sets:

>1. dog
>2. Hund
>3. cane, chien
>4. perro

I believe that the English hound and the German
Hund are cognate with cane and chien.
--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

Jacques Guy

unread,
Dec 6, 2003, 11:50:16 AM12/6/03
to
Herman Rubin wrote:

> Jacques Guy <jg...@alphalink.com.au> wrote:

> >Better give an example. Take the word for dog. We have
>
> >English dog
> >German Hund
> >Italian cane
> >French chien
> >Spanish perro

> >Five languages, four cognate sets:

> >1. dog
> >2. Hund
> >3. cane, chien
> >4. perro

> I believe that the English hound and the German
> Hund are cognate with cane and chien.

So do I, but that is not relevant to the puzzle
which was: whence those "2,449 cognate sets"
in 87 languages? Now, let's add in 'hound'
just for demonstration's sake. We get:

                  Language:
Set:              English  German  Italian  French  Spanish

1 (dog)              1        0       0        0       0
2 (hound)            1        1       0        0       0
3 (cane)             0        0       1        1       0
4 (perro)            0        0       0        0       1

The only difference is that the sum of each column,
which previously was always 1, can now be anything,
up to the number of cognate sets.

But the matrix-building process remains the same.
As to what they did with those "errant" columns,
I have no idea. There is more than one way to skin
a matrix.

Jacques Guy

unread,
Dec 6, 2003, 12:06:45 PM12/6/03
to
I stuffed up that binary matrix. I've only had my
first cup of coffee and I am not properly awake.

Instead of:

>                   Language:
> Set:              English  German  Italian  French  Spanish
>
> 1 (dog)              1        0       0        0       0
> 2 (hound)            1        1       0        0       0
> 3 (cane)             0        0       1        1       0
> 4 (perro)            0        0       0        0       1

Read:

>                   Language:
> Set:              English  German  Italian  French  Spanish
>
> 1 (dog)              1        0       0        0       0
> 2 (cane)             1        1       1        1       0
> 3 (perro)            0        0       0        0       1

In my first post, I took the first word
that popped to mind, and gave more than one
cognate set, and I just overlooked *kwon. So when I
just answered "So do I" I was still thinking
'hound' and 'Hund', not *kwon.
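
The point about column sums can be checked mechanically. The sketch below compares the coding from the first post (one word per language for 'dog') with the corrected coding in which 'hound' is counted as an English member of the *kwon set. The dictionary names and the helper `column_sums` are made up for illustration.

```python
languages = ["English", "German", "Italian", "French", "Spanish"]

# Coding from the first post: each language contributes exactly one word for 'dog'.
first_post = {
    "dog": {"English"},
    "Hund": {"German"},
    "cane": {"Italian", "French"},
    "perro": {"Spanish"},
}

# Corrected coding: hound, Hund, cane and chien all go in one set (*kwon),
# so English now appears in two sets for the same meaning.
corrected = {
    "dog": {"English"},
    "cane (*kwon)": {"English", "German", "Italian", "French"},
    "perro": {"Spanish"},
}


def column_sums(sets):
    """For each language, count the cognate sets in which it has a member."""
    return {
        lang: sum(lang in members for members in sets.values())
        for lang in languages
    }


print(column_sums(first_post))
print(column_sums(corrected))
```

In the first coding every language column sums to 1; in the corrected one English sums to 2, which is the "errant" column mentioned earlier.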

Peter T. Daniels

unread,
Dec 5, 2003, 7:26:15 PM12/5/03
to
Herman Rubin wrote:
>
> In article <3FCEAF...@alphalink.com.au>,
> Jacques Guy <jg...@alphalink.com.au> wrote:
> >Peter T. Daniels wrote:
>
> ................
>
> >Better give an example. Take the word for dog. We have
>
> >English dog
> >German Hund
> >Italian cane
> >French chien
> >Spanish perro
>
> >Five languages, four cognate sets:
>
> >1. dog
> >2. Hund
> >3. cane, chien
> >4. perro
>
> I believe that the English hound and the German
> Hund are cognate with cane and chien.

Swadesh did worry about what to do when you knew that A had a cognate
with B but it wasn't the current semantic equivalent. Eng. "hound"
doesn't mean the same as Ger. "Hund," and you're supposed to disregard
"hound" and use "dog."

Felix Tilley

unread,
Dec 6, 2003, 12:21:41 AM12/6/03
to
Fix your fucking clock. You are not at -0800.


In article <3FD20C...@alphalink.com.au>, Sat, 06 Dec 2003 10:06:45
-0700, "Jacques Guy" <jg...@alphalink.com.au> wrote:
Date: Sat, 06 Dec 2003 09:06:45 -0800
From: Jacques Guy <jg...@alphalink.com.au>
Reply-To: jg...@alphalink.com.au
Organization: rather disorganized
X-Mailer: Mozilla 3.0 (Win95; I; 16bit)
Newsgroups: sci.lang
Subject: Nom d'un chien! (Re: New IE tree from Gray and Atkinson)
References: <pan.2003.11.29....@mail.utexas.edu>
<3fcadd7b$0$27030$626a...@news.free.fr> <3FCD1F...@worldnet.att.net>
<3FCEAF...@alphalink.com.au> <bqqio1$1s...@odds.stat.purdue.edu>
<3FD208...@alphalink.com.au>
NNTP-Posting-Host: d33-ds1-mel.alphalink.com.au
X-Trace: news 1070662007 202.161.99.161 (6 Dec 2003 09:06:47 +1100)
Lines: 28
Path:

Richard Herring

unread,
Dec 8, 2003, 7:11:58 AM12/8/03
to
In message <vt2pr6h...@news.supernews.com>, Felix Tilley
<fti...@localhost.localdomain> writes

>Fix your fucking clock. You are not at -0800.

Why do you care?

Any thoughts about the Great Tone Shift, BTW?

--
Richard Herring

Herman Rubin

unread,
Dec 9, 2003, 12:24:09 PM12/9/03
to
In article <3FD208...@alphalink.com.au>,

>> Jacques Guy <jg...@alphalink.com.au> wrote:

But if we accept that hound is cognate with cane,
this changes it to

1 (dog)              1        0       0        0       0
2 (hound)            1        1       1        1       0
4 (perro)            0        0       0        0       1

>The only difference is that the sum of each column,
>which previously was always 1, can now be anything,
>up to the number of cognate sets.

>But the matrix-building process remains the same.
>As to what they did with those "errant" columns,
>I have no idea. There is more than one way to skin
>a matrix.

Herman Rubin

unread,
Dec 9, 2003, 12:32:40 PM12/9/03
to
In article <3FD122...@worldnet.att.net>,

................

However, when it comes to the origin of the language, that
the original "hound" was replaced by "dog", which I believe
comes from a particular breed, is irrelevant to English
coming from a Germanic source.

Peter T. Daniels

unread,
Dec 9, 2003, 2:29:47 PM12/9/03
to

It is, however, relevant for the notion of "lexical retention," which is
what glottochronology was operating on.
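
For readers who have not met it, the classic glottochronological estimate turns a shared-retention fraction into a time depth with t = ln(c) / (2 ln(r)), where c is the fraction of list items still cognate between two languages and r is the assumed retention rate per millennium. A minimal sketch follows; the function name is invented, and the default rate of 0.805 is an assumption based on the commonly cited figure of roughly 80 percent per millennium for the 200-item list. Nothing in this thread shows that the article used the formula in this form.

```python
import math


def divergence_time_millennia(shared_fraction, retention_rate=0.805):
    """Classic estimate t = ln(c) / (2 * ln(r)), in millennia.

    shared_fraction: fraction of list items still cognate between two languages.
    retention_rate:  assumed fraction of items one language retains per
                     millennium (0.805 is an illustrative default, not a
                     value taken from the article under discussion).
    """
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0 < retention_rate < 1:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_fraction) / (2 * math.log(retention_rate))


# Two languages sharing half the list would be dated to roughly 1.6 millennia apart.
print(round(divergence_time_millennia(0.5), 2))
```

Replacing 'hound' with 'dog' lowers the shared fraction for English against its relatives, which is why such replacements matter for this kind of dating even though they are irrelevant to classification.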

Jacques Guy

unread,
Dec 10, 2003, 11:21:02 AM12/10/03
to
Herman Rubin wrote:

[snip all]

because I have moving back up (down?) towards the _root_
of this discussion _tree_.

I have received a photocopy of the complete article,
courtesy of someone who certainly would wish to remain
unknown although he/she/it/they did not specifically
request so.

I cannot work out what Gray and Atkinson precisely did,
nor how they did it precisely. The information is missing.

However, there is just enough in an ancillary table
on the Nature site, publicly accessible, to figure out that
part of their results are the fruit of begging the
question, and the rest (circa 80%) is at the mercy of the
"glottochronological clock", which is as accurate
as a sundial at night:

"http://www.nature.com/nature/journal/v426/n6965/extref/nature02029-s1.doc"

It's only 35K: 14 "dates of splits" conjectured by Sheila
Embleton (on the basis of datable historical evidence).

I don't know how my pirate correspondent came across that doc file,
but do download it and see.

In conclusion, their article cannot be convincingly rebutted
since it gives insufficient details of what they did. At any
rate, you don't rebut something published in Nature. They
are good at asking for a time-consuming rewrite in order
to ignore it.

Convenient.

(Need at least another cup of coffee. Took the time to
proof-read my post, and it was chock-a-block full of
typos and missing words. Might be some left still).

Peter T. Daniels

unread,
Dec 9, 2003, 6:35:18 PM12/9/03
to
Jacques Guy wrote:
>
> Herman Rubin wrote:
>
> [snip all]
>
> because I have moving back up (down?) towards the _root_
> of this discussion _tree_.
>
> I have received a photocopy of the complete article,
> courtesy of someone who certainly would wish to remain
> unknown although he/she/it/they did not specifically
> request so.
>
> I cannot work out what Gray and Atkinson precisely did,
> nor how they did it precisely. The information is missing.
>
> However, there is just enough in an ancillary table
> on the Nature site, publicly accessible, to figure out that
> part of their results are the fruit of begging the
> question, and the rest (circa 80%) is at the mercy of the
> "glottochronological clock", which is as accurate
> as a sundial at night:
>
> "http://www.nature.com/nature/journal/v426/n6965/extref/nature02029-s1.doc"
>
> It's only 35K: 14 "dates of splits" conjectured by Sheila
> Embleton (on the basis of datable historical evidence).

Embleton is credited with only the first three; all the items except
"Irish/Welsh" are attributed to Gamkrelidze & Ivanov or a website called
"IE Chronology."

> I don't know how my pirate correspondent came across that doc file,
> but do download it and see.

See the very end of p. 438. Only subscribers can access it.

> In conclusion, their article cannot be convincingly rebutted
> since it gives insufficient details of what they did. At any
> rate, you don't rebut something published in Nature. They
> are good at asking for a time-consuming rewrite in order
> to ignore it.
>
> Convenient.
>
> (Need at least another cup of coffee. Took the time to
> proof-read my post, and it was chock-a-block full of
> typos and missing words. Might be some left still).

See your first line.
