IE (-8700)
  Hittite
  Other (-7900)
    Tocharian (two listed, split -1700)
    Other (-7300)
      Armenio-Greek (no date)
        Armenian (two listed)
        Greek (several listed, split -800)
      Other (-6900)
        Albanio-Indo-Iranian (no date)
          Albanian (several listed, split -600)
          Indo-Iranian (-4600)
            Indic (several, with some sub-structure, split -2900)
            Iranian (several, with some sub-structure, split -2500)
        Other (-6500)
          Balto-Slavic (-3400)
            Baltic (three listed)
            Slavic (several, with some sub-structure, split -1300)
          Germano-Italo-Celtic (-6100)
            Celtic (several, with some sub-structure, split -2900)
            Germano-Italic (-5500)
              Germanic (-1750)
                West Germanic (several, with some sub-structure)
                North Germanic (several, with some sub-structure)
              Italic (several, with some sub-structure, split -1700)
Comments:
o The labels are mine. "Other" is just a substitute for really awkward
descriptions such as "Balto-Slavo-Germano-Italo-Celtic".
o The dates indicate their calculation for when the forks took place.
E.g. the -8700 for "IE" isn't really a date for PIE, but their date for
the fork that split "Hittite" from "Other" at the top level of the
tree.
o Notice that they have some very substantive splits at 400-year
intervals: -7300, -6900, -6500, -6100. Presumably an artifact of
interpolation?
o It appears that they mostly work with living languages, though there
are several old ones sprinkled in (Hittite, Tocharian A/B, etc.)
o Hittite is represented by a single language in their study, though we
know there was a family of closely related languages that we call
the Anatolian family.
o We know there was also an East Germanic family, but no exemplars were
used in their study. (Gothic is the most familiar.)
o Linguists will lament the missing dead languages as evidence that
could be used for an important sanity check on the method. (For
example, the Greek dialects listed split at 800 YBP, with no other
date on that branch of the tree all the way back to 7300 YBP. If
Linear B and the Classical dialects had been included, where would
their splits have fallen in that range? Similarly, the Romance
languages are shown to split up at 1700 YBP, and the Italic from the
Germanic at 5500 YBP, but where would Latin and the other known
Italic languages have fallen? Similarly for other obvious languages
such as Sanskrit, Old Persian, and Gothic.)
o Still not having read the article, I don't know what the heck some of
their "languages" refer to. E.g. "Armenian Mod" is surely Modern
Armenian, but it shares a branch with "Armenian List". A number of
other leaves on the tree are "List" items too. Greek is listed with
"Mod", "MD", "ML", "D", and "K" variants, and some of the other
languages also have two or three forms listed.
o Some linguists have claimed the existence of an Italo-Celtic group,
but others have demurred. This study rejects the Italo-Celtic
hypothesis, claiming that Germanic is more closely related to Italic
instead. (But it does claim that Celtic is the closest relation to
their Germano-Italic.)
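The 400-year spacing noted in the comments is easy to check. A minimal sketch in Python, using only the fork dates from the tree as labelled above; the variable names are mine and nothing here comes from the article:

```python
# Fork dates along the nested "backbone" of the tree, oldest first
# (negative = years before present, as read off the published figure).
fork_dates = [-8700, -7900, -7300, -6900, -6500, -6100, -5500]

# Gap in years between each fork and the next one nested inside it.
gaps = [later - earlier for earlier, later in zip(fork_dates, fork_dates[1:])]

for date, gap in zip(fork_dates, gaps):
    print(f"fork at {date}: next fork {gap} years later")
```

Three consecutive gaps come out at exactly 400 years, which is what prompted the question about interpolation.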
I apologize in advance for any typos or thinkos...
--
Bobby Bryant
Austin, Texas
> Albanio-Indo-Iranian (no date)
> Albanian (several listed, split -600) Indo-Iranian (-4600)
> Indic (several, with some sub-structure, split -2900)
> Iranian (several, with some sub-structure, split -2500)
If your newsreader displays that like mine does, notice the missing LF/CR
between Albanian and Indo-Iranian:
Albanio-Indo-Iranian (no date)
  Albanian (several listed, split -600)
  Indo-Iranian (-4600)
    Indic (several, with some sub-structure, split -2900)
    Iranian (several, with some sub-structure, split -2500)
I am just back from looking for a certain e-mail on my old computer.
Alas, I did not find it.
Several months ago I received an e-mail from someone in New Zealand,
mentioning a co-authored article in Nature that he'd had published.
He was asking for the source code of GLOTSIM, a program I wrote as
part of the GLOTTO package (you'll find it at garbo.uwasa.fi),
which lets you grow a language family in vitro. I sent him the
stuff, including an article of mine (also part of that package)
demonstrating that the basic tenets of glottochronology are
rubbish, and that it is impossible to reconstruct the root of
a phylogenetic tree, whatever the method used for its reconstruction.
I am certain that that Gray fellow is that person: there aren't many
articles on comparative linguistics published in Nature, and I did
a search of their archives.
I imagine that the article says nothing of the impossibility
of reconstructing a rooted tree, and nothing again of how
the relative retention rates of languages can be computed.
Of course. It's sooooo much more uninteresting to admit:
"can't do it. No evidence."
What a pity I deleted that e-mail. Rubbing the miserable cunt's
nose in it would have afforded me great pleasure.
The article the little cunt in question did mention was about
using an exceedingly computationally expensive best-fit method to
find the shortest tree. The method, apparently, is used by
biologists on DNA sequences. It is mindless brute force and
requires hours of computation since, given N terminal nodes, it
examines all possible trees. Even if you arbitrarily
restrict the possibilities to binary trees, like another cretin (*)
recently came trumpeting about on this newsgroup, that involves
examining un sacré foutu number of trees, pardon my French.
* Can't find him in DejaNews, though, only some comments (do
a search on "polytomy")
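To put a number on that: the standard count of distinct rooted binary trees on n labelled leaves is (2n-3)!!, the product of the odd numbers up to 2n-3. A minimal Python sketch follows; the function name is hypothetical, and the 87 is simply the language count quoted elsewhere in this thread, not a claim about what any particular program actually enumerates:

```python
def rooted_binary_trees(n_leaves: int) -> int:
    """Count rooted, labelled, strictly binary trees: (2n-3)!!"""
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-3 (empty product for n = 2).
    for k in range(3, 2 * n_leaves - 2, 2):
        count *= k
    return count

for n in (4, 10, 20, 87):
    total = rooted_binary_trees(n)
    print(f"{n} leaves: a {len(str(total))}-digit number of trees")
```

Ten leaves already give tens of millions of trees, and for 87 the count has well over a hundred digits, so an exhaustive search is out of the question at that size even with the binary restriction.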
Tocharian as the very first (post-Hittite) defector? No way!
--
Peter T. Daniels gram...@att.net
[...]
>Even if you arbitrarily
>restrict the possibilities to binary trees, like another cretin (*)
>recently came trumpeting about on this newsgroup, that involves
>examining un sacré foutu number of trees, pardon my French.
>* Can't find him in DejaNews, though, only some comments (do
>a search on "polytomy")
Stephen C. Carlson.
Brian
Why not?
Acke
> Peter T. Daniels wrote:
>
>> Tocharian as the very first (post-Hittite) defector? No way!
>
> Why not?
It actually conforms to my pet theory, namely that the Anatolian and
Tocharian languages are descended from the periphery of an archaic IE
sprachbund, whereas the other languages spread from coreward at a later
date, bearing innovations such as the more elaborate case system
traditionally ascribed to PIE. (Notice that I am implicitly adopting the
wave model rather than the tree model, at least at the base layer of PIE
and a hypothetical post-PIE IE sprachbund. The well documented peripheral
movements of e.g. Celtic and Indo-Iranian in late prehistory would likely
have generated genuine cladistic splits, if such did not already exist by
then.)
I'm rather skeptical of the authors' Germano-Italic, but I don't even balk
at that as much as I do at the dates. The authors claim that their model
supports the "Out of Anatolia" theory of the IE homeland, and I just can't
see Anatolia as having an IE affiliation before ~2000 BCE. "We"
(excluding the authors, of course) just know too much about the language
situation in Anatolia at ~2000 BCE to allow such a thing. With a
sprinkling of IE names showing up in the Hattic lands around that time (as
shown by the correspondence of the Assyrian merchant colonies), you would
have to believe that PIE was 'invented' in a Hattic context. And if you
try to date everything back to ~7000 BCE you have to pretend you believe
that these proto-Anatolian speakers hung around as a linguistic minority
for 5000+ years before coming into their own. I suppose you could spin a
yarn, but you'd have a hard time spinning one that convinces me.
As I mentioned elsewhere in the thread, it would be nice if someone would
put in some more dead languages of known dates and re-run the authors'
algorithms, to see whether the result simultaneously (a) retains the
structure and dates of the published tree, and (b) gives plausible dates
for the added languages. I would particularly like to see Linear B and
the Classical Greek dialects added, to make sure the authors' methodology
doesn't ascribe recent dates to them. (Frankly, I'm somewhat troubled
that the authors didn't provide dates for some of the very major splits.
If I had been a reviewer, I would have asked for a correction on that
before publishing. Leaving them out accidentally smacks of carelessness;
leaving them out non-accidentally raises even bigger questions.)
On the plus side, we like hypotheses that make falsifiable predictions.
Their hypotheses (that the methodology and the resulting tree are both
'correct') predict that you _can_ add in the missing languages and get
results that are consistent with my (a) and (b) above. Has anyone got the
time/resources/interest to put the authors' hypotheses to the test? (And
fill in the dates missing from the published report?)
Have you read the article itself? It would be interesting to know how
much they have made adjustments for loan-words and Sprachbunds, which
probably could "pollute" the method. Looking at English and French
vocabulary, you could get the impression that these languages are very
closely related, even though a brief look at the history of the
languages shows that their roots are quite far apart. This might explain
the "Germano-Italic" group. It could perhaps also explain the dates.
Acke
See e.g. Hamp in the Pergamon Encyclopedia of Lang & Ling (ed. Asher &
Simpson), or in Markey & Greppin, When Worlds Collide, or my review of
Mair's Bronze Age & Early Iron Age Peoples of Eastern Central Asia, in
Mair's "journal" Sino-Platonic Papers 98 (Jan 2000): 4-46 (my review of
Ryan & Pitman, incidentally, is on pp. 1-3).
It's only 5 pages!
More relevant would be a comparison with Dyen, Kruskal, & Black, TAPS
82/5 (1992), which was designed to test lexicostatistical (not
glottochronological) technique by applying it to modern IE (so no
Anatolian or Tocharian), so that it would be useful in evaluating the
technique as applied to most of the families for which it's used.
It's unlikely that we have much of the 200-word list for Hittite and
maybe for Tocharian anyway, so how could they be included at all?
Will do next time I come to a town where they have a library that might
have them. Until then, I still have no clue why not.
Acke
[...]
>Have you read the article itself? It would be interesting to know how
>much they have made adjustments for loan-words and Sprachbunds, which
>probably could "pollute" the method.
According to the write-up in Der Spiegel, they tried to weed out
of the data anything that might be a borrowing rather than a true
cognate:
Gray and Atkinson, however, believe that their methods have
solved these problems. They additionally safeguarded their
analysis by removing doubtful cognates where it was not
certain whether they went back to a common origin or had
been borrowed directly from a foreign language. This, they
write, has probably led to an underestimate of the age of
the Indo-European language family.
The article in the International Herald Tribune gave a little
information on the technique itself:
Gray, who was joined by Quentin Atkinson, another
researcher in his department, fed a database of
information on cognates into a computer, along with
14 dates for language splits that are known from the
historical record. The computer then generated a large
series of possible family trees for the languages, as
well as timings for the various splits.
I had not previously seen the bit about using known dates.
[...]
Brian
More like 3 pages. The first page has only the summary/introduction, the
last page is just the acknowledgements, and the references take up 1/3 of
the 4th page.
>
> More relevant would be a comparison with Dyen, Kruskal, & Black, TAPS
> 82/5 (1992), which was designed to test lexicostatistical (not
> glottochronological) technique by applying it to modern IE (so no
> Anatolian or Tocharian), so that it would be useful in evaluating the
> technique as applied to most of the families for which it's used.
>
> It's unlikely that we have much of the 200-word list for Hittite and
> maybe for Tocharian anyway, so how could they be included at all?
They say:
To facilitate reconstruction of some of the oldest language
relationships, we added three extinct Indo-European languages, thought
to fit near the base of the tree (Hittite, Tocharian A and Tocharian
B). Word form and cognacy judgements for all three languages were made
on the basis of multiple sources to ensure reliability.
No indication what these "multiple sources" are.
Nu, how much do you already know about Tocharian?
I've just received a .pdf of the Nature article in question, and a
companion piece by a science writer. I'll read them soon.
> No indication what these "multiple sources" are.
Well, here is something I found on aus.history (yes, it has
percolated to the antipodes!)
"In striking agreement with the Anatolian hypothesis, our analysis
of a matrix of 87 languages with 2,449 lexical items produced an
estimated age range for the initial Indo-European divergence
of between 7,800 and 9,800 years BP."
2,449 lexical items in 87 languages? Two thousand four hundred
and forty-nine? Pull the other one. And who did the cognate
recognition?
"between 7,800 and 9,800 years BP" is typical of
the bogus dates churned out by good ol' time
glottochronology, a.k.a. lexicostatistics, à la
Sarah Gudschinsky, Isidore Dyen, et al.
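The source was:
"http://www.nature.com/cgi-taf/DynaPage.taf?file=/nature/journal/v426/n6965/abs/nature02029_fs.html&dynoptions=doi1070197857"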
(I _thought_ I had already posted this link here, but
I can find no trace of it. Oh, well, bis repetita placent)
That's a bit over 28 words per language. Did ANYONE referee the
article??
> "between 7,800 and 9,800 years BP" is typical of
> the bogus dates churned out by good ol' time
> glottochronology, a.k.a. lexicostatistics, à la
> Sarah Gudschinsky, Isidore Dyen, et al.
>
> The source was:
>
> "http://www.nature.com/cgi-taf/DynaPage.taf?file=/nature/journal/v426/n6965/abs/nature02029_fs.html&dynoptions=doi1070197857"
>
> (I _thought_ I had already posted this link here, but
> I can find no trace of it. Oh, well, bis repetita placent)
Not much. Like most people nowadays, I don't speak it fluently. There
are numerous problems, one of which is that many of the texts we have, both
of Tocharian A and Tocharian B, seem to be translations from Sanskrit -
and we don't know how idiomatic they are. It's a centum language that is
geographically in the wrong place. People have claimed to find
similarities between Tocharian and several groups of European languages,
but none so convincing that it can be safely said to relate to any of
them directly.
> I've just received a .pdf of the Nature article in question, and a
> companion piece by a science writer. I'll read them soon.
Looking forward to your feedback.
Acke
> 2,449 lexical items in 87 languages? Two thousand four hundred and
> forty-nine? Pull the other one. And who did the cognate recognition?
>
> "between 7,800 and 9,800 years BP" is typical of the bogus dates churned
> out by good ol' time glottochronology, a.k.a. lexicostatistics, à la
> Sarah Gudschinsky, Isidore Dyen, et al.
It happens that their primary source of data is "Dyen et al.", a data file
that's listed as being on line at
<http://www.ntu.edu.au/education/langs/ielex/IE-DATA1>, with the three
dead languages added from unspecified sources.
> It happens that their primary source of data is "Dyen et al.", a data file
> that's listed as being on line at
> <http://www.ntu.edu.au/education/langs/ielex/IE-DATA1>, with the three
> dead languages added from unspecified sources.
As André Haudricourt told me some 20 years ago: "It's odd: Dyen is
Jewish, and Jews are generally intelligent people, but Dyen is a
complete imbecile." Which tallied with what
I knew of the chap. Challenged on the statistical assumptions
of glottochronology, he countered: "Don't talk to me about
statistics. Talk to my co-authors. _I_ know nothing of statistics,
it does not concern _me_". Pers. comm., Grand Bali Beach Hotel,
Sanur, January 1981 (you bet I remember! It's not every day
you hear such an asinine reply).
Unfortunately, I don't know any of the principals.
BTW, is there a standard work on "why not glottochronology" / "why not
lexicostatistics" ?
> Peter T. Daniels wrote:
>> Nu, how much do you already know about Tocharian?
>
> Not much. Like most people nowadays, I don't speak it fluently. There
> are numerous problems, one of which is that many of the texts we have, both
> of Tocharian A and Tocharian B, seem to be translations from Sanskrit -
> and we don't know how idiomatic they are. It's a centum language that is
> geographically in the wrong place. People have claimed to find
> similarities between Tocharian and several groups of European languages,
> but none so convincing that it can be safely said to relate to any of
> them directly.
Interestingly, the two Tocharian languages supposedly group with the
Anatolian language on lots of isoglosses. (Including the famous
centum/satem isogloss, though I don't see any reason to give it more
weight than any other.)
> BTW, is there a standard work on "why not glottochronology" / "why not
> lexicostatistics" ?
Bergsland and Vogt, 1962 (in Current Anthropology). Actually,
Lees' demonstration of the constant rate of change of languages
is also a very nice demonstration that the rate is not
constant at all, and of how to make figures mean the
contrary of what they mean, and of how linguists fell for
it, Dell Hymes first and foremost. As for Dyen, I don't
think he ever fell into the magic potion: he was born in
it. Ah, found them!
1953 Lees, Robert B. The Basis of Glottochronology. Language
29/2:113-127 (But I don't think Lees did it on purpose, he
was just your common ignoramus).
1962 Bergsland, Knut and Hans Vogt. On the Validity of
Glottochronology. Current Anthropology 3/2:115-153.
(The only serious article on the subject for many years).
And, of course, the classic:
1956 Gudschinsky, Sarah C. The ABC's of Lexicostatistics
(Glottochronology). Word 12/2:175-210.
Almost on par with Zecharia Sitchin's ramblings, but
sorry, Sarah, still no cigar, you'll have to do better,
good ol' Zecharia still beats you to it.
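For anyone who hasn't met the calculation being criticised: the textbook glottochronological estimate is usually given as t = ln(c) / (2 ln(r)), with c the fraction of the test list still shared by two languages and r the assumed retention rate per millennium. A minimal Python sketch follows; the function name is hypothetical, and the default of 0.805 is the figure usually quoted for the 200-item list, used here only as an illustrative assumption:

```python
import math

def divergence_millennia(shared: float, retention: float = 0.805) -> float:
    """Textbook glottochronological estimate t = ln(c) / (2 ln(r)).

    shared    -- fraction of the test list still cognate in both languages (c)
    retention -- assumed fraction retained per language per millennium (r);
                 the default is illustrative, not a measured constant
    """
    return math.log(shared) / (2 * math.log(retention))

# Same pair of languages, same cognate count, three assumed retention rates.
for r in (0.76, 0.805, 0.86):
    print(f"r = {r}: {divergence_millennia(0.5, r):.2f} millennia")
```

The only point of the loop is that the estimated date moves a long way when the assumed r moves a little, so everything hangs on the rate really being constant.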
> Interestingly, the two Tocharian languages supposedly group with the
> Anatolian language on lots of isoglosses.
BTW, "on" = "with respect to". I.e., on the same side of the line.
Name some of them, and explain why Douglas Q. Adams doesn't agree with
you?
Adams himself suggests 1sg. *-wi (Tocharian + Anatolian). A phonological
isogloss (shared archaism or innovation?) may be the lack of voiced stops
in both Anatolian and Tocharian.
Adams on the whole thinks that Tocharian is closest to Germanic, although
"[t]he relative lack of common isoglosses suggests that the pre-Tocharian
dialect(s) of Proto-Indo-European may have occupied a somewhat isolated
position vis-à-vis the other groups."
This is consistent with the view that Tocharian was the second IE group to
branch off (after Anatolian, before Germanic).
=======================
Miguel Carrasquer Vidal
m...@wxs.nl
> Adams himself suggests 1sg. *-wi (Tocharian + Anatolian). A phonological
> isogloss (shared archaism or innovation?) may be the lack of voiced stops
> in both Anatolian and Tocharian.
Could it be an artifact of the writing system? Like there being
only five vowels in Italian and none in Phoenician?
>Miguel Carrasquer wrote:
Tocharian was written in a derivative of the Brahmi script, which was able
to distinguish between D, T, DH and TH (the Brahmi letter <dh(a)> was not
needed and used to write Tocharian /tä/, with a vowel absent from
Sanskrit).
Hittite, Palaic and CLuwian were written in cuneiform (CLuwian = cuneiform
Luwian), a writing system that also could distinguish D from T in most
cases. The cuneiform Anatolian scripts do not take advantage of this, and
use D- and T-signs interchangeably. The difference between PIE *t (etc.)
vs. *d and *dh (etc.) is instead reflected in medial position by doubled
spelling of the consonant (-DD- or -TT-, interchangeably). We know these
were real geminates, because they block lengthening of those stressed
vowels (í, ú) which otherwise lengthen in an open syllable only. In initial
position, the spelling cannot reflect the difference, but it's likely that
*t-, *d- and *dh- had merged in initial position anyway (to /t/).
>
> The article in the International Herald Tribune gave a little
> information on the technique itself:
>
> Gray, who was joined by Quentin Atkinson, another
> researcher in his department, fed a database of
> information on cognates into a computer, along with
> 14 dates for language splits that are known from the
> historical record. The computer then generated a large
> series of possible family trees for the languages, as
> well as timings for the various splits.
>
> I had not previously seen the bit about using known dates.
>
> [...]
>
> Brian
One good test of their technique would be to plug in the cognate data
without providing any known dates and see if their algorithms generate
the correct dates for splits evidenced in the historical record. To
anyone who's actually read the article: did they do this to show the
validity of their technique?
Danny
> >>>>>>>Tocharian as the very first (post-Hittite) defector? No way!
> >>>>>>
> >>>>>>Why not?
> >>>
> >>>See e.g. Hamp in the Pergamon Encyclopedia of Lang & Ling (ed. Asher &
> >>>Simpson), or in Markey & Greppin, When Worlds Collide, or my review of
> >>>Mair's Bronze Age & Early Iron Age Peoples of Eastern Central Asia, in
> >>>Mair's "journal" Sino-Platonic Papers 98 (Jan 2000): 4-46 (my review of
> >>>Ryan & Pitman, incidentally, is on pp. 1-3).
> >>
> >>Will do next time I come to a town where they have a library that might
> >>have them. Until then, I still have no clue why not.
> >
> > Nu, how much do you already know about Tocharian?
>
> Not much. Like most people nowadays, I don't speak it fluently. There
> are numerous problems, one of which is that many of the texts we have, both
> of Tocharian A and Tocharian B, seem to be translations from Sanskrit -
> and we don't know how idiomatic they are. It's a centum language that is
> geographically in the wrong place. People have claimed to find
> similarities between Tocharian and several groups of European languages,
> but none so convincing that it can be safely said to relate to any of
> them directly.
Hamp puts Toch. in his "NWIE," which is a group that excludes Anatolian,
Armenian-Greek, Indo-Iranian, and Italo-Celtic (and some of the minor
epigraphic remains), viz. mainly Gmc and Balto-Slavic.
> > I've just received a .pdf of the Nature article in question, and a
> > companion piece by a science writer. I'll read them soon.
>
> Looking forward to your feedback.
Here it is. I'm Attaching a Text Only With Line Breaks file, which
should simply show up as text. If it doesn't, I'll do it again by
cut-and-pasting.
I don't have primary sources to hand, but I'm sure you'll remember that
Mallory reproduces a diagram from Anttila (1972) showing 24 isoglosses, of
which only four separate Hittite and Tocharian. (Compare 2 separating
Baltic and Slavic, 2 separating E and W Germanic, 9 separating Hittite and
Greek, etc.) Unfortunately Mallory doesn't supply the key to the numbered
isoglosses, so I suppose we'll have to let Adams and Anttila duke it out.
I've seen many an Indo-Europeanist scratch their head over that chart of
Anttila's. Unfortunately that was the textbook Gene Gragg chose for
Historical Linguistics in 1973! (Do you want a transcription of the
isogloss list?)
If it's convenient, yes please.
1. centum | satem [right]
2. -ss- | -st-, -tt- [right]
3. ao@ | a, âô | ô [inside]
4. eao | a [inside]
5. s | h [inside]
6. CVRC | CRVC [inside]
7. k^W | p [inside]
8. e- | 0 'past' [left, outside]
9. -osyo 'genitive' [right, inside]
10. -r | -i 'present' [right, outside]
11. -m- | -bh- 'case marker' [below]
12. -to- | -mo- 'ordinal' [below]
13. -u 'imperative' [inside]
14. proti | poti 'preposition' [inside]
15. secondary endings (without no. 10 -i) [below]
16. feminine nouns with masculine endings [inside]
17. -ad 'ablative' | 'genitive' [inside]
18. new tense system from perfect [inside]
19. umlaut [inside]
20. -ww-, -jj- | stop + w, j [outside]
21. -ggj- | -ddj- [right] (no. 20)
22. laryngeals as h's [inside]
23. uncontracted reflexes of sequence *yH [inside]
24. unit pronouns | particles + enclitic pronouns [inside]
Don't ask me to explain any of it!