separated only 200 years ago. Rate variation can be taken into account in the maximum likelihood approach, essentially by choosing trees with the minimum amount of variation necessary to fit known dates of language divergence.
The mathematical techniques for addressing both word borrowing and variation in evolution rate were available because biologists had encountered the same two problems in drawing up trees based on DNA data. As with languages, some genes evolve at faster rates than others. And just as words may be borrowed instead of inherited, an organism may acquire genes through borrowing as well as by inheritance; bacteria, for instance, transfer packets of genes to each other, which is why they so quickly acquire genes for resistance to antibiotics. In one maximum likelihood approach currently favored by biologists, called the Bayesian Markov chain Monte Carlo method, the DNA sequences of various genes are fed into a computer that generates a large number of possible trees by which the genes might be related. The program samples the classes of tree that seem most promising (there are far too many for even the fastest computer to examine each one), and then repeats the whole process a large number of times. At each iteration there are fewer promising trees, and eventually the process will converge on a single, most probable tree to account for the data.
With this powerful tree-drawing technique, Gray and his colleague Quentin Atkinson have constructed a family tree of Indo-European. For data, Gray relied on a 200-word Swadesh list for 84 Indo-European languages drawn up by the linguist Isidore Dyen, to which he added data from three extinct languages (Hittite and the two versions of Tokharian, known as Tokharian A and B).
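The sampling idea behind the Markov chain Monte Carlo method can be illustrated with a toy sketch. It is not Gray and Atkinson's program: it considers only four hypothetical languages (A, B, C, D), for which just three unrooted trees exist, and it scores each tree with a simple stand-in (a penalty per word gain or loss) rather than a full likelihood model. The cognate data, the `penalty` value and all function names are assumptions made up for illustration.

```python
import math
import random

# The three possible unrooted trees for four languages, each written as the
# two pairs that sit on either side of the tree's single internal branch.
TOPOLOGIES = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

# Invented data: 1 means the language has a given cognate word, 0 means not.
COGNATES = (
    [{"A": 1, "B": 1, "C": 0, "D": 0}] * 4
    + [{"A": 1, "B": 0, "C": 1, "D": 0}]
    + [{"A": 1, "B": 0, "C": 0, "D": 0}]
)

def changes_needed(topology, character):
    """Fewest gains or losses that explain one cognate on this tree."""
    (a, b), (c, d) = topology
    ones = sum(character[x] for x in (a, b, c, d))
    if ones in (0, 4):
        return 0          # all languages agree
    if ones in (1, 3):
        return 1          # a single language differs
    # Two-against-two: one change if the split matches the tree, else two.
    return 1 if character[a] == character[b] else 2

def log_score(topology, data, penalty=2.0):
    """Stand-in for a log-likelihood: each change costs `penalty`."""
    return -penalty * sum(changes_needed(topology, ch) for ch in data)

def metropolis(data, steps=5000, seed=0):
    """Wander among trees, visiting each in proportion to its score."""
    rng = random.Random(seed)
    current = rng.choice(TOPOLOGIES)
    visits = {t: 0 for t in TOPOLOGIES}
    for _ in range(steps):
        proposal = rng.choice(TOPOLOGIES)   # symmetric proposal
        log_ratio = log_score(proposal, data) - log_score(current, data)
        # Always accept a better tree; accept a worse one only sometimes.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        visits[current] += 1
    return visits

if __name__ == "__main__":
    for tree, count in metropolis(COGNATES).items():
        print(tree, count)
```

The tree visited most often is the sampler's best estimate. With real data the number of candidate trees is astronomically large, which is why the sampler proposes small changes to the current tree instead of choosing among all trees at random as this sketch does.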
Gene trees can often be anchored in real time by matching a date from the fossil record to one of the tree's branch points. The same can be done with maximum likelihood trees constructed for languages. Having found the statistically most likely tree to account for the Indo-European data, Gray then constrained certain branch points in the tree to fit attested historical dates for divergence of certain languages. Hittite must have been a separate language by 1800 BC, the date of the oldest known inscription. Greek must have been separate by 1500 BC, the date of the Linear B inscriptions. Latin and Romanian started to diverge when Roman troops withdrew south of the Danube in AD 270. Altogether Gray plugged in 14 known dates, constraining the tree to fit itself to the dates in the most statistically probable way. Because the branch lengths of the tree are proportional to elapsed time, anchoring the tree to historical events allows all the other branch points in the tree to be dated. Gray's tree was published in Nature in November 2003, with a terse description of the rather complex methodology behind its construction.269 The first reaction of many historical linguists was that he had done nothing new because his tree of Indo-European was just like theirs. But that very fact, in Gray's view, was the best possible validation of his method.
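The dating step can be sketched in a few lines. The sketch assumes a single constant rate of change, which is simpler than the rate-variation handling described earlier, and it treats each historical anchor as an exact age rather than a constraint. The branch depths, ages and node names below are invented for illustration and are not taken from the Gray-Atkinson tree.

```python
def years_per_unit(calibrations):
    """Least-squares scale (through the origin) from branch depth to years.

    `calibrations` is a list of (depth, known_age_in_years) pairs for the
    branch points whose dates are attested historically.
    """
    numerator = sum(depth * age for depth, age in calibrations)
    denominator = sum(depth * depth for depth, _ in calibrations)
    return numerator / denominator

def date_nodes(depths, calibrations):
    """Convert the depth of every undated branch point into years."""
    scale = years_per_unit(calibrations)
    return {name: depth * scale for name, depth in depths.items()}

# Hypothetical anchors: (depth in arbitrary tree units, age in years).
CALIBRATIONS = [(0.10, 1700.0), (0.20, 3400.0)]

# Hypothetical undated branch points and their depths in the same units.
DEPTHS = {"node_x": 0.15, "node_y": 0.30}

if __name__ == "__main__":
    for name, age in date_nodes(DEPTHS, CALIBRATIONS).items():
        print(f"{name}: about {age:.0f} years before present")
```

Because every anchor contributes to the fitted scale, a single unreliable date has less influence when many anchors are used, which is one reason for plugging in as many known dates as possible.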
FIGURE 10.2. A GENETICIST'S TREE OF THE INDO-EUROPEAN LANGUAGE FAMILY.
A tree of Indo-European was constructed by Russell Gray and Quentin Atkinson using an advanced statistical method. Because the tree is anchored to 14 known dates of recent language origin, the dates of its ancient branch points can be estimated. Figures show the years before the present at which languages split apart.
According to the Gray-Atkinson tree, the original language, called proto-Indo-European by linguists, split 8,700 years ago into two branches, of which the first led to Hittite and the second to all the other Indo-European languages. The early date assigned to proto-Indo-European suggests that it was the language of the people who introduced farming into Europe from the Middle East. English is a member of the Germanic group of languages, as are Dutch, Swedish and Icelandic. The Romance language family includes French, Italian and Spanish. Russian, Czech and Lithuanian are among the members of Balto-Slavic. Hittite, now extinct, was the language of the Hittite empire in what is now Turkey; Tokharian was spoken in western China.
The novel feature of Gray's tree was not its shape but its dates. They were very different from anything the linguists had imagined. The tree showed that proto-Indo-European was spoken more than 8,700 years ago, the date of its first split, when the branch leading to Hittite separated from all the rest. This date is nearly 3,000 years older than the date of 5,500 to 6,000 years ago favored by many historical linguists for the breakup of Indo-European.
Gray's dates, if correct, are somewhat revolutionary because they show the roots of Indo-European are far older than expected and that language can be traced back far deeper in time than most linguists think likely. Moreover, a reliable dating method would at last allow language change to be correlated with the information emerging from archaeology and population genetics.
Many linguists say Gray's dates can't be right, essentially because they conflict with the dates given by linguistic paleontology. But linguistic paleontology is a fuzzy technique, dependent on judgment and vulnerable to undetected borrowing and fallacious reconstructions. Gray's technique applies a sophisticated statistical method, of proven value in phylogeny, to a reliable data set, the Dyen list, which represents the fruit of Indo-European linguistic scholarship. As a pioneering approach, it may well need refinement, or turn out to have some unexpected flaw. But compared with linguistic paleontology, it does not seem obviously less credible.
Gray says he has great respect for the scholarship and methods of historical linguistics and hopes linguists will come around to taking his tree seriously once they understand that his technique avoids the much-discussed errors of glottochronology.
Using a simpler phylogenetic technique, Peter Forster, an archaeologist at the University of Cambridge, has drawn up a family tree of several Celtic languages including Gaulish, the version spoken in ancient France before the Roman conquest, as well as Welsh, Breton and Gaelic. Celtic is a major branch of Indo-European. Forster's tree implies that Indo-European had diverged around 10,000 years ago, and that Celtic