January 26, 2009

Bayesian phylogenetics of languages and the timing of Austronesian settlement of the Pacific from Taiwan

The same Bayesian methodology was used by the first author to conclude that the spread of Indo-European languages began in Asia Minor during the Neolithic.

From the paper:

The innovationist "pulse-pause" scenario posits that the Austronesians originated in Taiwan around 5500 years ago and spread through the Pacific in a sequence of expansion pulses and settlement pauses (2, 4–6).


The divergence time estimates for the age of the Austronesian language family support the pulse-pause scenario (Fig. 2). The estimated root age of Austronesian across all the post–burn-in trees has a mean of 5230 years [95% highest posterior density (HPD) interval, 4750 to 5800 years B.P.]. The divergence time estimates were robust across a range of calibrations and different models (28).
The concordance of Bayesian linguistics with the pulse-pause archaeological model is remarkable. So, how was the slow-boat model supported in the first place?
In contrast, proponents of the slow-boat scenario argue that the Austronesians emerged from an extensive sociocultural network of maritime exchange in Wallacea (in the region of modern day Sulawesi and the Moluccas) around 13,000 to 17,000 years B.P. based on the dating of mitochondrial lineages (11, 12).


Our estimates for the age of the Austronesian expansion are considerably younger than the deep age estimates of the slow-boat scenario (11, 12, 15). One possibility is that these deep estimates are artifacts due to problems with accurately dating genetic change. There is increasing evidence that rates of genetic change estimated over thousands of years are substantially higher than the long-term substitution rate (21). This violation of the molecular clock leads to the systematic overestimation of recent divergence times.
The problem was that a calibrated evolutionary mutation rate was used to estimate these ages, making them about 3 times older than the Austronesian expansion. Sound familiar?
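
As a rough sanity check on that factor of three, here is a minimal sketch in Python. It assumes a plain strict-clock relation, t = d / (2 × rate), which is a textbook simplification and not the dating procedure of either camp. The function names and the two rates in the usage note are hypothetical placeholders chosen only to show the scaling; the ages are the ones quoted above.

```python
# Illustrative only: a strict molecular clock, t = d / (2 * rate).
# The rates used with these functions are hypothetical, not values from the paper.

LINGUISTIC_ROOT_AGE = 5230        # mean root age from the language trees (years)
SLOW_BOAT_AGES = (13000, 17000)   # mitochondrial age range cited for the slow-boat scenario


def divergence_time(d, rate):
    """Years since two lineages split, for divergence d (substitutions per site)
    and a substitution rate (substitutions per site per year)."""
    return d / (2.0 * rate)


def overestimation_factor(short_term_rate, long_term_rate):
    """How many times too old a recent split looks when it is dated with the
    slower long-term rate instead of the faster short-term rate."""
    return short_term_rate / long_term_rate


def age_ratios():
    """Ratio of each slow-boat age bound to the linguistic root age."""
    return tuple(age / LINGUISTIC_ROOT_AGE for age in SLOW_BOAT_AGES)
```

Calling `age_ratios()` gives ratios of roughly 2.5 and 3.25, which is where "about 3 times older" comes from. Under this simplified model, a short-term rate three times the long-term one (for example `overestimation_factor(3e-8, 1e-8)`) would be enough to produce a discrepancy of that size.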

Science DOI: 10.1126/science.1166858

Language Phylogenies Reveal Expansion Pulses and Pauses in Pacific Settlement

R. D. Gray et al.


Debates about human prehistory often center on the role that population expansions play in shaping biological and cultural diversity. Hypotheses on the origin of the Austronesian settlers of the Pacific are divided between a recent "pulse-pause" expansion from Taiwan and an older "slow-boat" diffusion from Wallacea. We used lexical data and Bayesian phylogenetic methods to construct a phylogeny of 400 languages. In agreement with the pulse-pause scenario, the language trees place the Austronesian origin in Taiwan approximately 5230 years ago and reveal a series of settlement pauses and expansion pulses linked to technological and social innovations. These results are robust to assumptions about the rooting and calibration of the trees and demonstrate the combined power of linguistic scholarship, database technologies, and computational phylogenetic methods for resolving questions about human prehistory.



  1. Proto-Indoeuropean is not older than 4500 BC!
    It is ridiculous to push proto-I.E. back to 7000 BC!!!
    That would require Celts and Indo-Aryans, for example, to have discovered and named everything throughout the Neolithic in the same way and with the same words!!!
    It requires them to have named the wheel, three specific forms of chariot (one of which could not be older than 2500 BC), the plough, dairy-farming products, wool products and metallurgical terms which were not present before what Andrew Sherratt called the "secondary neolithic products' revolution"!!!
    That's impossible!

    Lexicostatistics and Glottochronology are not accepted by the Linguistic scientific community as valid ways to date languages' past.

    The concept of language change is old and its history is reviewed in Hymes (1973) and Wells (1973). Glottochronology itself dates back to the mid-20th century (see Lees 1953; Swadesh 1955, 1972). An introduction to the subject is given in Embleton (1986) and in McMahon and McMahon (2005).

    Glottochronology has ever since been controversial, partly owing to issues of precision, as well as the question of whether its basis is sound (see e.g. Bergsland 1958; Bergsland and Vogt 1962; Fodor 1961; Chretien 1962; Guy 1980). These concerns have been addressed by Dobson et al. (1972), Dyen (1973) and Kruskal, Dyen and Black (1973). The assumption of a single-word replacement rate can distort the divergence-time estimate when borrowed words are included (Thomason and Kaufman 1988). Chrétien purported to disprove the mathematics of the Swadesh-model. At a conference at Yale in 1971 his criticisms were shown to be invalid. The same conference saw the application of the theory to Creole languages. An overview of recent arguments can be obtained from the papers of a conference held at the McDonald Institute in 2000.
    There are a lot of criticisms of the theory!

    Criticism leveled against the higher stability of lexemes in Swadesh lists alone (Haarmann 1990) misses the point, because a certain amount of losses only enables the computations (Sankoff 1970).

    Thus, in Bergsland & Vogt (1962), the authors make an impressive demonstration, on the basis of actual language data verifiable by extra-linguistic sources, that the "rate of change" for Icelandic constituted around 4% per millennium, whereas for Riksmal (Literary Norwegian) it would amount to as much as 20%. (Swadesh's proposed "constant rate" was supposed to be around 14% per millennium).

    This and several other similar examples effectively proved that Swadesh's formula would not work on all available material—a serious accusation considering that evidence that can be used to "calibrate" the meaning of L (i. e. language history recorded during prolonged periods of time) is not overwhelmingly large in the first place.

    It is highly likely that the chance of replacement is in fact different for every word or feature ("each word has its own history", among hundreds of other sources).
    This global assumption has been modified and downgraded to single words even in single languages in many newer attempts.

    A serious argument is that language change arises from socio-historical events which are of course unforeseeable and, therefore, uncomputable.

    P.S. At the end of the day we are not even able to calculate our DNA molecular past with 100% confidence, and scientists change our divergence date from the great apes every year or two!
    Now if the molecular clock is not 100% reliable even when we know how it works, how reliable is it to apply a "molecular clock" to languages, when beyond a certain point we can't go back in time in terms of vocabulary, grammar, phonology, intervention, etc.?

  2. Antigonos, that was a lovely - if predictable - rant.

    If you would read the paper, you'd see that we absolutely did NOT do any form of lexicostatistics or glottochronology. We go into much more detail in the supplementary material. In fact, we deliberately used a method that was designed to counter the problems with assuming a molecular clock (or a simple decay rate as glottochronology does). We also discuss this on our website here.

    Simon Greenhill

  3. "Antigonos, that was a lovely - if predictable - rant."

    If your typical anglosaxonic arrogance could allow you to UNDERSTAND and not just read my post, you would have understood that my thesis is that putting the proto-Indoeuropean language to 7000 BC based on...molecular calculations which HAVE NOTHING TO DO with linguistic phenomena is phony!

    The current post claims that the Anatolian proto-I.E. hypothesis is correct based on similar molecular calculations!!!
    That is what I criticized!

  4. "If your typical anglosaxonic arrogance"

    Antigonos, don't act like a boor. If you have a rational, coherent argument to make, please make it, but don't waste web space with the 1000th iteration of your usual rant embellished with Wikipedia snippets without attribution.

  5. Anglosaxon arrogance!? There's no need to get so touchy! In the paper (OR the one on PIE - discussed in more detail in this paper) there were absolutely NO molecular calibrations used. In both papers historical and linguistic calibrations were used. In fact - in the Austronesian paper we spend a fair bit of time criticising molecular dating methods.


  6. Simon, besides IE and Austronesian, has your group looked at any other language families? It would be neat to try the Bayesian methodology on as many language families as there are appropriate datasets for.

  7. Dienekes, Russell Gray and Claire Holden had a paper a few years ago exploring Bantu, but unfortunately there just aren't that many big language databases out there. We're working on a few other groups at the moment with some collaborators but these are all quite a way off.


  8. Dienekes,

    The...coherent argument that Greenhill made was that I did a predictable rant?
    That's his...scientific thesis?
    When someone who does not know me, and who has not countered the data I provided, is ironic and dismissive, then I return that back!

    Now it is not me who uses snippets without attribution, since as you see the people who did the surveys I referred to are all named, and I provide dates and places too!
    So I can't get your point!

    Now since you speak about ranting it is you who CONTINUOUSLY present phony and non valid theories in your agony to prove that your Mediterranean Anatolian stock was the Proto-I.E.
    On the one hand you discredit Gimbutas and her work in your comments and on the other hand you use Gimbutas to show for example that Northern Europe was not inhabited by Kurgan people and that the Corded element there was not from the Steppes!!
    Or you use Gimbutas to emphasize that the ancient Classical Greece, the Byzantine Empire and "Old Europe" were based in the same roots.
    Do you accept "Old Europe"?
    Make up your mind friend!
    Do you consider Gimbutas's reconstructions imaginary as you have said to me in the past, yes or no?

    Like it or not, the Kurgan theory is the sole rational P.I.E. origin theory.
    No, I am sorry, because a Mediterranean like you would have liked his stock to be the proto-I.E. and feels bad about it not being so, but we can't change facts to suit your tastes!

    I have proven to you MANY times and with numerous posts my coherence and my knowledge on the I.E. matter!
    We can go all over it again if you want!!!
    It is you for example who presented, as evidence of horse usage in Anatolia, finds of horses in a site WITHOUT THE ARCHEOLOGISTS WHO FOUND THEM BEING SURE HOW TO INTERPRET THEM!!!
    May I suggest, using your rationale, that ostriches lived in Greece in the Bronze Age since we found ostrich eggs in Argolid sites?
    Don't make me laugh!
    I could go on and on about your clarity of perception and your unbiased judgment on the matter but it won't do any good!
    Besides it is crystal clear that your ethnocentric beliefs guide your steps throughout your research!

    For example you accuse White Nationalists, in the person of Kemp, of being naive and unscientific (although White Nationalism does not accept Kemp, only some Christian/Racist factions like the KKK, the Aryan Nations, etc., and you will never see Kemp being invited by the World Union of National Socialists or the European National Front for example), but you never criticized Poulianos and Greek Orthodox Nationalists (like the followers of the extreme right political party LA.O.S.) for their PSYCHOTIC AND HILARIOUS THESES about anthropology and history-archeology!!!

    You have in your Greek site several "rhetorical rants" against Arthur Kemp, against Karl Earlson, etc. and you do well in doing so but it would have been fair to have also rants AGAINST POULIANOS, AGAINST SERGI, AGAINST BOEV, AGAINST NICOLUCCI, AGAINST DIMOPOULOS, etc. if you were an unbiased and right researcher!
    Because as Nordicists proclaim a lot of crap so do "Mediterraneanists"!!

    Stupidities like that the Cro-magnons were pro-anthropic (Dimopoulos), that Brunn and Predmost types are just an archaic form of Mediterraneans (Nicolucci, Boev), that Mediterraneans descend from Negroid-like folks of Eastern Africa (Sergi, the father of Afrocentrism), that Mediterraneans of the Neolithic wave were indigenous Greeks and they spoke...Greek back in 7000 BC (Dimopoulos), that modern humans originate from the Petralona Homo Heidelbergensis man (Poulianos), and many more STUPID AND COMPLEXIVE theories that are far worse and more unscientific than Nordicism, and a very big part of the Greek population EMBRACES THESE THESES!!!

    So don't talk to me about rationality and coherence!
    My articles about anthropology, history, archeology, etc. so far can prove who I am!
    Look yourself in the mirror first and then think how smart it is for someone to speak about "rational coherent arguments" when he believes that after he dies, he'll be resurrected by his God!!!

  9. Since you have such a low opinion of me, feel free to get out of my blog. Further rants will be deleted.

  10. This comment has been removed by a blog administrator.

  11. Calm down there mate. You are the one who started with the insults ("Anglosaxon arrogance").

    Now, I absolutely did understand your argument. You rehashed the standard old complaints about glottochronology. These are all valid - for glottochronology. Neither the Gray and Atkinson paper on Indo-European nor the Gray et al. paper on Austronesian used ANY FORM of glottochronology. You misunderstood that, so I pointed you in the direction of some publications where we explained this in more detail.

    You may disagree with the ages we obtained - and that's fine, but you can't attack the methodology with tired old complaints about 1950s-era glottochronology.


  12. This comment has been removed by a blog administrator.

  13. "Dienekes, Russell Gray and Claire Holden had a paper a few years ago exploring Bantu, but unfortunately there just aren't that many big language databases out there. We're working on a few other groups at the moment with some collaborators but these are all quite a way off."

    Thanks for the tip. I have to say that your group is doing some cutting-edge work. Big fields like linguistics tend to develop scientific orthodoxies that are hard to shake, so it's always welcome to see a fresh view with a new methodology, especially a hard quantitative one with impeccable theoretical underpinnings being used to tackle these problems.

  14. Thanks! :) We've been arguing for a while now that these methods have a lot to offer fields like linguistics/anthropology, and hope that this shows through. Unfortunately there seems to be a lot of knee-jerk rejections of these approaches rather than detailed discussion of them. However, time will tell, eh?


  15. "the language trees place the Austronesian origin in Taiwan approximately 5230 years ago and reveal a series of settlement pauses and expansion pulses linked to technological and social innovations".

    I've been trying to get that through to several people. Perhaps they might now accept it.

  16. Looks like Dienekes has thwarted freedom of speech. Instead of telling off Antigonos, how about rebutting him?

  17. Rebut what? The unattributed Wikipedia article?

    Anyone who wants to exercise their "freedom of speech" in this manner, may start their own blog or website and speak to their heart's content.

  18. Dienekes, what does it matter where I found these studies?
    Is it different if I found them in a library, magazine, TV channel, or Wikipedia?
    Since I cited the studies, anyone can go to these studies and read them!
    Why is it bad if I found them in Wikipedia?
    I admit that I was overreacting to Simon and I want to apologize for that, but his answer to me was ironic and I felt offended.

  19. For anyone interested here's my take on the subject:


    You'll notice the close similarity between the map presented here and the southern end of my map 6.

    I suggest that the first stage of the migration, pulse 1, was mostly through uninhabited islands, including many that had become uninhabited with rising sea level.

    Pulse 2 was a product of an improved sail and pulse 3, actually earlier than pulse 2, brought some members of Austronesian back to Taiwan from the Philippines. I deal with pulse 2 in an earlier essay called "Polynesian Origins".

    The reason for these particular essays is that I believe that we can learn a great deal about earlier aspects of human evolution through examining the Austronesian expansion.

