May 05, 2011

Dating the origin of Japanese languages with Bayesian phylogenetics

One more success story in the application of Bayesian phylogenetics to language studies. As Nicholas Wade reports:
Researchers studying the various dialects of Japanese have concluded that all are descended from a founding language taken to the Japanese islands about 2,200 years ago. The finding sheds new light on the origin of the Japanese people, suggesting that their language is descended from that of the rice-growing farmers who arrived in Japan from the Korean Peninsula, and not from the hunter-gatherers who first inhabited the islands some 30,000 years ago.

I think it's absolutely fascinating how closely the authors' date for Japonic languages corresponds to the Yayoi period. The Gray & Atkinson way of doing language age estimation was initially met with derision by the linguistic establishment: part of it was that they did not understand it, part of it was that it was introduced with a very controversial topic (Indo-European), and part of it was that it triggered a deep-seated skepticism against the application of biologically-inspired methods to the study of culture.

Nonetheless, the method has kept producing reasonable results wherever it has been applied, and it has now been adopted by many quantitatively inclined researchers.

From the paper:
Fortunately, recent progresses in phylogenetic methods and their application in studying languages were found to provide adequate solutions for these problems [6]. Accumulating empirical evidence suggests that languages have, astonishingly, gene-like properties in numerous aspects and they also evolve by a process of descent with modification (for review, see [7]). This implies that once the shared innovations among languages are revealed by converting linguistic signals (i.e. presence or absence of homologous words) into discrete binary characters, various stochastic phylogenetic techniques for modelling biological evolution can be used to adequately reconstruct the history of language evolution. During the last decade, therefore, these techniques were quickly adopted to critically examine, and subsequently corroborate, instances of farming/language co-dispersal for Bantu [8], Indo-European [9] and Austronesian speakers [10].
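
To make the coding step in that passage concrete, here is a minimal sketch of how "presence or absence of homologous words" can be turned into binary characters. Everything in it is an assumption for illustration: the language names, the meanings, and the cognate-class labels are invented, and the function `to_binary_matrix` is hypothetical, not the authors' pipeline.

```python
from typing import Dict, List, Tuple

# Toy data only: languages, meanings and cognate-class labels are invented.
# cognates[language][meaning] = label of the cognate class the word belongs to
cognates: Dict[str, Dict[str, str]] = {
    "lang_A": {"water": "w1", "fire": "f1", "hand": "h1"},
    "lang_B": {"water": "w1", "fire": "f2", "hand": "h1"},
    "lang_C": {"water": "w2", "fire": "f2", "hand": "h1"},
}


def to_binary_matrix(
    data: Dict[str, Dict[str, str]],
) -> Tuple[List[Tuple[str, str]], Dict[str, List[int]]]:
    """Return (characters, matrix) with one column per (meaning, cognate class)."""
    # Every (meaning, cognate class) pair seen in the data becomes one character.
    characters = sorted({(m, c) for words in data.values() for m, c in words.items()})
    # A language scores 1 if its word for that meaning is in that class, else 0.
    matrix = {
        lang: [1 if words.get(m) == c else 0 for m, c in characters]
        for lang, words in data.items()
    }
    return characters, matrix


characters, matrix = to_binary_matrix(cognates)
for lang, row in matrix.items():
    print(lang, "".join(str(bit) for bit in row))
```

Each row is the kind of 0/1 string that a stochastic model of character gain and loss can then work on. Note that this sketch codes a missing word as 0; a real analysis would likely want to mark it as missing data instead.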

What I find fascinating is the widely different manifestations of the farming/language dispersal phenomenon: the earliest attested one is the expansion of Indo-European languages from Asia Minor ~9,000 years ago, and the latest one is the expansion of Japonic languages from mainland Asia ~2,400 years ago. Bantu, Austronesian, and Semitic languages fill the void between these two dates. The law-like regularity with which farmers fill lands, transform the landscape, grow in numbers, and start diverging linguistically as they do so is a rare instance of mathematical regularity manifesting itself in the recent history of our species.

But, lest we get too carried away by admiration for the farming phenomenon, let's tip our sugegasa to the Jomon hunter-gatherers of Japan, who were the partial ancestors of the modern Japanese people, and whose genetic legacy is best preserved among the Ainu. Again, from the paper:
If our results are correct, one surprising aspect of prehistoric Japan becomes apparent; the hunter–gatherer population, which settled in Japan around 12 000–30 000 YBP, managed to fend off the farmers for thousands of years until being abolished suddenly and dramatically with the arrival of proto-Japonic-speaking farmers around 2400 YBP. To place this in perspective, it should be noted that the hunter–gatherer societies and their languages in Europe began to be abolished by those of the farmers as early as 8500 YBP [9]. Even some of Japan's closest neighbours such as China had started agriculture since 9000 YBP [1], which progressively brought about fully fledged kingdoms equipped with metal tools fighting each other for political unification. During all this transition outside, the hunter–gatherers of Japan continued to prosper by using simple stone tools and without adopting full-scale agriculture, despite knowledge of cultivation of many crops [12]. There are probably two reasons that explain their unusually long survival. First, the population size of the hunter–gatherers may have been too large to be invaded by nearby farmers. The hunter–gatherer of Japan was perhaps one of the most affluent hunter–gatherers known to humankind, endowed with a large range of plants, animals and sea foods [46]. This vast availability of food resources is probably related to the fact that the world's oldest known pottery was made by the hunter–gatherers of Japan [47]. The development of pottery meant that unlike other hunter–gatherers around the world, they had a means to cook and store the foods that were available abundantly in their environment, and such could have triggered a population explosion to the extent that it prevented the farmers asserting any force over the hunter–gatherers for a long time. 
The second reason behind their long survival could be that it probably took a few thousand years for the farmers to modify rice, one of their main food sources, to grow in cold climate [48]. The archaeological evidence suggest it was not until around 3500 YBP that rice farming of warm southern China spread to the much colder Korean Peninsular [49], which is thought to be the most recent homeland of proto-Japonic-speaking farmers. A combination of these two factors might have contributed to the unusually long occupation of the hunter–gatherers in Japan.

Proceedings of the Royal Society B doi: 10.1098/rspb.2011.0518

Bayesian phylogenetic analysis supports an agricultural origin of Japonic languages

Sean Lee and Toshikazu Hasegawa

Languages, like genes, evolve by a process of descent with modification. This striking similarity between biological and linguistic evolution allows us to apply phylogenetic methods to explore how languages, as well as the people who speak them, are related to one another through evolutionary history. Language phylogenies constructed with lexical data have so far revealed population expansions of Austronesian, Indo-European and Bantu speakers. However, how robustly a phylogenetic approach can chart the history of language evolution and what language phylogenies reveal about human prehistory must be investigated more thoroughly on a global scale. Here we report a phylogeny of 59 Japonic languages and dialects. We used this phylogeny to estimate time depth of its root and compared it with the time suggested by an agricultural expansion scenario for Japanese origin. In agreement with the scenario, our results indicate that Japonic languages descended from a common ancestor approximately 2182 years ago. Together with archaeological and biological evidence, our results suggest that the first farmers of Japan had a profound impact on the origins of both people and languages. On a broader level, our results are consistent with a theory that agricultural expansion is the principal factor for shaping global linguistic diversity.



  1. I work a lot on reconstructing last common linguistic ancestors from dialect sets and it is quite tricky. It is easy to assign derived traits to the ancestor and to end up with chronologies that are too short. So I would be cautious in accepting the dating here.

  2. "Researchers studying the various dialects of Japanese have concluded that all are descended from a founding language taken to the Japanese islands about 2,200 years ago".

    Hasn't that been generally accepted for the last 40-50 years?

  3. Dienekes,
    Thanks for the post, very interesting!

    I wonder if the breakdown of the Japanese into East Asian and Northeast Asian components in your earlier Dodecad analysis is representative of the Yayoi and Jomon ancestry respectively. The Northeast Asian component is much smaller, ranging from 6 to 15% in the sample. It also seems to decrease from the northeast to the southwest of Honshu, which is probably consistent with the Yayoi settlement path. But obviously the Dodecad Japanese sample is too small to draw any strong conclusions.

    I hope you are planning to do another analysis of East-Asian and Siberian populations, as you did last November, but now with the HapMap-3 Japanese sample and the Dodecad members. Hopefully that can shed some light on the origin of the Jomon. Which Siberian populations would the Japanese cluster with in terms of their Northeast Asian component (Koryak, Altaic, Central Siberian...)?

  4. This comment has been removed by the author.

  5. @maho

    You may be right about the patterns you see and the conclusions you come up with, but even I have noticed how the components change from run to run, so don't let the consistent name (such as "Northeast Asian") fool you.

    Also, in two of Dienekes' recent West Eurasian structure runs, one here and the other at Dodecad, the British and the Irish had different relative amounts of NE European. (Actually, I can't even remember if they were at different K levels... which could defeat what I'm saying.) In one run the British had more NE and in the other run they had the same.

    I won't expand since I don't know enough, but that's just what I noticed.

    I also would love to have the answers you're seeking. I also hope they will expand this study to look at the Japanese language in the context of NE and E Asian languages, especially Korean.

  6. Not too impressive. The issue present in the hard cases for a Bayesian model, including Indo-European, is the question of the extent to which there is high-speed language change in periods of language contact with unrelated languages or language differentiation. Some of the studies show that this is significant, producing shorter time lines.

    This study, by looking only at language change after that critical interface/differentiation period, is only looking at the easy "genetic drift" period of the Japanese language. Also, the calibration points within Japanese language evolution are many, because it was attested in writing very early on, and because the intermediate points took place in the historic era.

  7. "I wonder if the breakdown of the Japanese into East Asian and Northeast Asian components in your earlier Dodecad analysis is representative of the Yayoi and Jomon ancestry respectively."

    I doubt it, although the Northeast Asian component in both Japanese and Koreans does seem to distinguish them from the Chinese and to link them with their distant linguistic Altaic relatives.

  8. "Which Siberian populations would the Japanese cluster with in terms of their Northeast Asian component (Koryak, Altaic, Central Siberian...)?"

    I would guess that the 'Jomon' are a combination of all the immigrants before the Yayoi. This is the normal pattern for immigrants. They are separate for a while but eventually incoming populations are absorbed, in some cases largely replacing the earlier inhabitants, in others being absorbed by them. Even the Yayoi are now mixed with the Jomon. My guess is that the Yayoi are associated with the arrival of O3 Y-haps.

  9. Japanese Phonetics?

    I always felt Japanese phonetics in general are more similar to those of E (M96) populations than to those of O (M175) populations. Is this related to the common ancestry at DE? This is of course assuming that the hunter-gatherers survived in enough numbers to influence the modern Japanese, at least on the autosomal level.

  10. I simply don't believe that a mathematical model can tell us anything about historical linguistics: too many x's and y's are unknowable, and many will be left out entirely.

    This research is a dead end.


