June 27, 2009

A colorful view of the potency of skulls and the reality of race

(updated June 30)

I used distruct to create a graphical display of the clusters revealed by my 2004 model-based clustering of 2,504 human skulls on 57 metrical variates. As mentioned in the original article, these traits are enough to distinguish some human races (e.g., Caucasoids=Norse, Zalavar, Berg, Egyptian), or even individual populations (e.g., Eskimo, Buriat, or Bushmen).

Of course, some skulls don't fall in the right cluster, but this is to be expected both due to the state of the original collections (*) and due to the plasticity of the human skull that may create false associations.

But, on the whole, the clusters emerge as distinct and unmistakable entities; this level of resolution at a global scale is only possible, if at all, with hundreds of thousands of genetic markers, yet 57 linear measurements pretty much do the same trick.

One can only imagine what would be possible if someone took a 3D scanner around the world to visit the same museum collections that Howells did several decades ago. But, perhaps, physical anthropologists have better things to do these days than discovering differences between human populations...

(*) For example, W.W. Howells noted in his work that one of the American skulls obviously belonged to a white settler.

UPDATE (June 30)

Please consult the original article for details on the populations and the methodology used. Note that K=14 is the number of clusters which maximizes the Bayes Information Criterion, but it is by no means the end of the story. For even higher K, some populations can be further separated, although some of them (e.g., Europeans) never split into fairly "clean" clusters with these 57 variables.
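To make the role of BIC concrete, here is a minimal sketch in R. It assumes the sign convention used by mclust, where BIC = 2 * logLik - m * log(n) and larger values are better, and it assumes an equal-covariance ("EEE") Gaussian mixture, the model type named in the comments below. The function names `eee_npar` and `bic_mclust` are made up for illustration; the log-likelihood itself would have to come from an actual model fit.

```r
# Free parameters of a G-component Gaussian mixture in d dimensions with a
# single covariance matrix shared by all components (the "EEE" model).
eee_npar <- function(G, d) {
  (G - 1) +          # mixing proportions (they sum to 1)
    G * d +          # one mean vector per component
    d * (d + 1) / 2  # one shared symmetric covariance matrix
}

# BIC in the mclust sign convention: larger values indicate a better model.
bic_mclust <- function(loglik, npar, n) {
  2 * loglik - npar * log(n)
}

d <- 57    # metrical variables, as in the post
n <- 2504  # skulls, as in the post

# Penalty term paid for each K. Adding a cluster only raises BIC if the gain
# in 2 * log-likelihood exceeds the increase in this penalty.
penalty <- sapply(2:14, function(G) eee_npar(G, d) * log(n))
names(penalty) <- paste0("K=", 2:14)
print(round(penalty))
```

Under these assumptions each additional cluster costs d + 1 = 58 extra parameters, so a new cluster has to improve the fit by a fixed amount before BIC prefers the larger K.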

Below are all the results for K=2 to K=14. As with Rosenberg et al. (2002) and the work that followed it, the first split contrasts East Asians with Eurafricans. It is important to note that the pattern of successive splits should not be interpreted as a phylogeny of human populations, i.e., a history of human subdivisions.



39 comments:

Kepler said...

Dienekes,

What do you think these measurements would say about people like me? I am Venezuelan, but I think my case illustrates that of people in many other countries than the ones in America. I have a J2 haplotype from my dad, my mitochondrial DNA is sub-Saharan and yet my mom looked like a Northern European. I know several branches of my ancestors in the XIX century were in Spain. Males can only find out about two haplogroups, so for the moment I only have the concrete data about my two haplogroups. Now according to what I read from research carried out in Venezuela, over 90% of male haplogroups there are 'European' and the rest African or Native American, whereas over 55% of the mtDNA are Native American and the rest European or African.
So it is likely some paternal aunt or someone on a branch higher up had a native American mtDNA profile.

I haven't carried out all those measurements; perhaps I will out of curiosity (I think it is tricky to take some measurements oneself, I will ask someone).
People look at me in Europe and they think I am 'from somewhere here', which I am not.

I even see the case in native Spaniards or Italians, who, as far as I remember, have a couple of percentage points of sub-Saharan mtDNA. Every single country has centres of clusters. There are quite some people who would fall into the "mestizo" type, obviously.
What percentage would that be? I don't think so much lower than in countries seen as "mestizo countries", only that the "races".

I see in the list you put "Peru". Is that a race?
At what stage do you cut the difference?
I don't know much but when I see the few "pure" native Venezuelans (less than 2% of the population classifying themselves to a given Indian group), they are way different from an Indian from Peru or Bolivia: nose, colour, even eyes and height. Is there a Pemon race? Or perhaps Carib versus Arawak race versus Chibcha?

I think even those people who say they don't accept the existence of "race" or something like that do acknowledge there are measurable differences, but those differences always depend on the perspective you choose.
I have no problem accepting there are races, but the thing is what is formally a "race"? It really seems as fluffy a term as language. Obviously there are languages, but: is Swiss German a language? Some say yes, some no. What are the criteria?
This is the thing. It seems criteria are always different. There are obvious clusters in genetics, but the way people build clusters will be incredibly different.
If I took someone who agreed with you in methodology, put the two of you apart and told you both to work on the same data and produce the list of races, I am sure you would always come up with different lists, even if some parts were similar.

Dienekes said...

"but those differences always depend on the perspective you choose."

But, I haven't chosen a particular perspective, I just fed 57 measurements from 2,504 skulls into a clustering program.

The program could very well have produced a "mixed" picture, but it didn't; you have different populations mostly as solid blocks of color (cluster membership), showing that you can easily tell apart, e.g., a Buriat Mongol from an Eskimo, or an Egyptian from a Zulu.


"I have no problem accepting there are races, but the thing is what is formally a "race"?"


There are plenty of workable definitions of race, and I've offered some myself, but at the bottom of it, a race is a group of people that can be distinguished from other such groups by objective criteria (e.g., skull measurements or gene variants).

This ability to make distinctions exists because, owing to either geographical or cultural distance, humans don't intermarry with random other humans, but with those geographically and culturally closest to them.

In short, I believe that the clusters that emerge spontaneously time and again when people are grouped according to their genetic or phenotypic similarity deserve a name, and "race" is a perfect name for them.

The only valid objection I've seen regarding the concept of "race", is that people have misused it, speaking of a "black race" according to the one-drop rule, or thinking that Hispanics of various degrees of European/Amerindian/African admixture constitute a race and so forth.

But, that ought not lead us to abandon the useful concept of race, only to distinguish biological race from some problematic "social" races.

Andrés said...

This is not a rhetorical question: can you state briefly why the concept of race as you see it is useful?
And then: how is the attitude of those denying that concept hindering research or the like?
Thanks

Onur Dincer said...
This comment has been removed by the author.
Google Man said...

I am already so tired by these moronic babbles from multicultural class that I have not enough strength to reply.

Google Man said...

Perhaps the universe isn't old enough and planets still haven't come into being. Who knows. Mr. Lewontin should investigate it.
LOL

Onur Dincer said...
This comment has been removed by the author.
terryt said...

"Human groups haven't been so much separated from each other to form distinct subspecies or races".

I'm not so sure. Comparison with other species could be very useful. I've just got a beautiful book out of the library called "Wildfowl of the World". Obviously I went straight to the page dealing with our native grey duck and what do I find? First off:

"Three subspecies: New Zealand Black (or Grey) Duck ... Australian Black Duck ... Pelew Island Black (or Grey) Duck ..."

So they display some degree of geographic variation, exactly as we find with humans. But:

"The taxonomy of this species and Spot-billed Duck ... is not agreed by all authorities, some of whom lump them into one species with six subspecies. Recent ringing recoveries have shown movement of Australian Black Ducks to New Zealand, which casts doubt on the separation of those races". So there's a classification problem. But the difficulty goes even further:

"Clearly, both Pacific Black duck and Spot-billed Duck are also closely related to the Philippine Duck ..."

So there you have it. In the example of these ducks too "Most of the time we can only talk about clines from global to local levels and some structuring only at the very very local levels".

What's the difference between a 'subspecies' and a 'race'?

Onur Dincer said...
This comment has been removed by the author.
terryt said...

So you now agree that humans do form distinct subspecies or races.

Onur Dincer said...
This comment has been removed by the author.
terryt said...

Well. You readily accept that ducks (and many other species) have diversified into races or subspecies as they've spread around about, so why on earth would humans be any different? Even experts have trouble distinguishing between the three subspecies of grey duck yet none of us would have trouble distinguishing a person of West African origin from one of East Asian origin. Why the different perspective? Perhaps you believe humans were specially created and so are independent of the biological rules operating on all other species.

Onur Dincer said...
This comment has been removed by the author.
Maju said...

Dienekes wrote:

"The program could very well have produced a "mixed" picture, but it didn't; you have different populations mostly as solid blocks of color (cluster membership), showing that you can easily tell apart, e.g., a Buriat Mongol from an Eskimo..."

So does this study of yours destroy the idea of a Mongoloid race specifically? Because I was always pushed to think that Mongols and Eskimos (and Chinese and others) were all the same "Mongoloid" cluster. But this graph of yours shows that while other "races" appear homogeneous (Dogon and Zulu, Egyptian and Norse, etc.), "Mongoloids" are very diverse in fact.

This may have been induced by sample size? IDK. But I'd really like to compare with lower depth K means, as this one has something like 13 clusters, while the number of major races considered traditionally is much lower (4 to 8 or so).

If we followed this graph we could get the Caucasoid, Negroid, Capoid and Australoid "races" more or less clear but then we'd also have stuff like Sinoid, Eskimoid, "true" Mongoloid (Buriats), Polynesoid, Peruoid, Santacruzoid and others we have never heard of before.

So my question is: did you get the classical races, in particular the Mongoloid one, at any depth of your study... or did you select this high depth graph because it supported your preconceptions about clustered anthropometrical differences (regardless of their being very much non-classical), discarding the rest, which may have contradicted such conclusions?

It'd be interesting to see the K=4 or K=6 graphs. If traditional racial anthropometry is correct, they should show the classical races quite clear at that stage already.

argiedude said...

Impressive geographically dispersed clusters:
Tan: from north Europe to Egypt
Violet: from Kenya to West Africa to South Africa
Light blue: from northcentral USA to Peru
Yellow: from Japan to China to Philippines

Dienekes, I hope you're not going to leave this at this. This was an extremely interesting post and you have to make more of it. [see what I wrote at the bottom of this post]

..................

onur said:
"Human groups haven't been so much separated from each other to form distinct subspecies or races."

Humans have one of the greatest genetic differentiations of all large animal species. I think only wolves have a greater FST difference than humans. Also, your idea that humans haven't been separated long enough to form into different subspecies... I'll bet you don't have the slightest clue how long it might take for a species of large animal to differentiate into subspecies. The coyote used to only exist in southwest USA 100 years ago. In just 50 years of rapid diffusion across North America due to the extirpation of the wolf, it has already changed remarkably into a wolf-like animal, with much greater size (still smaller than a wolf), changes in limb and snout proportions (less fox-like, more wolf-like), amount of hair and its coloring (also less shaggy and more woolly, appropriate for the cold climate), even its diet, as some have started to band together and resorted to hunting white tail deer. It's become a very different animal from the skinny, solitary coyote back in Mexico, and this happened in just 50 years.


terryt said:
"You readily accept that ducks (and many other species) have diversified into races or subspecies as they've spread around about, so why on earth would humans be any different? Even experts have trouble distinguishing between the three subspecies of grey duck yet none of us would have trouble distinguishing a person of West African origin from one of East Asian origin. Why the different perspective? Perhaps you believe humans were specially created and so are independent of the biological rules operating on all other species."

That was an excellent response.


maju said:
"I'd really like to compare with lower depth K means, as this one has something like 13 clusters"

Me too. Dienekes, can you do this analysis again and show us the results for K = 3 to 10?

argiedude said...

And tell us better the geographic origin of these samples.

Onur Dincer said...
This comment has been removed by the author.
Dienekes said...

See updated entry.

argiedude said...

http://img15.imageshack.us/img15/6674/mapofdienekesanalysisof.jpg

Maju said...

Hey, thanks a lot, Dienekes.

I'd agree with Argiedude that the K=6 level is the one that best represents the classical races.

Even if Polynesians were never considered a race in their own right, they consistently cluster separately through the whole process, from K=3 to K=12. That cannot be an effect of sample bias because there are not so many Polynesians after all.

What might be a distortion caused by sample bias is the huge proportion of classically Mongoloid types, who make up more than half of the whole sample. Notably, they only cluster together at K=3 (to the exclusion of Polynesians). But I guess it may be interesting to study this group in some detail.

A major distinction between non-Polynesian Mongoloids could possibly be made between "Siberoid" (Buryat and Eskimo) and "Sinoid" (all the rest, including Filipinos). The Siberoids cluster variously with Sinoids or Polynesians. The third cluster would be of course Amerinds.

It is also interesting that Ainus and Andamanese do not appear as distinct clusters until the latest stages. Instead Santa Cruz (what ethnic group are these?) do cluster separately from other Amerinds early after the Amerind cluster shows up.

...

I'd say that the sooner a type appears defined in the structure, the less conservative it can be said to be (relative to the traits presumably carried by our common African ancestors).

In this sense the Polynesian-Siberoid group appears as the most innovative of all, followed closely by Mongoloid/Sinoid and then by Caucasoids.

Instead Australoids seem rather conservative, as they show up only after Capoids (an African group) do.

eurologist said...

Good insight, there.

Do you think the Polynesians are easily distinguished because of their isolation and small numbers (drift)?

The Andamanese are very interesting. They seem to be conservative enough not to know exactly where they belong (need K=14 to be separate from the Bushmen!), but interestingly first align with East Asians. Which could mean that some East Asian features are very old, or it could mean more admixture than often thought present. And, they never group with either Africa or Australia.

Maju said...

"Do you think the Polynesians are easily distinguished because of their isolation and small numbers (drift)?"

Me? Honestly no idea. It may be a founder effect of some sort but certainly a marked one.

Their "relatives" (in a metrical sense) in Asia appear to be Siberoids, depending on the level. It might be that what we call "Mongoloids" nowadays are actually two different groups and that while Siberoids have converged better with Sinoids, Polynesians keep that phenotype better. It might be somewhat associated to the Y-DNA C/NO distinction - very tentatively only.

"The Andamanese are very interesting. They seem to be conservative enough not to know exactly where they belong (need K=14 to be separate from the Bushmen!), but interestingly first align with East Asians. Which could mean that some East Asian features are very old, or it could mean more admixture than often thought present. And, they never group with either Africa or Australia."

I did not realize that they clustered again with Bushmen until your mention. But you're right. Bushmen also cluster early on (low levels) with Mongoloids and Andamanese share Y-DNA ancestry with some East Asians (D).

I was mentioning elsewhere that there could be an Afro-Australoid basic type, in the sense that these may have kept better ancient traits. Now I wonder if, parallel to this one, there might also be a Capoid-Mongoloid second basic type in that same sense, where Andamanese would belong. It would not be the first time that Capoids are mentioned in relation to the East Asian phenotype.

Maybe early Eurasians brought both African prototypes and soon diverged as they colonized East Asia, which was probably the first area colonized after South Asia. This is of course very speculative and I don't think anybody can reach solid conclusions based only on this, but it is anyhow suggestive.

Haven't mentioned it before but I'm really missing South Asian samples. East Africans would not be in excess either.

eurologist said...

It is certainly striking that Andamanese never group with either Australians or Polynesians, which naive thinking would suggest.

Even at K=11 they appear as a melange of almost everything. My thinking is that they originally were much more plentiful in SE Asia, and only got marginalized more recently. Otherwise you would expect the evolution of some distinct features. In pictures, they certainly look to me like the outcome, here: very mild, average Africans with just a hint of Indian and very slight, average-looking Indonesian/Thai admixture - nothing characteristic, really.

As to the Khoi-San/ Capoids, it is reasonable to assume that they too only got marginalized with the fairly recent advent of agriculture and the Bantu expansion. They would have been much more plentiful in much of East Africa, and I too find it plausible that initial migration(s) out of Africa could have easily harbored both types.

That brings up an interesting point for those looking for ancient signatures/intermixing in autosomal DNA. Maybe it's African, after all - just quite old African, from the South.

I also agree that the wide lack of SE Asian samples and even more European samples makes the interpretation difficult.

terryt said...

"What really matters is that there is genetic evidence that there is no or almost no gene flow between groups of a species because of some isolation for a considerable amount of time".

As Argiedude pointed out it doesn't necessarily have to be 'for a considerable amount of time'.

"if the variation within a particular species is clinal, which is the case with humans".

As it is in the case of many other species. It's often extremely difficult to decide where exactly one subspecies takes over from another. The main reason why humans more often demonstrate a clinal pattern is that technology has allowed humans to more readily cross geographic barriers that serve to isolate various populations of other species. The process of subspecies formation is exactly the same. Isolation, followed by drift and selection.

"Instead Santa Cruz (what ethnic group are these?) do cluster separately from other Amerinds early after the Amerind cluster shows up".

Possibly the Santa Cruz islands in Northern Vanuatu: Melanesian but speaking Austronesian languages. They are almost certainly, then, the product of admixture between Australoid (at least Papuan) and Sinoid (including Filipinos) people. It's generally accepted in this part of the world that Polynesians are a product of the same mixture.

"Do you think the Polynesians are easily distinguished because of their isolation and small numbers (drift)?"

Almost certainly so.

"Which could mean that some East Asian features are very old, or it could mean more admixture than often thought present".

Both probably.

"I was mentioning elsewhere that there could be an Afro-Australoid basic type, in the sense that these may have kept better ancient traits".

But the two groups are actually very different. About the only characteristic they have in common is dark skin. Apart from that Australoids are as different from Africans as are Sinoids.

"My thinking is that they originally were much more plentiful in SE Asia, and only got marginalized more recently".

Such marginalization is accepted as part of the ancient Polynesians' development, before they emerged from SE Asia. Australoid, or at least Papuan, type people are the older SE Asian inhabitants and Sinoid-type people are more recent arrivals, possibly only in the last six or seven thousand years (if not more recently). The sort of process you accept with the Khoi-San/ Capoids appears to have been quite common during our evolution.

terryt said...

I take it back. The Santa Cruz in the diagram is obviously American, not the Vanuatuan Santa Cruz.

Maju said...

"But the two groups are actually very different. About the only characteristic they have in common is dark skin. Apart from that Australoids are as different from Africans as are Sinoids."

Also to my eye. Agreed, but the fact that they are the last Eurasian cluster to show up as distinct from Negroid Africans may mean something.

They may be different in elements like CI but akin in other elements like prognathism or nose width (these are the ones I can spot faster, Dienekes surely can explain this better).

But to my eye Australoids look somewhat like "archaic Caucasoids", so I would have expected more affinity with this last cluster if anything. Metrics say the opposite, it seems.

I'd also like to know if Santa Cruz means the so-called Santa Cruz Indians of California (Awaswas) or some archaeological site in Santa Cruz (Patagonia, Argentina). Or maybe something else. America is full of places of that name.

Dienekes said...

Santa Cruz is off the coast of California. Again, read the original article for details of the populations.

Unknown said...

Hi Dienekes!

Nice job! I tried a Principal Components Analysis on the population means of the Howells data (z-scores; additionally I excluded incomplete variables).

The result should show similar populations close to each other and, if large discrete differences exist in the data, in distinct clouds in the scatterplot, hence clusters. Well, on the first two principal components there are no clusters, only longish scatters, whose distribution resembles your cluster model at about K=7. Intriguingly, they all "begin" at the origin of the diagram and fan out in different directions up to extremes, which should depict a typological/similarity ordering. Whereas the populations at the outer extremes of those scatters are mostly geographically far away from each other, the populations at their intersection, the diagram origin, are mixed (Norse, Berg, Zalavar, but also Philippines and Sta. Cruz). The third principal component is also very interesting, but I'm tired of typing already... ;)

Well, to sum it up, in general the PCA is a neat addition to the cluster analysis. By the way, it seems to lead to meaningful results only when population aggregates (such as the means) are used; a try with the individual data looked rather random. That again suggests that there is no single linear development/evolution inherent in the data, hence in the populations, but rather either multiple linear developments, or a lot of sexual interaction between the developmental branches, or both.

Regards,
Jörg
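The PCA on population means described in this comment can be sketched roughly as follows in base R. This is only an illustration on synthetic data: the three populations, the four measurement columns `m1` to `m4` and the `NA` coding of missing values are all made up, and whether the z-scores were taken before or after averaging is not stated in the comment, so standardizing first is an assumption.

```r
set.seed(1)

# Synthetic stand-in for the skull data: 3 made-up populations, 4 measurements.
pops  <- rep(c("A", "B", "C"), each = 20)
shift <- c(A = 0, B = 3, C = 6)[pops]
skulls <- data.frame(
  Population = pops,
  m1 = rnorm(60, mean = 100 + shift),
  m2 = rnorm(60, mean = 50 - shift),
  m3 = rnorm(60, mean = 30),
  m4 = rnorm(60, mean = 70 + shift)
)
skulls$m4[5] <- NA  # pretend one variable is incomplete

# Keep only measurement columns without missing values.
vars <- setdiff(names(skulls), "Population")
complete_vars <- vars[colSums(is.na(skulls[, vars])) == 0]

# Convert each variable to z-scores, then average within populations.
z <- scale(skulls[, complete_vars])
pop_means <- aggregate(as.data.frame(z),
                       by = list(Population = skulls$Population),
                       FUN = mean)

# PCA on the population means; rows of `scores` are populations.
pca <- prcomp(pop_means[, complete_vars])
scores <- pca$x
rownames(scores) <- as.character(pop_means$Population)
print(round(scores[, 1:2], 2))
```

Plotting the first two columns of `scores` would give the kind of scatter of population means discussed above; with the real data one row per Howells population would appear instead of the three toy groups.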

Unknown said...

PS: I tried to reproduce your cluster results, using R and the mclust function after excluding incomplete variables. The algorithm also suggested EEE, but a G of 7 or 9 (equally good). Hence, the results do look a bit different, too:


Relative spread (in percent) of the populations over 14 clusters (I went for 14 in the example to compare it to your result, but the optimum of 9 clusters in this model is obvious)

Population   1   2   3   4   5   6   7   8   9  10  11  12  13  14
AINU         1  81  14   1   0   1   1   0   0   0   0   0   0   0
ANDAMAN      0   6   3  83   0   9   0   0   0   0   0   0   0   0
ANYANG       0  95   5   0   0   0   0   0   0   0   0   0   0   0
ARIKARA      7  12  80   0   0   0   0   0   1   0   0   0   0   0
ATAYAL       0  94   0   4   0   2   0   0   0   0   0   0   0   0
AUSTRALI     0   0   4   1  95   0   0   0   0   0   0   0   0   0
BERG         0   0  97   2   0   1   0   0   0   0   0   0   0   0
BURIAT       0   5   3   0   0   1   0   0  91   0   0   0   1   0
BUSHMAN      0   1   1  14   0  83   0   0   0   0   0   0   0   0
DOGON        0   0   0  94   1   4   0   0   0   0   0   1   0   0
EASTER I     0   3   0   0   0   0   1  95   0   0   0   0   0   0
EGYPT        0   0  91   8   0   0   0   0   0   0   1   0   0   0
ESKIMO       0   5   1   0   0   1   0   0  94   0   0   0   0   0
GUAM         0  96   2   0   0   0   2   0   0   0   0   0   0   0
HAINAN       0  93   4   2   0   0   1   0   0   0   0   0   0   0
MOKAPU       2   2   0   0   0   0  88   8   0   0   0   0   0   0
MORIORI     91   1   2   0   0   0   4   3   0   0   0   0   0   0
N JAPAN      0  93   2   1   0   2   0   0   0   0   0   0   0   1
N MAORI     40   0   0   0   0   0  20  40   0   0   0   0   0   0
NORSE        1   0  97   2   0   0   0   0   0   0   0   0   0   0
PERU         0   0  94   2   1   0   0   0   0   4   0   0   0   0
PHILLIPI     0  68   8  18   0   2   4   0   0   0   0   0   0   0
S JAPAN      0  97   2   0   0   1   0   0   0   0   0   0   0   0
S MAORI     70   0   0   0   0   0   0  30   0   0   0   0   0   0
SANTA CR     0   3  94   0   2   1   0   0   0   0   0   0   0   0
TASMANIA     0   1   3   0  92   0   1   2   0   0   0   0   0   0
TEITA        0   1   2  78   1  13   0   1   2   0   0   0   0   0
TOLAI        1   5   2   2  91   0   0   0   0   0   0   0   0   0
ZALAVAR      0   3  94   0   1   0   0   0   2   0   0   0   0   0
ZULU         0   7   3  85   3   2   0   0   0   0   0   0   0   0

(sorry for the ugly format, but HTML-tags are not allowed, for a better view, paste that table into eg Word and choose a fixed width font such as COURIER).
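A table of this kind, the percentage of each population that falls into each cluster, can be produced from one population label and one cluster label per skull. The sketch below uses a tiny invented data set; with a real fit the `cluster` vector would presumably be the classification returned by the clustering software, which is an assumption here.

```r
# Hypothetical inputs: a population label and a cluster label per skull.
population <- c(rep("NORSE", 4), rep("ZULU", 4), rep("AINU", 2))
cluster    <- c(3, 3, 3, 4,       4, 4, 4, 2,      2, 2)

# Cross-tabulate, then convert each row to percentages of that population.
counts <- table(population, cluster)
spread <- round(100 * prop.table(counts, margin = 1))
print(spread)
```

Each row sums to 100 up to rounding, which is why a row in the table above can occasionally add up to 99 or 101.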


Well, the clusters make sense in a way, too; check out, e.g., the splitting of the Maori populations, which suggests a heterogeneous mixture of samples. Otherwise, the algorithm put the Caucasians and the Native Americans together in one cluster.
But how come my results are not the same as yours?!

Regards,
Jörg

Maju said...

Whoa! I just put Jörg's alternative results in clean form, by assigning every population to its majority cluster (only the N. Maori are bad for that), and get:

1 - Moriori, (N. Maori), S. Maori
2 - Ainu, Anyang, Atayal, Guam, Hainan, N. Japan, Philippines, S. Japan
3 - Arikara, Berg, Egypt, Norse, Peru, Sta. Cruz, Zalavar
4 - Andaman, Dogon, Teita, Zulu
5 - Australian Abor., Tasmania, Tolai
6 - Bushmen
7 - Mokapu
8 - Easter Is., (N. Maori)
9 - Buriat, Eskimo

So not only are Mongoloids very spread out here (as before, but even more so), but notably Amerinds cluster with Caucasoids.

Not sure what to think.
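The majority-cluster assignment used in this comment is easy to mechanize. The sketch below copies four rows from Jörg's table (only clusters 1 to 9 are shown, since the remaining columns are all 0 for these rows) and reports every cluster that attains the largest share, so that ties such as N MAORI are not silently resolved. The variable names are illustrative only.

```r
# A few rows copied from the table above (percent of each population per cluster).
spread <- rbind(
  AINU      = c( 1, 81, 14,  1,  0,  1,  1,  0,  0),
  BERG      = c( 0,  0, 97,  2,  0,  1,  0,  0,  0),
  BUSHMAN   = c( 0,  1,  1, 14,  0, 83,  0,  0,  0),
  "N MAORI" = c(40,  0,  0,  0,  0,  0, 20, 40,  0)
)
colnames(spread) <- 1:9

# Largest share per population, and every cluster that attains it (to expose ties).
top_share <- apply(spread, 1, max)
majority  <- lapply(rownames(spread), function(p) {
  as.integer(colnames(spread)[spread[p, ] == top_share[p]])
})
names(majority) <- rownames(spread)
str(majority)
```

A plain `which.max` per row would be shorter, but it returns only the first maximum and would hide the fact that N MAORI is split evenly between clusters 1 and 8.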

Dienekes said...

von.tronje, the last few clusters in your table seem underpopulated (all 0's and 1's). I have no idea what you did differently; if you do exactly what I describe in my article you will be able to replicate my results.

Unknown said...

Well, as mentioned, I've shown the G=14 variant in order to summarize my results. Modelling with the optimum of G=9 doesn't depart significantly.

Dienekes, indeed I tried to reproduce your results, as a mere training exercise for myself in statistics and R, following your very steps with one exception: I excluded the variables RFA, RPA, ROA, BSA, SBA, SLA and TBA beforehand, because they are not available for all individuals. You didn't mention that in your text. Otherwise it seems you've excluded some populations or merged some (you've got n=28, but the database contains 30 populations).

Could you maybe publish the R code you've run and the list of variables and populations you used?

Regards,
Jörg

Dienekes said...

There are 86 columns in the data. If we exclude the 4 that are non-numerical (id, sex, popnum, population), there are 82 left. You have excluded 7, which leaves 75. That is 75-57 = 18 more than I (and Howells 1995) used.

Dienekes said...

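# Download the Howells data and read it in (dget expects a file written by dput)
download.file('http://konig.la.utk.edu/howell.txt', 'howells.txt')
howells <- dget('howells.txt')

# Drop the two Maori samples, leaving 28 populations
X <- howells[!(howells[, "Population"] %in% c("S MAORI", "N MAORI")), ]
# Keep columns 5 to 60 plus column 62: the 57 measurements
Y <- X[, c(5:60, 62)]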

Y contains the 57 variables for the 28 populations.

The other thing that you may have done differently is to use the new version of mclust. I have not used it myself, so I don't know if it would produce a different result. You should use the mclust02 package, which was the version of the software available when I wrote the article.

Unknown said...

Cool, thanks, I'm on it... I obviously overlooked your comment about the number of variables used, sorry.

Gaia's sister said...

I am a bit concerned here with the use of skull morphology. Given how plastic the skull is, I suspect there is a lot of noise in that data set.