Comments on Dienekes’ Anthropology Blog: Refinement of ancestry informative markers in Europeans (Tian et al. 2009)
Blog author: Dienekes. Feed updated 2024-01-04T04:11:55.717+02:00.

[2010-04-20T04:28:20.151+03:00]
I'm surprised at the emphasis on the Southern Italian origins of the Italian-American population in the U.S., which was common knowledge (a good part of that immigration was pre-Italian unification and was from the Kingdom of Sicily).

The study doesn't resolve one of the more pertinent questions about Italian origins, which is whether they are closer to the Western Anatolian population (who spoke a Hittite-derived Indo-European language) or to the population speaking a post-Mycenaean Greek-derived Indo-European language. The PCA chart in a study I've seen (sorry, no cite immediately at hand) including all three puts the Italians closer to the Anatolians than the Greeks, a finding which the linguistic evidence (the Italic-Celtic languages, most notably Latin, are closer to Hittite than Greek) supports. There was an isolated colony or two in Italy with paleo-Balkan linguistic origins, but many with Italic languages in South Italy.
Posted by Andrew Oh-Willeke

[2010-02-24T11:42:34.822+02:00]
Totally uninterested in the Brits and their haplogroups.

I am interested in SNP differences between ethnic groups.

I have tested with 23andMe, and have transferred my data to deCODEme.

This is what I have observed. Jews, Ashkenazim, have minor Asian admixture, as do some Iberians, and Italians.
However, this admixture does not affect their PCA diagram placement. At 23andMe, most Jews are clustered in the European groups, mostly with Tuscan and Bergamo Italians but often with the French. Few end up in the Middle Eastern PCA diagram, but some Southern Europeans do end up in the Middle Eastern groups. At deCODEme, most of the Jews I am friends with and share some data with are clustered with the Italians, but these Jews tend to be more Sephardic Jews. Other Eastern people like Samaritans, Anatolian Turks, Armenians, Assyrians, and Israeli Jews tend to cluster on the Europe PCA further up from the Italian group, heading towards the Adygei.

As the study of Tian et al. stated that they separated Southern Europeans from Ashkenazi Jews, why can't 23andMe or deCODEme do the same?
Posted by Anonymous

[2009-09-13T08:14:14.083+03:00]
My bad. My document of reference on this issue is Capelli 2003: <a href="http://www.ucl.ac.uk/tcga/tcgapdf/capelli-CB-03.pdf" rel="nofollow">A Y chromosome census of the British islands</a>... but I was talking from memory and got the data somewhat wrong.

In fact, the two areas more affected by the Nordic invasions (York and East Anglia) show a Y-DNA intrusion of c. 60%, but most of England is in the 40% range, with some areas well below. Anyhow, this refers only to Y-DNA and does not exclude other previous flows in earlier times.

MtDNA is in fact more akin to mainland NW Europe but this is usually interpreted as meaning that the original population arrived largely from there (via Doggerland in the Epipaleolithic, as is attested archaeologically for NE England and Scotland) and that the difference in Y-DNA signals a greater alien male input in the mainland than in Britain.
In any case it was not a settlement like Australia, but more of a conquest with some settlement, mostly of males. Even the more genetically "Germanized" areas are still 40% aboriginal (at least).
Posted by Maju

[2009-09-12T21:21:19.447+03:00]
Thanks Maju,

I don't think the India example is very comparable, though I get your larger point. However, you have a stronger point with the comments about Y-chromosome. I was thinking specifically of the various studies that have been done on Y-chromosome, which show that the Anglo-Saxons (through the use of complex sociopolitical codes) replaced the genes of the indigenous populations in record time. I think there was one that showed that the Y-chromosomes of Friesland are indistinguishable from those of England today. And of course, these studies always show that the similarities do not continue into Wales or South Ireland, suggesting that this was an issue of subjugation coupled with genetic replacement.

I found it interesting that my own DNA marker had the most 37-marker Y-chromosome matches in the Netherlands versus any other place, although my patrilineage is documented to the American Revolution, and central England back to the 1600s.

To your point, though, I believe that the studies also showed that mtDNA is much more diverse across these populations, and incorporates plenty from the subjugated populations (IOW, there is probably plenty of Pict mtDNA floating around, but zero Y-chromosome).
Posted by JSA

[2009-09-12T19:23:44.953+03:00]
It was subjugated but probably not so much populated, Joshua. Even in paternal ancestry (Y-DNA), the most similar of all English to North Germans and Danes are only like 40% that, and some of that could be older, one could argue, from the times of Doggerland. The average English may be, by exclusively paternal ancestry, only like 20% Anglo-Saxon plus Viking.

It is one thing to conquer and another to populate.
England conquered India but did not populate it, right?
Posted by Maju

[2009-09-12T04:37:55.835+03:00]
<b>"Only because they don't have England there -- England ought to be closer to German than Italian to Greek."</b>

<i>Why? Languages do not equal genetics. "Italians" are a heterogeneous population (possibly even more so than "English" and "Germans") because of the distinctive history and geography of Italy and the fact that southern European populations contain more genetic diversity than northern ones.</i>

Hmm, it is probably that my knowledge of the history is bad. I understood that England was populated and subjugated by Germanic invaders long after Homer. However, now that I think about it, that could very well be wrong.
Posted by JSA

[2009-09-10T23:31:32.283+03:00]
"Only because they don't have England there -- England ought to be closer to German than Italian to Greek."

Why? Languages do not equal genetics. "Italians" are a heterogeneous population (possibly even more so than "English" and "Germans") because of the distinctive history and geography of Italy and the fact that southern European populations contain more genetic diversity than northern ones.

If you know anything about Italian history, you would know that large numbers of people migrated from Greece to South Italy throughout Archaic and Classical Greek times (Magna Graecia).
Later, during Byzantine times, after the Slavic invasion of the Greek peninsula and the Persian/Arab invasions of Asia Minor, there were yet more Greek migrations to south Italy.

From what I can tell, most people in southern Calabria and eastern Sicily are basically Latinized Greeks. About 20% of surnames in the provinces of Reggio and Messina (surnames were only adopted after medieval times in southern Italy) are of Greek origin, and there are still a few Calabrian towns where people speak a Greek dialect (Griko).
Posted by Patrick Pastor

[2009-09-05T00:30:15.169+03:00]
"Uneducated" in terms of exhibited knowledge. I don't really care about people's credentials, only about their exhibited grasp of concepts.

For example, if someone sees a dendrogram and assumes it was built by UPGMA, they are revealing that they do not know that UPGMA is not the only hierarchical method of clustering that can be represented with a dendrogram. Or, if someone thinks that every tree representation of genetic distance represents a phylogeny. Or, if they don't know that the Neighbor-Joining algorithm always produces a phylogeny, and that it is not appropriate in the presence of reticulation.
Posted by Dienekes

[2009-09-05T00:05:32.516+03:00]
<i>I can't say I mind too much though, as this series has been doubly-educational: on both the interpretation of clustering and the perils of being an uneducated know-it-all.</i>

What, exactly, is wrong with being an uneducated know-it-all? Did I miss the part where you talked about your academic tenure or professional position?
I have been a faithful reader for many years, and always assumed that you were simply an enthusiastic amateur -- and never thought less of you for it. Was I wrong?
Posted by JSA

[2009-09-03T20:04:21.518+03:00]
<i>I'm pretty sure the problem is not in your eyes, but rather on the other end of the optic nerve.</i>

It seems you have run out of excuses, and no longer pretend to address the substantive issues of my reply.

Well, people can easily see that your superior plot shows Swedes closer to the Irish than to Germans and does not show them anywhere near five times closer to the Germans than to Russians.

http://vizachero.com/images/TianNJ.pdf

Your participation in this thread wouldn't have been as embarrassing for yourself if you had gracefully accepted your errors several posts ago, instead of piling on new ones.

I can't say I mind too much though, as this series has been doubly-educational: on both the interpretation of clustering and the perils of being an uneducated know-it-all.
Posted by Dienekes

[2009-09-03T19:05:33.799+03:00]
<i>My eyes are just too bad for the superior "visual cues" provided by your NJ plot...</i>
I'm pretty sure the problem is not in your eyes, but rather on the other end of the optic nerve.
Posted by Vincent

[2009-09-02T18:02:14.572+03:00]
Or will you deny, e.g., that your "superior tree" shows the Swedes to be closer
to the Irish than to Germans, even though in reality they are about three times more distant, while my "distorted" tree correctly joins Swedes with Germans before joining them with the Irish?

My eyes are just too bad for the superior "visual cues" provided by your NJ plot...
Posted by Dienekes

[2009-09-02T17:56:03.836+03:00]
<i>Not only that, the distance on the tree between the Swedes and Russians is . . . wait for it . . . precisely 0.0036</i>

And the distance between Swedes and Germans is precisely 0.0007 I am guessing (not).

Thankfully people have eyes and they can see that your "superior tree" in no way reflects that Germans and Swedes are closest to each other, as they really are, and in no way reflects that they are five times closer to each other than Swedes are to Russians.
Posted by Dienekes

[2009-09-02T17:03:10.391+03:00]
<i>BTW I wonder how come your "superior tree" places Russians and Swedes in a separate branch (Fst=0.0036), even though the Swedes are closest to Germans and vice versa (Fst=0.0007), while my "distorted" tree correctly puts them in the same branch.</i>

NJ correctly places the Swedes closer to the Germans than to the Russians.

Not only that, the distance on the tree between the Swedes and Russians is . . . wait for it . . .
precisely 0.0036.
Posted by Vincent

[2009-09-02T14:34:14.065+03:00]
BTW I wonder how come your "superior tree" places Russians and Swedes in a separate branch (Fst=0.0036), even though the Swedes are closest to Germans and vice versa (Fst=0.0007), while my "distorted" tree correctly puts them in the same branch.

According to your "logic":

<i>A dendrogram (or any tree representation) is creating a hierarchical presentation: things which are most closely related are closest in the tree. Regardless of whether branch lengths are proportionate to distance or are additive, <b>this relative positioning is an undeniable trait</b>.

So if you show A and B separated by one node while A+B are separated from C by an additional node, you are <b>inescapably representing</b> A as more closely related to B than to C.</i>

Swedes are thus "inescapably shown" by your "superior tree" to be closest to Russians, Irish, Orkney, Eastern Europe, and Germans in that order, while they are in fact closest to Germans, Irish, Eastern Europe, Russians, Orkney, which is precisely how they are "inescapably shown" in my distorted tree.
Posted by Dienekes

[2009-09-02T11:04:13.614+03:00]
<i>Don't focus on that "visual clue" as the only (or even main) benefit. If you do, you'll continue to miss the greater point, which is that using any ultrametric method on a non-ultrametric dataset is likely to produce a suboptimal clustering.
If you think I am making that up, then you really need to get yourself an education on the topic.</i>

You are clearly not educated on the topic (your confusion about the difference between a dendrogram and a UPGMA tree has clearly demonstrated this), so keep your advice to yourself.

As for "suboptimal clustering", this is yet another example of your basic ignorance of what clustering is:

"Suboptimal" implies that different clustering methods can be arranged in order of the goodness of the trees they produce. What objective criterion (your muddle-headed grammar school attempts to add up branch lengths don't count) do you propose for saying that one clustering method is better than another? IF we were dealing with a known phylogeny, then one could compare inferred trees against it and see which one is closest to it. However, in this case, there is no phylogeny (either known or unknown).

Thus, your contention that NJ is inherently better than complete linkage hierarchical clustering is an empty slogan.

First, your claim was based on the supposed ability of NJ to generate an accurate topology of the tree. Once it was pointed out to you that there is no accurate tree topology since these populations did not evolve tree-like, you switched to vague claims about "preserving Fst" better by adding up branch lengths. Once it was pointed out to you that "preserving Fst" is not the point of clustering, since you can look up Fst in the table, you switched to generalities about "suboptimal clustering" without giving a criterion of optimality.
Posted by Dienekes

[2009-09-02T04:08:27.720+03:00]
I was trying to find out where the Southern Italian sample was taken but can't find it.
I am under the impression from the Materials and Methods section that they are not a direct sample from Europe but a proxy taken from several US databases (selecting those declaring 4 grandparents). Is it possible that all of them (they cluster quite tightly) are from areas genuinely Greek, i.e. from some coastal areas heavily colonized?
Posted by Maju

[2009-09-02T01:46:22.644+03:00]
<i>Lol, ok, take your ruler and add up branch lengths to get a piss-poor approximation of a number it takes me a second to look up in the original table.</i>

The point is not to reverse-engineer the matrix (though you could get an approximation if you had your heart set on it), but to give the user a visual clue that you don't get with your dendrogram.

Don't focus on that "visual clue" as the only (or even main) benefit. If you do, you'll continue to miss the greater point, which is that using any ultrametric method on a non-ultrametric dataset is likely to produce a suboptimal clustering.
If you think I am making that up, then you really need to get yourself an education on the topic.

VV
Posted by Vincent

[2009-09-02T01:30:04.437+03:00]
<i>the ability to estimate distances based on branch lengths is a real benefit.</i>

Lol, ok, take your ruler and add up branch lengths to get a piss-poor approximation of a number it takes me a second to look up in the original table.

The point of using NJ is to infer a phylogeny, and NJ is completely inapplicable in this case.

The point of CL hierarchical clustering on the other hand is to create similar clusters, and it succeeds admirably in achieving its purpose.

<i>So if you show A and B separated by one node while A+B are separated from C by an additional node, you are inescapably representing A as more closely related to B than to C.</i>

I even drew you a picture, but you still don't seem to get it.

http://i27.tinypic.com/9sqfb8.jpg

If I say that the points in the bottom left blob belong to cluster A, and the points in the top right blob belong to cluster B, I am certainly NOT claiming that all points in A are closer to other points in A than they are to some points in B.
Posted by Dienekes

[2009-09-02T01:25:02.761+03:00]
<i>"A scalpel is a small but extremely sharp bladed instrument used for surgery, anatomical dissection, and various arts and crafts."</i>

The point being that the species (scalpel) has a more restricted field of applicability than the genus (knife), just as NJ has a more restricted field of applicability (phylogeny
inference) than the genus (hierarchical clustering, which is not only used for phylogeny inference).
Posted by Dienekes

[2009-09-02T01:21:44.426+03:00]
<i>The sum of branch lengths means nothing in either method; one is better off looking at the original distance table rather than trying to calculate it off the tree by "adding up branches" as you misguidedly did.</i>

The sum of branch lengths most certainly means something in an additive tree. Even when the fit is less than 100% between the tree distances and the matrix distances, the ability to estimate distances based on branch lengths is a real benefit.

The real problem is trying to fit a non-ultrametric dataset (like Tian's) into an ultrametric method like CL or UPGMA. When you do that (and you did), you end up with a topology that is suboptimal AND branch lengths that are impossible to interpret.

VV
Posted by Vincent

[2009-09-02T01:10:14.399+03:00]
<i>The added degree of freedom you have with NJ (i.e.
branch lengths actually meaning something) is not a trivial advantage.</i>

I am compelled to continue the free lesson, since your continued spread of misinformation may actually harm some of my readers.

Branch lengths "actually mean something" in a dendrogram like the one in my blog post; they are the distances between either a pair of populations (if they are joined directly), or between populations and clusters, or between clusters.

The sum of branch lengths means nothing in either method; one is better off looking at the original distance table rather than trying to calculate it off the tree by "adding up branches" as you misguidedly did.
Posted by Dienekes

[2009-09-02T01:07:35.983+03:00]
<i>You see -in a completely non-formal way- that some branch sums in the NJ tree are kinda similar to some of the Fst's.</i>

You have not yet comprehended the real difference between CL and NJ, have you? Clearly not, and if not by now then probably never.

<i>Ignorant of the fact that branch lengths CANNOT be added in the CL tree, you nonetheless add them and discover that "the Russians are shown being just as distant from the Spanish as from the Bedouins, even though according to the original data matrix they are much more distant from the Bedouins (fst=0.0211) than from the Spanish (fst=0.0079)."</i>

Read closer, and you'll see I am right. From the beginning I've been trying to bring you to this essential point. You led yourself to it, but still can't admit it. A dendrogram (or any tree representation) is creating a hierarchical presentation: things which are most closely related are closest in the tree.
Regardless of whether branch lengths are proportionate to distance or are additive, this relative positioning is an undeniable trait.

So if you show A and B separated by one node while A+B are separated from C by an additional node, you are inescapably representing A as more closely related to B than to C.

I pointed out that you have A being more closely related to B, when in truth A is more closely related to C. At least according to <i>fst</i>. And I made not only this empirical observation, but gave you the theory to explain why your approach led you to the mistake. And I gave you an alternate method that largely avoids the mistake.

The least you could do is say "thank you".

VV
Posted by Vincent

[2009-09-02T00:43:16.568+03:00]
As for your notion that your NJ tree is "superior" to the CL tree, let's summarize your argument:

1. You see -in a completely non-formal way- that some branch sums in the NJ tree are kinda similar to some of the Fst's.

2. Ignorant of the fact that branch lengths CANNOT be added in the CL tree, you nonetheless add them and discover that "the Russians are shown being just as distant from the Spanish as from the Bedouins, even though according to the original data matrix they are much more distant from the Bedouins (fst=0.0211) than from the Spanish (fst=0.0079)."

It's like a schoolkid reading in a tourist guide that the distance between Berlin and Paris is X, that the distance between Paris and Rome is Y, and then saying that the tourist guide is inaccurate because the distance between Berlin and Rome is not X+Y. It ain't the representation that is flawed, but its improper use.

3.
From this exercise, you conclude that the CL tree does not preserve Fst's rather than what we ought to conclude, i.e., that the whole point of building the CL tree isn't to "preserve Fst's", which is impossible for any hierarchical clustering method, but rather to group populations into clusters of similarity, a task in which it succeeds admirably.
Posted by Dienekes

[2009-09-02T00:41:40.395+03:00]
<i>NJ is a method for inferring phylogenies. It does not preserve Fst; sum of branch lengths is not equal to the Fst between a pair of populations.</i>

No. As you yourself observed above, NJ is a method of clustering. NJ can be used to infer phylogenies, but that is not what it IS.

I agree that you'd have to be very lucky if you could get a NJ diagram that preserved each individual fst distance with perfect fidelity. The real world is messy, after all. And you can have negative fst . . .

That doesn't change the fact that the NJ approach does a better job than your linkage method at visualizing the data from this paper. The added degree of freedom you have with NJ (i.e. branch lengths actually meaning something) is not a trivial advantage.

VV
Posted by Vincent
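
For readers trying to follow the ultrametric-versus-additive argument in this thread, here is a minimal sketch of the point both sides keep circling. It does not use Tian et al.'s Fst table; the four populations A to D and their distances are made-up toy values (points on a line at 0, 1, 4 and 9), and the helper names `is_ultrametric` and `complete_linkage` are hypothetical. The sketch checks the three-point condition for ultrametricity and then computes the merge heights a complete-linkage (CL) dendrogram would display.

```python
from itertools import combinations

def is_ultrametric(d, names, tol=1e-12):
    """Three-point condition: in every triple the two largest distances are equal."""
    for a, b, c in combinations(names, 3):
        _, mid, top = sorted([d[frozenset((a, b))],
                              d[frozenset((a, c))],
                              d[frozenset((b, c))]])
        if abs(top - mid) > tol:
            return False
    return True

def complete_linkage(d, names):
    """Return the dendrogram (cophenetic) distance for each pair under complete linkage."""
    clusters = [frozenset([n]) for n in names]
    coph = {}

    def dist(c1, c2):
        # Complete linkage: cluster distance is the largest pairwise distance.
        return max(d[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        height = dist(c1, c2)
        # Every pair split across the two merged clusters joins at this height.
        for a in c1:
            for b in c2:
                coph[frozenset((a, b))] = height
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return coph

# Toy, assumed data (NOT from the paper): additive but not ultrametric.
names = ["A", "B", "C", "D"]
d = {
    frozenset("AB"): 1, frozenset("AC"): 4, frozenset("AD"): 9,
    frozenset("BC"): 3, frozenset("BD"): 8, frozenset("CD"): 5,
}

coph = complete_linkage(d, names)
for pair in sorted(d, key=sorted):
    print("".join(sorted(pair)), "original:", d[pair], "dendrogram:", coph[pair])
print("input ultrametric?", is_ultrametric(d, names))
print("dendrogram ultrametric?", is_ultrametric(coph, names))
```

In this toy case the dendrogram puts C exactly as far from D as A is, although the input says C is much closer to D. That is the kind of distortion Vincent describes for the Russians, Spanish and Bedouins. It is also consistent with Dienekes's reply that merge heights in a CL dendrogram are cluster-level distances, not pairwise distances to be read off or summed.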