I began by calculating the correlation matrix in my sample.
A few features strike the eye:
- The negative correlation between haplogroup R1a and haplogroups E3b, J2, and R1b
- The negative correlation between haplogroup I and haplogroups J2 and R1b
- The positive correlation between haplogroup J2 and haplogroup R1b
- The absence of a substantial correlation between "Neolithic" haplogroups J2 and E3b
The absence of a correlation between J2 and E3b is noteworthy, because it hints that these haplogroups did not diffuse as a result of a single process. The easternmost populations of the sample, as well as the two Italian populations, show a higher J2/E3b ratio than the "continental" populations.
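For readers who want to try this kind of calculation themselves, here is a minimal sketch in Python with NumPy. The population names and frequencies are invented placeholders, not the sample analysed here, and Pearson correlation is assumed as the measure.

```python
import numpy as np

# Made-up placeholder frequencies, NOT the data analysed in this post.
# Rows are hypothetical populations, columns are haplogroups; rows need not
# sum to 1 because other haplogroups are left out.
haplogroups = ["E3b", "I", "J2", "R1a", "R1b"]
populations = ["pop_A", "pop_B", "pop_C", "pop_D", "pop_E", "pop_F"]
freq = np.array([
    [0.20, 0.08, 0.24, 0.06, 0.22],
    [0.18, 0.10, 0.20, 0.08, 0.25],
    [0.22, 0.12, 0.18, 0.10, 0.18],
    [0.10, 0.40, 0.05, 0.25, 0.08],
    [0.08, 0.45, 0.04, 0.30, 0.06],
    [0.12, 0.35, 0.06, 0.22, 0.10],
])

# Pearson correlation between haplogroup columns (rowvar=False treats each
# column as one variable and each row as one observation).
corr = np.corrcoef(freq, rowvar=False)

# Print the matrix with haplogroup labels.
print("      " + "  ".join(f"{h:>5}" for h in haplogroups))
for name, row in zip(haplogroups, corr):
    print(f"{name:>5} " + "  ".join(f"{v:5.2f}" for v in row))

# J2/E3b ratio per population, the quantity compared in the text.
j2_e3b = freq[:, haplogroups.index("J2")] / freq[:, haplogroups.index("E3b")]
for pop, ratio in zip(populations, j2_e3b):
    print(f"{pop}: J2/E3b = {ratio:.2f}")
```

With only a handful of populations, individual coefficients are noisy, so the signs and rough magnitudes matter more than the exact values.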
The second analysis is a dendrogram built from Euclidean distances between the normalized haplogroup frequencies. As is apparent, this way of representing the frequency data separates the populations into two main clusters.
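A dendrogram of this kind can be sketched with SciPy as follows. Several things here are assumptions rather than a record of the analysis above: "normalized" is read as z-scoring each haplogroup column, average linkage is chosen arbitrarily, and the frequencies are invented placeholders.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from scipy.spatial.distance import pdist

# Made-up placeholder frequencies, NOT the data analysed in this post.
haplogroups = ["E3b", "I", "J2", "R1a", "R1b"]
populations = ["pop_A", "pop_B", "pop_C", "pop_D", "pop_E", "pop_F"]
freq = np.array([
    [0.20, 0.08, 0.24, 0.06, 0.22],
    [0.18, 0.10, 0.20, 0.08, 0.25],
    [0.22, 0.12, 0.18, 0.10, 0.18],
    [0.10, 0.40, 0.05, 0.25, 0.08],
    [0.08, 0.45, 0.04, 0.30, 0.06],
    [0.12, 0.35, 0.06, 0.22, 0.10],
])

# Assumed normalization: z-score each haplogroup column across populations.
z = (freq - freq.mean(axis=0)) / freq.std(axis=0, ddof=1)

# Condensed matrix of pairwise Euclidean distances between populations.
dist = pdist(z, metric="euclidean")

# Agglomerative clustering; average linkage is an assumption.
tree = linkage(dist, method="average")

# Cut the tree into two clusters and recover the leaf order of the dendrogram.
cluster_ids = fcluster(tree, t=2, criterion="maxclust")
leaf_order = dendrogram(tree, no_plot=True, labels=populations)["ivl"]

for pop, cid in zip(populations, cluster_ids):
    print(f"{pop}: cluster {cid}")
print("leaf order:", leaf_order)
```

Dropping `no_plot=True` draws the tree with matplotlib, if it is installed. Other linkage methods (single, complete, Ward) can give different trees from the same distances, so it is worth checking that the main split is stable across them.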
Finally, a principal components analysis is shown in the following plot. The first two components summarize about 77% of the variance.
We observe two main "contrasts" in the data: one between "coastal" J2/R1b and "continental" I1b, and one between "Neolithic" E3b and "Slavic" R1a (*).
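The share of variance captured by each component, and the haplogroup loadings that define "contrasts" like these, can be obtained from a singular value decomposition. The sketch below assumes the haplogroup columns are standardized before the PCA and uses invented placeholder frequencies, so its numbers will not match the 77% figure quoted above.

```python
import numpy as np

# Made-up placeholder frequencies, NOT the data analysed in this post.
haplogroups = ["E3b", "I", "J2", "R1a", "R1b"]
populations = ["pop_A", "pop_B", "pop_C", "pop_D", "pop_E", "pop_F"]
freq = np.array([
    [0.20, 0.08, 0.24, 0.06, 0.22],
    [0.18, 0.10, 0.20, 0.08, 0.25],
    [0.22, 0.12, 0.18, 0.10, 0.18],
    [0.10, 0.40, 0.05, 0.25, 0.08],
    [0.08, 0.45, 0.04, 0.30, 0.06],
    [0.12, 0.35, 0.06, 0.22, 0.10],
])

# Assumed preprocessing: center and scale each haplogroup column.
z = (freq - freq.mean(axis=0)) / freq.std(axis=0, ddof=1)

# PCA via SVD: rows of vt are the principal axes, s the singular values.
u, s, vt = np.linalg.svd(z, full_matrices=False)

# Fraction of total variance explained by each component.
explained = s**2 / np.sum(s**2)

# Population coordinates (scores) and haplogroup weights (loadings).
scores = u * s
loadings = vt.T

print(f"variance explained by first two components: {explained[:2].sum():.1%}")
for name, weights in zip(haplogroups, loadings):
    print(f"{name:>4}: PC1 = {weights[0]:+.2f}, PC2 = {weights[1]:+.2f}")
```

Haplogroups with loadings of opposite sign on the same component are the ones being contrasted along that axis. The sign of each component is arbitrary, so only the relative signs are meaningful.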
Several conclusions can be drawn.
- The spread of the Neolithic economy into continental Europe involved E3b bearers in a riverine expansion whose northern expression is associated with the Linearbandkeramik. This does not mean that E3b was the only haplogroup associated with these early European farmers, only that it seems to correlate better with this movement than the other Neolithic haplogroup (J2) does.
- The early diffusion of E3b occurred over a haplogroup I Paleolithic background. It is likely that the frequency of haplogroup E3b declined as groups moved northward, and this is in fact shown in the frequency distribution. This movement is probably associated with the narrow-faced Danubian Mediterranean racial types.
- This native European population later received an influx of R1a bearers; the frequency of R1a is correlated with latitude. This led to a decrease of the native component in favor of the foreign R1a component (*).
- The frequency of haplogroup J2 was established by three movements: (i) the initial arrival of J2 from Asia Minor, which did not significantly penetrate into the Western Balkans; (ii) the initial dispersal of J2 into Italy and further west, and around the Black Sea in pre-Greek times, which may be associated with the arrival of gracile Mediterranean racial types into the Ukraine; (iii) the later dispersal of additional J2 as a result of Greek colonization.
The critical question would be: what fraction of J2 lineages in the Ukraine can be explained as the result of ancient and recent Greek settlement in the Crimea, and what fraction predates the Greeks?
(*) We should note that these are rough correspondences. If the theory of riverine diffusion of haplogroup E3b into Central and Northern Europe is correct, then it is likely that E3b existed at a low frequency among Proto-Slavs; conversely, R1a had already diffused after the Last Glacial Maximum (LGM), before its most recent diffusion, which is perhaps associated with the Slavic languages.
Update: A reader alerts me to a different study which listed the Hungarian R1a frequency as substantially lower than the one used here (Semino et al. 2000). Unfortunately, that study did not list frequencies of all haplogroups needed for comparison, so it could not be used directly. If an R1a frequency of 20.4% is used instead, then a slightly different clustering is obtained.