I have already received some feedback from customers who also happen to be part of my Dodecad Project and who appear to be perplexed by their results. It is unfortunate that my own rules preclude me from discussing the details of these reports. I encourage people who want to discuss their ancestry composition to do so in the comments.
Without going into details, I would first advise that 23andMe make transparent the way in which 23andMe participants were selected as part of their training data. This is explained in their writeup with the following paragraph:
Most of the reference dataset comes from 23andMe members just like you. When someone tells us that they have four grandparents all born in the same country, and the country isn't a colonial nation like the US, Canada or Australia, they become candidates for inclusion in the reference dataset. We filter out all but one of any set of closely-related people, since they can distort the results. And we remove "outliers," people whose genetic ancestry doesn't seem to match up with their survey answers.
23andMe takes a "birthplace of grandparents" approach rather than an "ethnic origin" approach. This may be reasonable when the two tend to coincide but not appropriate at all when ethnic groups of different origins co-exist in a given territory. Contrary to the implicit belief expressed in the above paragraph, ethnic complexity is not limited to "colonial nations", and an approach that disregards ethnicity, language, and religion, and limits itself to "birthplace of grandparents" is bound to miss it.
The problem with supervised learning is that the end product is only as good as the labels. If the labels aren't good, or they're ambiguous, then you end up with a mess.
Let's take an example of an individual who reports "4 grandparents from Turkey." This may describe any of the following: a Mesopotamian Kurd within the boundaries of Turkey, a Central Anatolian Turk, a Cappadocian Greek, a Turkocretan, an Armenian from Cilicia, an ethnic Greek from European Turkey, or a Turkish-speaking Muslim from Skopje or Bulgaria. Some of these may interpret "Turkey" geographically, others ethnically. The label "Turkey" is thus polysemous: it can be read either geographically or ethnically, and in neither sense has its meaning stayed constant over time.
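To make the labeling problem concrete, here is a minimal toy sketch of what can happen when one birthplace label pools genetically distinct groups. Everything in it is an assumption for illustration: the one-dimensional "genetic coordinate", the invented numbers, the placeholder labels "Country X" and "Country Y", and the nearest-centroid classifier, which is only a stand-in and not a description of 23andMe's actual method.

```python
from statistics import mean

# Invented toy data: (self-reported birthplace of grandparents, coordinate).
# "Country X" pools two very different groups under one label.
reference = [
    ("Country X", 0.10),  # hypothetical group 1
    ("Country X", 0.12),
    ("Country X", 0.90),  # hypothetical group 2, same birthplace label
    ("Country X", 0.88),
    ("Country Y", 0.44),
    ("Country Y", 0.46),
]

def centroids(records):
    """Average coordinate per label: the supervised 'reference population'."""
    by_label = {}
    for label, x in records:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def classify(x, cents):
    """Assign x to the label whose centroid is nearest."""
    return min(cents, key=lambda label: abs(x - cents[label]))

cents = centroids(reference)

# The pooled "Country X" centroid sits near 0.5, far from every one of
# its own reference members, which cluster near 0.1 and 0.9.
# A new individual who closely matches group 1 is therefore assigned
# to the other label.
new_individual = 0.11
assigned = classify(new_individual, cents)
```

In this toy setup the individual at 0.11 is nearly identical to two "Country X" reference members, yet the classifier places them under "Country Y", because the pooled label's average describes nobody in particular.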
I don't know how 23andMe built their reference populations, but I am ~100% sure that 4 grandparents from Turkey = "Middle Eastern" in their terminology. I am also fairly sure that their "Balkan" sample consists of individuals as different as Croats and Greeks. So what do these meta-population labels mean? Your guess is as good as mine: a balance of samples of different origins and different interpretations of these origins in whatever training set 23andMe assembled.
In my own project, I never include a priori labels of individuals in the inference of ancestral components. I deal with genotypes and individuals, not self-reported ancestral origins and labelled sets of individuals (populations). Components emerge from unsupervised learning over a set of individual genotypes, and it is only a posteriori that labels are assigned to the inferred components, by observation. Indeed, one could forgo the assignment of labels altogether!
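The order of operations described above (infer components first, name them afterwards) can be sketched with a toy example. This is only an illustration under stated assumptions: the one-dimensional coordinates are invented, and a tiny k-means routine stands in for whatever unsupervised method is actually used; it produces hard clusters rather than ancestry proportions. The function name `kmeans_1d` and the component names are hypothetical.

```python
def kmeans_1d(xs, k=2, iters=20):
    """Tiny deterministic k-means on 1-D data (requires k >= 2)."""
    lo, hi = min(xs), max(xs)
    # Deterministic start: spread the centers evenly between min and max.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            groups[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    assignment = [min(range(k), key=lambda c: abs(x - centers[c]))
                  for x in xs]
    return centers, assignment

# Invented coordinates for six individuals; no labels are supplied.
genotypes = [0.10, 0.12, 0.90, 0.88, 0.11, 0.89]

centers, assignment = kmeans_1d(genotypes, k=2)

# Only now, after inspecting which individuals fall where, does the
# analyst attach names to the components. This step is optional.
component_names = {0: "component A", 1: "component B"}
named = [component_names[j] for j in assignment]
```

The names in `component_names` play no part in the inference itself, so an ambiguous or mistaken name cannot distort the components; it can only mislabel them after the fact.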
My amicable advice to 23andMe is to drop supervised learning altogether. It will only get worse as new customers (aka new test data) join in.