The average squared distance (ASD) computed from the median allele tends to be lower than the ASD computed from the real ancestral allele. So, while the equation ASD = μg is appropriate if the real ancestral allele is used (where μ is the mutation rate and g the number of generations since the MRCA), if the median allele is used, then ASD = wa·g is appropriate, where wa is the effective mutation rate.
Unfortunately, this effective rate depends on population history, becoming close to the germline rate μ if the haplogroup attains a large size soon after the MRCA. As I have argued in many recent posts, for the large observed modern haplogroups it is very likely that the effective rate is close to the germline rate; yet the discrepancy between the two introduces an element of uncertainty in the calculation of the TMRCA (= g).
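To make the difference between the two reference alleles concrete, here is a minimal sketch of the ASD calculation for a single locus. The sample, the "true" ancestral allele, and the mutation rate are all made-up values for illustration, and using the lower median (so the reference is always an observed integer repeat count) is my assumption, not something fixed by the method.

```python
from statistics import median_low

def asd(alleles, reference):
    """Average squared distance of observed repeat counts from a reference allele."""
    return sum((a - reference) ** 2 for a in alleles) / len(alleles)

def age_estimate(alleles, reference, mu):
    """Age in generations implied by ASD = mu * g, i.e. g = ASD / mu."""
    return asd(alleles, reference) / mu

# Hypothetical single-locus sample of repeat counts (not real data).
sample = [13, 14, 14, 14, 15, 15, 16]
true_ancestral = 13   # assumed known here only for the sake of illustration
mu = 0.0025           # assumed per-generation mutation rate

median_ref = median_low(sample)              # lower median keeps an integer allele
asd_median = asd(sample, median_ref)         # distance from the median allele
asd_ancestral = asd(sample, true_ancestral)  # distance from the real ancestor

age_median = age_estimate(sample, median_ref, mu)
age_ancestral = age_estimate(sample, true_ancestral, mu)
```

In this toy sample the median sits in the middle of the observed alleles, so the ASD around it is smaller than the ASD around the true ancestral allele, and dividing by the same μ gives a younger age.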
The natural question to ask is: how often does the median allele equal the real ancestral allele? In my simulations, I set g to 10, 100, or 300 generations, and the growth constant m to 1 or 1.02. I expected the median allele to equal the ancestral allele more often for a younger group (less time for mutations to obscure the ancestral state), and also for a more rapidly expanding group (a more star-like pattern of expansion).
This intuition is essentially correct. The ancestral allele is estimated most accurately for younger and more rapidly expanding groups.
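For readers who want to try this kind of experiment, here is a rough sketch of one way to set it up. It is not my actual simulation code. It assumes a branching process with Poisson-distributed offspring (mean m), a symmetric single-step mutation model at one locus, a hypothetical rate mu = 0.0025, and a cap max_size on the population to keep the run time bounded. The founder is treated as the ancestor, although the true MRCA of the survivors may be somewhat more recent.

```python
import math
import random
from statistics import median_low

def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def simulate_locus(rng, g, m, mu, max_size=2000):
    """Simulate g generations from one founder whose allele is coded as 0.

    Returns the repeat offsets of the final generation, or None if the
    lineage died out before generation g.
    """
    population = [0]
    for _ in range(g):
        children = []
        for allele in population:
            for _ in range(poisson(rng, m)):
                if rng.random() < mu:
                    # single-step mutation, up or down with equal probability
                    children.append(allele + rng.choice((-1, 1)))
                else:
                    children.append(allele)
        if not children:
            return None
        if len(children) > max_size:
            # subsample to cap the population (an assumption of this sketch)
            children = rng.sample(children, max_size)
        population = children
    return population

def median_hit_rate(g, m, mu=0.0025, trials=100, seed=1):
    """Fraction of surviving runs in which the median allele equals the founder's."""
    rng = random.Random(seed)
    hits = done = 0
    while done < trials:
        population = simulate_locus(rng, g, m, mu)
        if population is None:
            continue  # extinct lineage: discard and retry
        done += 1
        hits += median_low(population) == 0
    return hits / trials
```

Calling, say, median_hit_rate(g=10, m=1.0) and median_hit_rate(g=100, m=1.02) lets you compare a young group with an older, growing one. Large g with m above 1 can be slow in pure Python, so start with small values.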
It is also worthwhile to see how using other methods of estimating ancestral alleles (e.g. building a rooted haplotype tree) would perform. I have only personally carried out experiments where the modal (most frequent) allele is used.
The modal allele tends to be the right guess slightly more often than the median one, giving a less biased estimate of the age. However, when the modal allele fails, it fails spectacularly: the median allele is conservative, being right in the middle of the observed alleles, whereas the modal allele may fall at either a very low or a very high number of repeats, which may be a long way off from the ancestral value in any particular case. Hence, the modal allele leads to age estimates with a higher variance.
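A small made-up example shows what a "spectacular" failure of the modal allele looks like. The repeat counts below are hypothetical, constructed so that the most frequent allele sits at the low edge of the distribution while the median stays in the middle. Ties for the mode are broken by first occurrence, which is just how Counter behaves, not a recommendation.

```python
from collections import Counter
from statistics import median_low

def asd(alleles, reference):
    """Average squared distance of observed repeat counts from a reference allele."""
    return sum((a - reference) ** 2 for a in alleles) / len(alleles)

# Hypothetical single-locus sample: the most common allele (11) lies at the edge.
sample = [11] * 5 + [12] * 2 + [13] * 3 + [14] * 3 + [15] * 4

modal_ref = Counter(sample).most_common(1)[0][0]  # most frequent allele
median_ref = median_low(sample)                   # middle of the observed alleles

asd_modal = asd(sample, modal_ref)
asd_median = asd(sample, median_ref)
```

Here the modal reference produces an ASD more than twice as large as the median reference, so the implied age (ASD/μ) would be inflated by the same factor whenever the ancestral allele was actually near the middle.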
Hence, I am in favor of the use of the median allele as an estimator of the ancestral one.
Appendix: age estimates using median or real ancestral allele
Below are the age estimates (ASD/μ) using either the median or the (unknown) ancestral allele.
| g | m | ASD/μ (median allele) | ASD/μ (ancestral allele) |