Measures of variation at DNA repeat loci under a general stepwise mutation model

Marek Kimmel, Ranajit Chakraborty

Research output: Contribution to journal › Article › peer-review

70 Scopus citations


Polymorphisms at tandem repeat loci are caused by mutations with allele sizes occasionally altered by more than one repeat unit in both forward and backward directions. Such mutational changes may occur with asymmetric probabilities. Therefore, a one-step symmetric stepwise mutation model may not be appropriate for studying the population dynamics at all repeat loci. In this work, we evaluated the expectation and variance of the within-population variance of the allele size distribution in a finite population, and the expected homozygosity at a locus by the coalescence approach under a general stepwise mutation model, where mutational transitions of allele sizes can be arbitrary, including being asymmetric. Under the special cases of symmetric one-step, two-step, and multi-step geometric distributions of mutations, our general results reduce to the corresponding results obtained by earlier investigators. The general results indicate that in a finite population, which has reached a steady state under the (general stepwise) mutation and drift balance, the within-population variance of allele sizes has a simple expectation (i.e., proportional to Nv, the product of the mutation rate, v, and effective population size, N). However, its stochastic variance is a quadratic function of this composite parameter, Nv. Furthermore, this second-order variance does not decay with the number of alleles sampled from a population. Application of this theory to data on allele size distributions in unrelated Caucasians from the CEPH pedigree (obtained from the Genome Data Base) shows that the relationship of the variance and mean of within-population variance of allele size at tandem repeat loci, grouped by their chromosomal assignment, has a trend compatible with the theory. However, there is an indication that the second-order variance is generally underestimated. One reason for this departure might be that the CEPH sample may not represent a single homogeneous population that reached equilibrium at all tandem repeat loci.
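
The quantity discussed in the abstract, the within-population variance of allele size under mutation and drift, can be explored with a small forward simulation. The sketch below is not the authors' coalescent derivation. It is a toy haploid Wright-Fisher model written for illustration, and the function names, parameters (`num_copies`, `steps`, `weights`), and the choice to start all gene copies at size 0 are assumptions made here. The step distribution is arbitrary and may be asymmetric, in the spirit of the general stepwise mutation model.

```python
import random


def allele_size_variance(sizes):
    """Population variance (divide by n) of allele sizes, in repeat units."""
    n = len(sizes)
    mean = sum(sizes) / n
    return sum((x - mean) ** 2 for x in sizes) / n


def simulate_variance(num_copies, mutation_rate, steps, weights,
                      generations, seed=0):
    """Toy Wright-Fisher simulation with a general stepwise mutation model.

    Assumptions (illustrative only, not taken from the paper):
      - num_copies gene copies, non-overlapping generations;
      - each copy picks its parent uniformly at random (drift);
      - with probability mutation_rate, the allele size changes by a step
        drawn from `steps` with relative probabilities `weights`.
    Returns the within-population variance of allele size after the
    requested number of generations.
    """
    rng = random.Random(seed)
    population = [0] * num_copies  # every copy starts at the same size
    for _ in range(generations):
        # Drift: resample parents with replacement.
        parents = rng.choices(population, k=num_copies)
        # Mutation: add a random step to a mutating copy, else copy as is.
        population = [
            size + rng.choices(steps, weights=weights, k=1)[0]
            if rng.random() < mutation_rate else size
            for size in parents
        ]
    return allele_size_variance(population)
```

A single run gives one noisy realization, which is consistent with the abstract's remark that the variance of this variance is large. Averaging `simulate_variance` over many seeds, after enough generations to approach a steady state, and then repeating with `mutation_rate` doubled, is one way to check informally whether the mean scales roughly in proportion to the product of population size and mutation rate.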

Original language: English
Pages (from-to): 345-367
Number of pages: 23
Journal: Theoretical Population Biology
Issue number: 3
State: Published - Dec 1996

