TY - JOUR
T1 - Measures of variation at DNA repeat loci under a general stepwise mutation model
AU - Kimmel, Marek
AU - Chakraborty, Ranajit
N1 - Funding Information:
We thank Professor Olle Nerman of the University of Gothenburg for providing insights underlying some of the derivations. This work was supported by Grants GM 41399 (R.C.) and GM 58545 (R.C., M.K.) from the National Institutes of Health, by DMS 9203436 and DMS 9409909 (M.K.) from the National Science Foundation, and by the Keck Center for Computational Biology at Rice University (M.K.). Part of this work was carried out during M.K.’s visit to the University of Gothenburg in September 1995.
PY - 1996/12
Y1 - 1996/12
N2 - Polymorphisms at tandem repeat loci are caused by mutations with allele sizes occasionally altered by more than one repeat unit in both forward and backward directions. Such mutational changes may occur with asymmetric probabilities. Therefore, a one-step symmetric stepwise mutation model may not be appropriate for studying the population dynamics at all repeat loci. In this work, we evaluated the expectation and variance of the within-population variance of the allele size distribution in a finite population, and the expected homozygosity at a locus by the coalescence approach under a general stepwise mutation model, where mutational transitions of allele sizes can be arbitrary, including being asymmetric. Under the special cases of symmetric one-step, two-step, and multi-step geometric distributions of mutations, our general results reduce to the corresponding results obtained by earlier investigators. The general results indicate that in a finite population, which has reached a steady state under the (general stepwise) mutation and drift balance, the within-population variance of allele sizes has a simple expectation (i.e., proportional to Nv, the product of the mutation rate, v, and effective population size, N). However, its stochastic variance is a quadratic function of this composite parameter, Nv. Furthermore, this second-order variance does not decay with the number of alleles sampled from a population. Application of this theory to data on allele size distributions in unrelated Caucasians from the CEPH pedigree (obtained from the Genome Data Base) shows that the relationship of the variance and mean of within-population variance of allele size at tandem repeat loci, grouped by their chromosomal assignment, has a trend compatible with the theory. However, there is an indication that the second-order variance is generally underestimated. One reason for this departure might be that the CEPH sample may not represent a single homogeneous population that reached equilibrium at all tandem repeat loci.
AB - Polymorphisms at tandem repeat loci are caused by mutations with allele sizes occasionally altered by more than one repeat unit in both forward and backward directions. Such mutational changes may occur with asymmetric probabilities. Therefore, a one-step symmetric stepwise mutation model may not be appropriate for studying the population dynamics at all repeat loci. In this work, we evaluated the expectation and variance of the within-population variance of the allele size distribution in a finite population, and the expected homozygosity at a locus by the coalescence approach under a general stepwise mutation model, where mutational transitions of allele sizes can be arbitrary, including being asymmetric. Under the special cases of symmetric one-step, two-step, and multi-step geometric distributions of mutations, our general results reduce to the corresponding results obtained by earlier investigators. The general results indicate that in a finite population, which has reached a steady state under the (general stepwise) mutation and drift balance, the within-population variance of allele sizes has a simple expectation (i.e., proportional to Nv, the product of the mutation rate, v, and effective population size, N). However, its stochastic variance is a quadratic function of this composite parameter, Nv. Furthermore, this second-order variance does not decay with the number of alleles sampled from a population. Application of this theory to data on allele size distributions in unrelated Caucasians from the CEPH pedigree (obtained from the Genome Data Base) shows that the relationship of the variance and mean of within-population variance of allele size at tandem repeat loci, grouped by their chromosomal assignment, has a trend compatible with the theory. However, there is an indication that the second-order variance is generally underestimated. One reason for this departure might be that the CEPH sample may not represent a single homogeneous population that reached equilibrium at all tandem repeat loci.
UR - http://www.scopus.com/inward/record.url?scp=0030447670&partnerID=8YFLogxK
U2 - 10.1006/tpbi.1996.0035
DO - 10.1006/tpbi.1996.0035
M3 - Article
C2 - 9000494
AN - SCOPUS:0030447670
SN - 0040-5809
VL - 50
SP - 345
EP - 367
JO - Theoretical Population Biology
JF - Theoretical Population Biology
IS - 3
ER -