TY - JOUR
T1 - Examining population stratification via individual ancestry estimates versus self-reported race
AU - Barnholtz-Sloan, Jill S.
AU - Chakraborty, Ranajit
AU - Sellers, Thomas A.
AU - Schwartz, Ann G.
PY - 2005/6
Y1 - 2005/6
N2 - Population stratification has the potential to affect the results of genetic marker studies. Estimating individual ancestry provides a continuous measure to assess population structure in case-control studies of complex disease, instead of using self-reported racial groups. We estimated individual ancestry from the Federal Bureau of Investigation CODIS Core short tandem repeat set of 13 loci with two different analysis methods in a case-control study of early-onset lung cancer. Individual ancestry proportions were estimated for "European" and "West African" groups using published allele frequencies. The majority of non-Hispanic Caucasians had >50% European ancestry, whereas the majority of African Americans had <20% European ancestry, regardless of ancestry estimation method, although significant overlap by self-reported race and ancestry also existed. When we further investigated the effect of ancestry and self-reported race on the frequency of a lung cancer risk genotype, we found that the frequency of the GSTM1 null genotype varied by individual European ancestry and case-control status within self-reported race (particularly for African Americans). Genetic risk models showed that adjusting for individual European ancestry provided a better fit to the data than either no group adjustment or adjustment for self-reported race. This study suggests that significant population substructure differences exist that self-reported race alone does not capture and that individual ancestry may be confounded with disease status and/or a candidate gene risk genotype.
UR - http://www.scopus.com/inward/record.url?scp=20444364178&partnerID=8YFLogxK
U2 - 10.1158/1055-9965.EPI-04-0832
DO - 10.1158/1055-9965.EPI-04-0832
M3 - Article
C2 - 15941970
AN - SCOPUS:20444364178
VL - 14
SP - 1545
EP - 1551
JO - Cancer Epidemiology Biomarkers and Prevention
JF - Cancer Epidemiology Biomarkers and Prevention
SN - 1055-9965
IS - 6
ER -