TY - JOUR
T1 - Corrigendum to ‘U.S. Population Data for 29 Autosomal STR Loci’ [Forensic Sci. Int. Genet. 7 (2013) e82–e83](S1872497312002712)(10.1016/j.fsigen.2012.12.004)
AU - Steffen, Carolyn R.
AU - Coble, Michael D.
AU - Gettings, Katherine B.
AU - Vallone, Peter M.
N1 - Publisher Copyright:
© 2017
PY - 2017/11
Y1 - 2017/11
N2 - In 2013, we reported the genotypes and allele frequencies for 1036 unrelated samples in the U.S. population using capillary electrophoresis (CE) [1]. Since then, multiplex STR assays designed for sequencing technologies have become available, and we have re-analyzed our set of 1036 samples to determine sequence-based allele frequencies (manuscript in preparation). As a quality control for these sequence data and to evaluate backward compatibility, the length-based genotypes calculated from the sequence data were compared to the CE genotypes published in 2013. This comparison produced a list of differences, which were further evaluated by reviewing the sequence and CE data. Instances in which the difference was not attributable to the sequencing assay were further evaluated with additional CE-based genotyping. This evaluation has resulted in revisions to the 2013 publication [1], detailed below. We have categorized the reasons for revisions as: (1) polymerase chain reaction (PCR) primer design differences, (2) change in the reporting of tri-alleles, (3) laboratory error, and (4) data analysis error. In summary, revisions have been made for a total of 13 STR loci, four of which are U.S. core loci (D5S818, D7S820, D13S317, and TPOX). The remaining nine loci are D6S1043, F13A01, F13B, FESFPS, LPL, Penta C, Penta D, Penta E, and SE33. The revisions affect 12 separate samples in the 1036 data set (12/1036 = 1.16%) and are summarized in Table 1. The distribution of revisions among the four populations is as follows: four African American samples (4/342 = 1.17%), three Caucasian samples (3/361 = 0.83%), four Hispanic samples (4/236 = 1.69%), and one Asian sample (1/97 = 1.03%). The revisions affect 37 genotypes out of 30,044 total genotypes (37/30,044 = 0.123%), not including the change in reporting of tri-alleles.
The tri-allelic genotypes detected at TPOX (9, 10, 11) and Penta D (11, 14, 15) were reported as bi-allelic in 2013 (TPOX reported as 9, 11 and Penta D reported as 11, 14). In the revised data set, these genotypes have been removed. This change not only affects the frequencies of the removed alleles, but also changes the sample number at these loci: TPOX revised global n = 1035 and revised African American n = 341; Penta D revised global n = 1035 and revised Hispanic n = 235. Any change in sample number changes all allele frequencies at the affected locus/population. A detailed presentation illustrating each of the revisions can be found at http://strbase.nist.gov/NISTpop.htm. Tables 2a–2d provide a summary of the revisions by population, locus, and specific allele(s) affected: original, revised, and the difference of revised − original. The maximum change in allele frequency by population was as follows: 0.15% (African American), 0.28% (Caucasian), 0.71% (Hispanic), and 1.0% (Asian). The greatest overall single change of 1.0% was observed for the 11 allele at D7S820 in the Asian population (n = 97). Similar to the 2015 Federal Bureau of Investigation (FBI) allele frequency revisions [2,3], empirical comparisons of random match probabilities (RMP) calculated from the original allele frequencies and the revised allele frequencies were performed on 100 randomly generated profiles for the two populations where U.S. core loci have been affected (African American and Asian). Comparisons were based on the original 13 U.S. core loci, as the expanded loci were not affected by the allele frequency changes. The random profiles were generated with DNA Profile Builder software (http://www.nucfs.ac.uk/dna-profile-builder/) from the original NIST allele frequencies.
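The effect of removing a single genotype on every allele frequency at a locus can be sketched as follows. This is a minimal illustration, assuming the usual count-based estimate (allele count divided by 2n); the function name `allele_frequencies` and the five-sample toy population are hypothetical and are not taken from the NIST data.

```python
from fractions import Fraction


def allele_frequencies(genotypes):
    """Return {allele: frequency} for a list of (allele_a, allele_b) genotypes.

    Frequency is the allele count divided by 2n, where n is the number
    of genotyped samples at the locus.
    """
    counts = {}
    for a, b in genotypes:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    total = 2 * len(genotypes)
    return {allele: Fraction(c, total) for allele, c in counts.items()}


# Hypothetical toy population of five samples at one locus.
original = [(8, 8), (8, 11), (9, 11), (8, 9), (11, 11)]
# Remove the (9, 11) genotype, as would happen for a sample whose
# genotype is withdrawn from the data set.
revised = original[:2] + original[3:]

freq_original = allele_frequencies(original)
freq_revised = allele_frequencies(revised)

# Allele 8 was not part of the removed genotype, yet its frequency
# changes because the denominator dropped from 10 to 8.
print(freq_original[8], freq_revised[8])
```

In this toy case the count of allele 8 is unchanged, but its frequency still moves because n decreased by one, which is the same mechanism behind the sample number changes at TPOX and Penta D.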
RMP calculations for the 100 random profiles were generated with the LSAM (Laboratory Statistical Analysis Module) software (Future Technologies Inc., Fairfax, VA) using a theta correction of 0.01 for homozygous loci. Since the corrections only affected markers in the original 13 U.S. core loci, we only calculated statistics on these markers. The differences in the African American population RMP calculations were within 1.0004-fold and the differences in the Asian population RMP calculations were within 1.3262-fold. These differences fall within a 2-fold change in RMP (comparable to the FBI's analysis [3]) and well within the 10-fold difference expected by using a different set of allele frequencies for that population as suggested by previous studies and the National Research Council [4–7], as shown in Fig. 1 and Table 3. RMP scenarios were also calculated for each population assuming homozygosity at the affected loci and using a theta correction of 0.01. This analysis was performed to understand the scope of the “worst case” effect of the revisions. The bounds of less rare and more rare RMPs as a function of commonly used STR kits are tabulated in Supplemental Table 1. Using the Asian population as an example, RMPs of 1.22-fold less rare and 1.61-fold more rare were calculated for the loci contained in Identifiler, GlobalFiler (Thermo Fisher), PowerPlex 16, PowerPlex Fusion/6C (Promega), and Investigator 24plex QS (Qiagen) STR kits. The revised genotypes for the 1036 U.S. population data set are provided in Supplemental Table 2 and the revised allele frequencies for the full data set and each population group are provided in Supplemental Table 3. The revised data have been provided to the FBI CODIS unit for review and dissemination and are also available on STRBase at http://strbase.nist.gov/NISTpop.htm. We encourage the forensic community to further evaluate the effects of these changes.
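A fold-change comparison of this kind can be sketched as follows. This is a minimal illustration, not the LSAM implementation: it assumes the commonly used National Research Council form, p² + p(1 − p)θ for homozygotes and 2pq for heterozygotes, and the function names and allele frequencies are hypothetical.

```python
def locus_rmp(p, q=None, theta=0.01):
    """Genotype probability at one locus.

    Assumed formulas (not confirmed as the LSAM method):
      homozygote:   p^2 + p * (1 - p) * theta
      heterozygote: 2 * p * q
    """
    if q is None:  # homozygous genotype
        return p * p + p * (1 - p) * theta
    return 2 * p * q


def profile_rmp(loci, theta=0.01):
    """Multiply per-locus probabilities across independent loci."""
    rmp = 1.0
    for freqs in loci:
        rmp *= locus_rmp(*freqs, theta=theta)
    return rmp


def fold_change(rmp_a, rmp_b):
    """Ratio of the larger RMP to the smaller, so the result is >= 1."""
    return max(rmp_a, rmp_b) / min(rmp_a, rmp_b)


# Hypothetical two-locus profile: one homozygote, one heterozygote.
# Only the homozygous allele frequency differs between the two sets.
original_freqs = [(0.100,), (0.20, 0.30)]
revised_freqs = [(0.110,), (0.20, 0.30)]

print(fold_change(profile_rmp(original_freqs), profile_rmp(revised_freqs)))
```

Because the frequency enters a homozygous genotype probability roughly as a square, a change at a homozygous locus has a larger effect than the same change at a heterozygous locus, which is why homozygosity at the affected loci serves as the “worst case” scenario.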
UR - http://www.scopus.com/inward/record.url?scp=85028510883&partnerID=8YFLogxK
U2 - 10.1016/j.fsigen.2017.08.011
DO - 10.1016/j.fsigen.2017.08.011
M3 - Comment/debate
C2 - 28867528
AN - SCOPUS:85028510883
SN - 1872-4973
VL - 31
SP - e36-e40
JO - Forensic Science International: Genetics
JF - Forensic Science International: Genetics
ER -