Short tandem repeats (STRs) are the primary genetic markers used for the analysis of biological samples in forensic and human identity testing. The discrimination power of a combination of STRs is sufficient for many human identity testing comparisons unless the evidence is substantially compromised and/or, in kinship analyses, there are too few relatives or a mutation may have arisen. An automated STR assay system based on electrospray ionization mass spectrometry (ESI-MS) has been developed that can increase the discrimination power of some of the CODIS core STR loci and thus provide more information in typical and challenged samples and cases. Data from the ESI-MS STR system are fully backwards compatible with existing STR typing results generated by capillary electrophoresis. In addition, the ESI-MS analytical system reveals nucleotide polymorphisms residing within the STR alleles. The presence of these polymorphisms expands the number of alleles at a locus. Population studies were performed on the 13 core CODIS STR loci in African Americans, Caucasians and Hispanics, capturing both the length of each allele and nucleotide variations within repeat motifs or flanking regions. Such additional polymorphisms were identified in 11 of the 13 loci examined, subdividing several nominal-length alleles. A substantial increase in heterozygosity was observed, with close to or greater than 5% of samples analyzed being heterozygous with equal-length alleles in at least one of five of the core CODIS loci. This additional polymorphism increases discrimination power significantly: the seven most polymorphic STR loci have a discrimination power equivalent to that of the 10 most discriminating of the CODIS core loci. An analysis of substructure among the three population groups revealed a higher θ than is observed when alleles are designated by nominal length, i.e., by repeat number alone.
Two loci, D3S1358 and vWA, produced θ estimates of 0.0477 and 0.0234, respectively, when the expanded allele complement (i.e., nominal allele and SNPs) was considered, compared with 0.0145 and 0.01266, respectively, when only nominal repeat number was considered. These differences may indicate that underlying population-specific allele distributions exist within these populations. A system of nomenclature has been developed that facilitates the databasing, searching and analysis of these combined data forms.
- Mass spectrometry
- Population studies
- Short tandem repeats
- Single nucleotide polymorphism
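
As an illustration of why subdividing nominal-length alleles raises heterozygosity, the following minimal sketch computes expected heterozygosity (1 minus the sum of squared allele frequencies) for one locus under both designations. The allele frequencies, the `(length, variant)` labels, and the function names are hypothetical and chosen only for the arithmetic; they are not data or nomenclature from the study.

```python
from collections import defaultdict


def expected_heterozygosity(freqs):
    """Return 1 - sum(p_i^2) for a dict mapping allele label -> frequency."""
    total = sum(freqs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"allele frequencies sum to {total}, expected 1.0")
    return 1.0 - sum(p * p for p in freqs.values())


def collapse_to_nominal_length(expanded_freqs):
    """Merge sequence variants that share the same nominal repeat number."""
    by_length = defaultdict(float)
    for (length, _variant), p in expanded_freqs.items():
        by_length[length] += p
    return dict(by_length)


# Hypothetical expanded alleles: (nominal repeat number, variant tag).
# Allele 16 is assumed to split into two equal-length sequence variants.
expanded = {
    (15, "ref"): 0.30,
    (16, "ref"): 0.15,
    (16, "snp"): 0.10,
    (17, "ref"): 0.25,
    (18, "ref"): 0.20,
}

length_only = collapse_to_nominal_length(expanded)

h_length = expected_heterozygosity(length_only)
h_expanded = expected_heterozygosity(expanded)

print(f"length only: {h_length:.3f}")
print(f"expanded:    {h_expanded:.3f}")
```

Splitting an allele of frequency p into variants of frequency p1 and p2 replaces p² with p1² + p2², which is always smaller, so expected heterozygosity can only increase. A (16, "ref")/(16, "snp") genotype in this toy example is the kind of equal-length heterozygote that length-based typing would record as a homozygote.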