TY - JOUR
T1 - Graph algorithms for mixture interpretation
AU - Crysup, Benjamin
AU - Woerner, August E.
AU - King, Jonathan L.
AU - Budowle, Bruce
N1 - Funding Information:
Funding: This research was funded by the National Institute of Justice, Office of Justice Programs, U.S. Department of Justice, grant number 2017-DN-BX-0134. The opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect those of the U.S. Department of Justice.
Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/1
Y1 - 2021/1
N2 - The scale of genetic methods is presently being expanded: forensic genetic assays previously were limited to tens of loci, but now technologies allow for a transition to forensic genomic approaches that assess thousands to millions of loci. However, there are subtle distinctions between genetic assays and their genomic counterparts (especially in the context of forensics). For instance, forensic genetic approaches tend to describe a locus as a haplotype, be it a microhaplotype or a short tandem repeat with its accompanying flanking information. In contrast, genomic assays tend to provide not haplotypes but sequence variants or differences, variants which in turn describe how the alleles apparently differ from the reference sequence. By the given construction, mitochondrial genetic assays can be thought of as genomic as they often describe genetic differences in a similar way. The mitochondrial genetics literature makes clear that sequence differences, unlike the haplotypes they encode, are not comparable to each other. Different alignment algorithms and different variant calling conventions may cause the same haplotype to be encoded in multiple ways. This ambiguity can affect evidence and reference profile comparisons as well as how “match” statistics are computed. In this study, a graph algorithm is described (and implemented in the MMDIT (Mitochondrial Mixture Database and Interpretation Tool) R package) that permits the assessment of forensic match statistics on mitochondrial DNA mixtures in a way that is invariant to both the variant calling conventions followed and the alignment parameters considered. The algorithm described, given a few modest constraints, can be used to compute the “random man not excluded” statistic or the likelihood ratio. The performance of the approach is assessed using in silico mitochondrial DNA mixtures.
AB - The scale of genetic methods is presently being expanded: forensic genetic assays previously were limited to tens of loci, but now technologies allow for a transition to forensic genomic approaches that assess thousands to millions of loci. However, there are subtle distinctions between genetic assays and their genomic counterparts (especially in the context of forensics). For instance, forensic genetic approaches tend to describe a locus as a haplotype, be it a microhaplotype or a short tandem repeat with its accompanying flanking information. In contrast, genomic assays tend to provide not haplotypes but sequence variants or differences, variants which in turn describe how the alleles apparently differ from the reference sequence. By the given construction, mitochondrial genetic assays can be thought of as genomic as they often describe genetic differences in a similar way. The mitochondrial genetics literature makes clear that sequence differences, unlike the haplotypes they encode, are not comparable to each other. Different alignment algorithms and different variant calling conventions may cause the same haplotype to be encoded in multiple ways. This ambiguity can affect evidence and reference profile comparisons as well as how “match” statistics are computed. In this study, a graph algorithm is described (and implemented in the MMDIT (Mitochondrial Mixture Database and Interpretation Tool) R package) that permits the assessment of forensic match statistics on mitochondrial DNA mixtures in a way that is invariant to both the variant calling conventions followed and the alignment parameters considered. The algorithm described, given a few modest constraints, can be used to compute the “random man not excluded” statistic or the likelihood ratio. The performance of the approach is assessed using in silico mitochondrial DNA mixtures.
KW - Graph algorithm
KW - Massively parallel sequencing
KW - Mitochondrial mixtures
KW - Mixture interpretation
KW - Probabilistic genotyping
UR - http://www.scopus.com/inward/record.url?scp=85100677862&partnerID=8YFLogxK
U2 - 10.3390/genes12020185
DO - 10.3390/genes12020185
M3 - Article
C2 - 33514030
AN - SCOPUS:85100677862
SN - 2073-4425
VL - 12
JO - Genes
JF - Genes
IS - 2
M1 - 185
ER -