Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements

Walther Parson, David Ballard, Bruce Budowle, John M. Butler, Katherine B. Gettings, Peter Gill, Leonor Gusmão, Douglas R. Hares, Jodi A. Irwin, Jonathan L. King, Peter De Knijff, Niels Morling, Mechthild Prinz, Peter M. Schneider, Christophe Van Neste, Sascha Willuweit, Christopher Phillips

Research output: Contribution to journal › Article

87 Citations (Scopus)

Abstract

The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of an STR marker and of variants that may reside in the flanking areas of the repeat region. When an STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enables comparison with currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement for data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established nomenclature for CE-based STR analysis will remain unchanged in the future, the nomenclature of sequence-based STR genotypes will need to follow updated rules and be generated by expert systems that translate MPS sequences to match CE conventions in order to guarantee compatibility between the different generations of STR data.
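
The relationship between a stored sequence string and a CE-compatible allele name can be illustrated with a minimal sketch. It is not the Commission's algorithm: the function names `bracket_repeats` and `ce_allele`, the fixed 4-nt motif length, and the purely length-based allele number are assumptions made for illustration, and flanking regions, locus-specific rules, and alignment to a reference assembly are ignored.

```python
def bracket_repeats(seq: str, motif_len: int = 4) -> str:
    """Collapse runs of identical fixed-length motifs into [MOTIF]n blocks.

    Hypothetical helper: reads the repeat region left to right in a fixed
    frame, which is a simplification of real STR annotation.
    """
    parts = []
    i = 0
    while i < len(seq):
        motif = seq[i:i + motif_len]
        if len(motif) < motif_len:
            parts.append(motif)  # trailing partial motif, kept verbatim
            break
        n = 1
        # Extend the run while the next window repeats the same motif.
        while seq[i + n * motif_len:i + (n + 1) * motif_len] == motif:
            n += 1
        parts.append(f"[{motif}]{n}")
        i += n * motif_len
    return " ".join(parts)


def ce_allele(seq: str, motif_len: int = 4) -> str:
    """Length-based allele name in the CE style (assumed simplification).

    Full repeat units form the integer part; leftover bases follow the
    point, as in an allele such as 9.3.
    """
    full, rem = divmod(len(seq), motif_len)
    return f"{full}.{rem}" if rem else str(full)


# Two made-up repeat regions of equal length but different sequence.
a = "TCTA" * 6
b = "TCTA" * 3 + "TCTG" + "TCTA" * 2

print(ce_allele(a), bracket_repeats(a))
print(ce_allele(b), bracket_repeats(b))
```

Both example strings receive the same length-based allele number, yet their bracketed descriptions differ. This is the kind of additional polymorphism that sequence data can reveal, and it is why the full sequence string, rather than a derived name, is the safer unit of storage.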

Original language: English
Pages (from-to): 54-63
Number of pages: 10
Journal: Forensic Science International: Genetics
Volume: 22
DOI: https://doi.org/10.1016/j.fsigen.2016.01.009
State: Published - 1 May 2016

Fingerprint

Forensic Genetics
High-Throughput Nucleotide Sequencing
Terminology
Microsatellite Repeats
Capillary Electrophoresis
DNA
Alleles
Expert Systems
Minisatellite Repeats
Nucleotide Motifs
Tandem Repeat Sequences
Sequence Alignment
Information Storage and Retrieval
Nucleic Acid Databases
Genetic Polymorphisms
Single Nucleotide Polymorphism
Genotype
Genome
Technology

Keywords

  • MPS
  • Massively parallel sequencing
  • NGS
  • Next generation sequencing
  • Nomenclature
  • STRs
  • Short tandem repeats

Cite this

Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M.; Gettings, Katherine B.; Gill, Peter; Gusmão, Leonor; Hares, Douglas R.; Irwin, Jodi A.; King, Jonathan L.; Knijff, Peter De; Morling, Niels; Prinz, Mechthild; Schneider, Peter M.; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher. Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements. In: Forensic Science International: Genetics. 2016; Vol. 22. pp. 54-63.
@article{645919bf4a1d4462ba9b26b252c5011a,
title = "Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements",
keywords = "MPS, Massively parallel sequencing, NGS, Next generation sequencing, Nomenclature, STRs, Short tandem repeats",
author = "Walther Parson and David Ballard and Bruce Budowle and Butler, {John M.} and Gettings, {Katherine B.} and Peter Gill and Leonor Gusm{\~a}o and Hares, {Douglas R.} and Irwin, {Jodi A.} and King, {Jonathan L.} and Knijff, {Peter De} and Niels Morling and Mechthild Prinz and Schneider, {Peter M.} and Neste, {Christophe Van} and Sascha Willuweit and Christopher Phillips",
year = "2016",
month = "5",
day = "1",
doi = "10.1016/j.fsigen.2016.01.009",
language = "English",
volume = "22",
pages = "54--63",
journal = "Forensic Science International: Genetics",
issn = "1872-4973",
publisher = "Elsevier Ireland Ltd",

}

Parson, W, Ballard, D, Budowle, B, Butler, JM, Gettings, KB, Gill, P, Gusmão, L, Hares, DR, Irwin, JA, King, JL, Knijff, PD, Morling, N, Prinz, M, Schneider, PM, Neste, CV, Willuweit, S & Phillips, C 2016, 'Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements', Forensic Science International: Genetics, vol. 22, pp. 54-63. https://doi.org/10.1016/j.fsigen.2016.01.009

TY - JOUR

T1 - Massively parallel sequencing of forensic STRs

T2 - Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements

AU - Parson, Walther

AU - Ballard, David

AU - Budowle, Bruce

AU - Butler, John M.

AU - Gettings, Katherine B.

AU - Gill, Peter

AU - Gusmão, Leonor

AU - Hares, Douglas R.

AU - Irwin, Jodi A.

AU - King, Jonathan L.

AU - Knijff, Peter De

AU - Morling, Niels

AU - Prinz, Mechthild

AU - Schneider, Peter M.

AU - Neste, Christophe Van

AU - Willuweit, Sascha

AU - Phillips, Christopher

PY - 2016/5/1

Y1 - 2016/5/1

KW - MPS

KW - Massively parallel sequencing

KW - NGS

KW - Next generation sequencing

KW - Nomenclature

KW - STRs

KW - Short tandem repeats

UR - http://www.scopus.com/inward/record.url?scp=84970935853&partnerID=8YFLogxK

U2 - 10.1016/j.fsigen.2016.01.009

DO - 10.1016/j.fsigen.2016.01.009

M3 - Article

C2 - 26844919

AN - SCOPUS:84970935853

VL - 22

SP - 54

EP - 63

JO - Forensic Science International: Genetics

JF - Forensic Science International: Genetics

SN - 1872-4973

ER -