From the perspective of forensic genetics, the human microbiome is a rich, relatively untapped resource for human identity testing. Because it varies within and among people, and perhaps over time, the potential forensic applications of the microbiome may extend beyond human identification. However, the same inherent variability in microbial distributions may pose a substantial barrier to predicting which individual is the source of a microbial sample unless stable signatures of the microbiome are identified and targeted. One of the more commonly adopted strategies for microbial human identification relies on quantifying which taxa are present and their respective abundance levels. It remains an open question whether such microbial signatures are more individualizing than estimates of the degree of genetic relatedness between microbial samples. This study attempts to address this question by contrasting two prediction strategies. The first approach uses phylogenetic distance to predict the host individual; it thus operates under the premise that microbes within an individual are more closely related than microbes from different individuals. The second approach uses population genetic measures of diversity at clade-specific markers, serving as a fine-grained assessment of microbial composition and quantification. Both assessments were performed using targeted sequencing of 286 markers from 22 microbial taxa sampled in 51 individuals across three body sites measured in triplicate. Nearest neighbor and reverse nearest neighbor classifiers were constructed based on the pooled data and yielded 71% and 78% accuracy, respectively, when diversity was considered, and performed significantly worse when phylogenetic distance was used (54% and 63% accuracy, respectively).
However, when diversity was used, empirical estimates of classification accuracy reached 100% when conditioned on a maximum nearest neighbor distance, whereas identification based on phylogenetic distance failed to reach saturation. These findings suggest that microbial strain composition is more individualizing than phylogeny, perhaps indicating that microbial composition may be more individualizing than recent common ancestry. One inference that may be drawn from these findings is that host-environment interactions may maintain the targeted microbial profile, and that the profile may not necessarily be repopulated by intra-individual microbial strains.
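As a rough illustration of nearest neighbor classification conditioned on a maximum nearest neighbor distance, the sketch below assigns a query sample to the host of its closest reference sample and declines to call a host when that distance exceeds a threshold. It is not the study's implementation: the Euclidean metric, the representation of each sample as a vector of per-marker diversity values, and the names `euclidean`, `nearest_neighbor`, and `max_distance` are all assumptions made for illustration. The reverse nearest neighbor classifier is not sketched here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(query, references, max_distance=None):
    """Return (host_label, distance) for the closest reference sample.

    references: list of (host_label, feature_vector) pairs, where each vector
    is a hypothetical per-marker diversity profile.
    If max_distance is given and the nearest distance exceeds it, no host is
    called and the label is None.
    """
    best_label, best_dist = None, math.inf
    for label, vector in references:
        d = euclidean(query, vector)
        if d < best_dist:
            best_label, best_dist = label, d
    if max_distance is not None and best_dist > max_distance:
        return None, best_dist
    return best_label, best_dist
```

Conditioning on a distance threshold trades coverage for accuracy: fewer samples receive a prediction, but the predictions that are made come from closer matches.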
- Human identification
- Massively parallel sequencing
- Next generation sequencing