
Are genetically defined "metapopulations" self-evident in YHRD?

Olsen, T. O. S.; Andersen, M. M.; Curran, J.; Krawczak, M.; Caliebe, A.

2026-02-10 genetics
10.64898/2026.02.07.704579 bioRxiv

In forensic genetics, the evidential value of a match between the Y-chromosomal short tandem repeat (Y-STR) profiles of a trace and a suspect is typically quantified by the frequency of the profile in a population database, particularly the Y-chromosomal Haplotype Reference Database (YHRD). However, for this approach of obtaining a match probability to be valid, the database population must be representative of all plausible alternative trace donors in a given case. Since appropriately defining such a suspect population can be difficult, YHRD highlights so-called metapopulations that comprise profiles from different, geographically dispersed populations with presumed shared ancestry. We investigated whether such metapopulations are self-evident in the current version of YHRD. To this end, we performed classical cluster analysis using allele dissimilarity as a measure of pairwise distance between Y-STR profiles. Our analyses revealed only a weak genetic structure in YHRD, the extent of which was inversely proportional to the respective marker mutation rate. This suggests that YHRD cannot be divided into clearly distinguishable subgroups based solely on the genetic information it contains, at least not into subgroups that would correspond closely to the metapopulations highlighted in the database itself. If profile frequencies in metapopulations are to continue to be equated with match probabilities, then a clearer definition of metapopulations and a better justification of their use in forensics are needed.
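To make the clustering step concrete, the following is a minimal sketch of cluster analysis on Y-STR profiles with allele dissimilarity as the pairwise distance. The abstract does not give the exact definitions, so several choices here are assumptions: dissimilarity is taken as the fraction of loci at which two profiles carry different alleles, clusters are merged by average linkage, and the function names and the five toy profiles are invented for illustration. None of this is taken from the authors' code or from YHRD data.

```python
from itertools import combinations


def allele_dissimilarity(p, q):
    """Fraction of loci at which two profiles carry different alleles.

    Assumed definition; the abstract does not specify the exact measure.
    """
    if len(p) != len(q):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(p, q)) / len(p)


def average_linkage(profiles, n_clusters):
    """Naive agglomerative clustering with average linkage (assumed choice).

    Returns a list of clusters, each a sorted list of profile indices.
    """
    clusters = [[i] for i in range(len(profiles))]
    # Precompute all pairwise distances, keyed by (smaller index, larger index).
    dist = {
        (i, j): allele_dissimilarity(profiles[i], profiles[j])
        for i, j in combinations(range(len(profiles)), 2)
    }

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Invented toy profiles: one tuple of repeat counts per profile, four loci each.
profiles = [
    (14, 13, 29, 24),
    (14, 13, 29, 25),
    (14, 13, 30, 24),
    (16, 12, 31, 21),
    (16, 12, 31, 22),
]

groups = average_linkage(profiles, n_clusters=2)
print(groups)
```

In this toy data the two groups differ at every locus, so the split is unambiguous. The abstract reports the opposite situation for YHRD: only weak structure, which corresponds to within-group and between-group distances that are hard to tell apart.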

Matching journals

The top 1 journal accounts for 50% of the predicted probability mass.