Graph transformer for ancient ancestry inference

Shanks, C.; Bonet, D.; Comajoan Cara, M.; Ioannidis, A. G.

2026-04-07 genetics
10.64898/2026.04.05.714076 bioRxiv

Local ancestry inference classifies segments of DNA in admixed individuals by their originating population. However, as the date of admixture becomes older, these segments become shorter and determining their ancestry becomes increasingly difficult. This limits many existing segment-based methods to relatively recent historical admixture events and more highly diverged populations. The rapidly expanding availability of ancient DNA offers a promising opportunity to use these ancient samples as references for local ancestry inference. A recent approach integrates ancient samples into the ancestral recombination graph (ARG) for local ancestry inference. Here, we introduce recent advances in deep learning for graphs into this ARG framework to create ARGMix, a graph transformer that infers local ancestry using the coalescent trees of the inferred ARG. Our approach employs ancient samples as references in the marginal trees to predict local ancestry. We train ARGMix on data reflecting the well-understood ancient European demography and demonstrate improved accuracy and robustness even under demographic misspecification. We then apply ARGMix to an ARG of ancient and present-day European samples for ancestry-specific analyses, finding evidence of continuity between Ötzi the Iceman and present-day individuals from nearby regions.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. The American Journal of Human Genetics (206 papers in training set; Top 0.3%): 13.7%
2. Nature Communications (4913 papers in training set; Top 20%): 9.7%
3. Nature Genetics (240 papers in training set; Top 0.8%): 8.7%
4. Science (429 papers in training set; Top 6%): 6.1%
5. Genome Biology (555 papers in training set; Top 1%): 6.0%
6. Proceedings of the National Academy of Sciences (2130 papers in training set; Top 13%): 6.0%

50% of probability mass above

7. Cell (370 papers in training set; Top 7%): 3.4%
8. Nature Biotechnology (147 papers in training set; Top 3%): 3.4%
9. Nature (575 papers in training set; Top 7%): 3.4%
10. Bioinformatics (1061 papers in training set; Top 6%): 3.4%
11. PLOS Genetics (756 papers in training set; Top 5%): 3.1%
12. Cell Genomics (162 papers in training set; Top 3%): 2.0%
13. Genome Research (409 papers in training set; Top 2%): 1.8%
14. Nucleic Acids Research (1128 papers in training set; Top 10%): 1.8%
15. Genetics (225 papers in training set; Top 3%): 1.6%
16. Molecular Biology and Evolution (488 papers in training set; Top 3%): 1.6%
17. Cell Reports (1338 papers in training set; Top 25%): 1.6%
18. GENETICS (189 papers in training set; Top 0.8%): 1.4%
19. Frontiers in Genetics (197 papers in training set; Top 6%): 1.4%
20. Communications Biology (886 papers in training set; Top 13%): 1.3%
21. eLife (5422 papers in training set; Top 51%): 1.1%
22. PLOS Computational Biology (1633 papers in training set; Top 23%): 0.9%
23. iScience (1063 papers in training set; Top 31%): 0.8%
24. Science Translational Medicine (111 papers in training set; Top 6%): 0.8%
25. Nature Computational Science (50 papers in training set; Top 2%): 0.8%
26. Science Advances (1098 papers in training set; Top 31%): 0.7%
27. BMC Bioinformatics (383 papers in training set; Top 7%): 0.7%
28. Scientific Reports (3102 papers in training set; Top 77%): 0.7%
29. European Journal of Human Genetics (49 papers in training set; Top 2%): 0.6%
30. Briefings in Bioinformatics (326 papers in training set; Top 8%): 0.6%