Using the ancestral recombination graph to study the history of rare variants in founder populations

Mejia-Garcia, A.; Diaz-Papkovich, A.; Sillon, G.; D'Agostino, D.; Chong, A.-L.; Chong, G.; Lo, K. S.; Baret, L.; Hamel, N.; Chapdelaine, V.; Foulkes, W. D.; Taliun, D.; Shapiro, A. J.; Lettre, G.; Gravel, S.

2025-03-13 genetics
10.1101/2025.03.13.643149 bioRxiv
Gene genealogies represent the ancestry of a sample and are often encoded as ancestral recombination graphs (ARGs). It has recently become possible to infer these gene genealogies from sequencing or genotyping data and to use them in evolutionary and statistical genetics. Unfortunately, inferred gene genealogies can be noisy and subject to biases, which makes them harder to apply. This project studies the application of ARG methods to systematically impute and trace the transmission of all disease variants in founder populations, where long shared haplotypes allow for accurate timing of relatedness. We applied these methods to the population of Quebec, where multiple founder events led to an uneven distribution of pathogenic variants across regions and where extensive population pedigrees are available. We validated our approach with nine founder mutations for the Saguenay–Lac-Saint-Jean (SLSJ) region, demonstrating high accuracy for mutation age, imputation, and regional frequency estimation. Moreover, we showed that this subset of high-quality carriers is sufficient to capture previously described associations with pathogenic variants in the LPL gene. This method systematically characterizes rare variants in founder populations, establishing a fast and accurate approach to inform genetic screening programs.

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. European Journal of Human Genetics | 49 papers in training set | Top 0.1% | 22.1%
2. PLOS Genetics | 756 papers in training set | Top 0.2% | 22.1%
3. The American Journal of Human Genetics | 206 papers in training set | Top 0.3% | 14.5%
50% of probability mass above
4. Bioinformatics | 1061 papers in training set | Top 5% | 4.8%
5. Genetic Epidemiology | 46 papers in training set | Top 0.2% | 3.9%
6. Genome Medicine | 154 papers in training set | Top 2% | 3.9%
7. Nature Communications | 4913 papers in training set | Top 41% | 3.5%
8. PLOS Computational Biology | 1633 papers in training set | Top 13% | 2.3%
9. Scientific Reports | 3102 papers in training set | Top 51% | 2.0%
10. BMC Bioinformatics | 383 papers in training set | Top 5% | 1.7%
11. GENETICS | 189 papers in training set | Top 0.8% | 1.5%
12. Frontiers in Genetics | 197 papers in training set | Top 6% | 1.5%
13. Genome Research | 409 papers in training set | Top 3% | 1.5%
14. Human Genetics | 25 papers in training set | Top 0.2% | 1.3%
15. Human Genetics and Genomics Advances | 70 papers in training set | Top 0.6% | 0.9%
16. eLife | 5422 papers in training set | Top 56% | 0.8%
17. Journal of Medical Genetics | 28 papers in training set | Top 0.6% | 0.7%
18. Nature Genetics | 240 papers in training set | Top 8% | 0.7%
19. Forensic Science International: Genetics | 24 papers in training set | Top 0.1% | 0.7%
20. Human Mutation | 29 papers in training set | Top 0.8% | 0.7%
21. Genetics Selection Evolution | 33 papers in training set | Top 0.2% | 0.6%
22. Nucleic Acids Research | 1128 papers in training set | Top 20% | 0.6%