
Ancestry-stratified variant classification in monogenic diabetes genes: annotation coverage and differential curation burden

Dario, P.

2026-04-07 genetic and genomic medicine
10.64898/2026.04.06.26350230 medRxiv

The variant databases ClinVar and gnomAD are the backbone of clinical variant interpretation, but their population composition is skewed toward European ancestry. Whether this skew creates systematic classification disadvantages for non-European patients with monogenic diabetes has not been examined at the database level. ClinVar variant_summary (GRCh38, April 2026; 4,421,188 variants) was cross-referenced with gnomAD v4.0 genome data for 17 monogenic diabetes genes. Annotation coverage and variant classification rates were computed stratified by genetic ancestry group (AFR, AMR, EAS, SAS, MID, NFE, FIN, ASJ). Of 14,691 gnomAD variants across the 17 genes, only 29.7% had any ClinVar classification (range by gene: 12.7% to 61.3%). Among classified variants, non-Finnish European (NFE) variants had the highest variant of uncertain significance (VUS) rate (32.1%) and the lowest benign/likely benign fraction (41.6%), consistent with a large submission volume without functional follow-up. African-ancestry (AFR) variants showed the second-highest VUS rate (29.2%), not statistically distinguishable from NFE after Bonferroni correction, while all other non-European groups had significantly lower rates (all p < 0.001). GCK showed a pattern inversion, with the non-European VUS rate (18.5%) exceeding the European rate (15.0%), consistent with progressive reclassification in European populations that is absent in non-European cohorts. Annotation coverage and VUS divergence were uncorrelated (r = -0.15, p = 0.57). The primary equity problem is a 70% annotation gap combined with a non-European curation deficit, not a simple VUS excess. Ancestry-stratified evaluation of ClinGen Variant Curation Expert Panel (VCEP) criteria performance is warranted across disease domains.
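The two headline metrics in the abstract, annotation coverage and VUS rate per ancestry group, can be illustrated with a minimal sketch. This is not the author's pipeline: the records, the `summarize` and `bonferroni_alpha` helpers, and the one-group-per-variant simplification are all hypothetical, and the abstract does not state how variants were assigned to ancestry groups or how many comparisons were corrected for.

```python
from collections import defaultdict

# Hypothetical toy records: (variant_id, ancestry_group, clinvar_class).
# clinvar_class is None when the gnomAD variant has no ClinVar classification.
# Assumption: each variant is attributed to a single ancestry group.
RECORDS = [
    ("v1", "NFE", "VUS"),
    ("v2", "NFE", "benign"),
    ("v3", "NFE", None),
    ("v4", "AFR", "VUS"),
    ("v5", "AFR", None),
    ("v6", "AFR", None),
    ("v7", "EAS", "benign"),
    ("v8", "EAS", None),
]


def summarize(records):
    """Return {group: (coverage, vus_rate)}.

    coverage = classified variants / all variants seen in the group
    vus_rate = VUS variants / classified variants (None if none classified)
    """
    total = defaultdict(int)
    classified = defaultdict(int)
    vus = defaultdict(int)
    for _variant_id, group, cls in records:
        total[group] += 1
        if cls is not None:
            classified[group] += 1
            if cls == "VUS":
                vus[group] += 1
    summary = {}
    for group in total:
        coverage = classified[group] / total[group]
        vus_rate = vus[group] / classified[group] if classified[group] else None
        summary[group] = (coverage, vus_rate)
    return summary


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


summary = summarize(RECORDS)
for group, (coverage, vus_rate) in sorted(summary.items()):
    print(group, round(coverage, 3), vus_rate)
```

Keeping coverage (denominator: all gnomAD variants) separate from VUS rate (denominator: classified variants only) mirrors the distinction the abstract draws between the annotation gap and the VUS fraction.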

Matching journals

The top 5 journals account for just over 50% of the predicted probability mass (56.2% combined).

1. Genome Medicine; 154 papers in training set; Top 0.1%; 18.8%
2. Genetics in Medicine; 69 papers in training set; Top 0.2%; 14.5%
3. The American Journal of Human Genetics; 206 papers in training set; Top 0.5%; 8.5%
4. Human Mutation; 29 papers in training set; Top 0.1%; 7.2%
5. npj Genomic Medicine; 33 papers in training set; Top 0.1%; 7.2%

50% of probability mass above

6. Diabetologia; 36 papers in training set; Top 0.3%; 4.6%
7. Nature Genetics; 240 papers in training set; Top 3%; 2.9%
8. Nature Communications; 4913 papers in training set; Top 49%; 1.8%
9. Diabetes Care; 12 papers in training set; Top 0.2%; 1.8%
10. Scientific Reports; 3102 papers in training set; Top 58%; 1.7%
11. Genetics in Medicine Open; 10 papers in training set; Top 0.1%; 1.5%
12. Human Genomics; 21 papers in training set; Top 0.2%; 1.3%
13. Human Genetics; 25 papers in training set; Top 0.2%; 1.2%
14. BMC Genomics; 328 papers in training set; Top 3%; 1.2%
15. Neurology Genetics; 14 papers in training set; Top 0.1%; 1.2%
16. Circulation: Genomic and Precision Medicine; 42 papers in training set; Top 0.9%; 1.1%
17. Cell Genomics; 162 papers in training set; Top 5%; 1.1%
18. eBioMedicine; 130 papers in training set; Top 3%; 1.0%
19. Diabetes; 53 papers in training set; Top 0.5%; 1.0%
20. Clinical and Translational Science; 21 papers in training set; Top 0.8%; 0.9%
21. The Journal of Clinical Endocrinology & Metabolism; 35 papers in training set; Top 1%; 0.8%
22. PLOS ONE; 4510 papers in training set; Top 68%; 0.8%
23. PLOS Genetics; 756 papers in training set; Top 15%; 0.8%
24. Genetic Epidemiology; 46 papers in training set; Top 0.8%; 0.8%
25. Frontiers in Genetics; 197 papers in training set; Top 10%; 0.7%
26. Human Molecular Genetics; 130 papers in training set; Top 4%; 0.7%
27. International Journal of Epidemiology; 74 papers in training set; Top 3%; 0.6%
28. European Journal of Human Genetics; 49 papers in training set; Top 2%; 0.5%