
Calibration of in-frame indel variant effect predictors for clinical variant classification

Abderrazzaq, H.; Singh, M.; Babb, L.; Bergquist, T.; Brenner, S. E.; Pejaver, V.; O'Donnell-Luria, A.; Radivojac, P.; ClinGen Computational Working Group; ClinGen Variant Classification Working Group

2026-04-18 bioinformatics
10.64898/2026.04.15.718599 bioRxiv

Insertions and deletions (indels) represent a substantial source of genetic variation in humans and are associated with a diverse array of functional consequences. Despite their prevalence and clinical importance, indels, particularly short in-frame indels, remain critically understudied compared to single nucleotide variants and are challenging to interpret clinically. While many computational predictors for missense variants have been rigorously evaluated and calibrated for clinical use, the clinical utility of tools for in-frame indels remains uncertain. To address this gap, we have calibrated in-frame indel prediction tools for clinical variant classification. We constructed a high-confidence dataset of in-frame indel variants (≤ 50 bp) from clinical and population databases and estimated the prior probability of pathogenicity for a rare in-frame indel observed in a disease-associated gene, as well as for rare in-frame insertions and deletions separately. Using a previously developed statistical framework based on local posterior probabilities, we then established score thresholds for eight computational tools, corresponding to distinct evidence levels for pathogenic and benign classification according to ACMG/AMP guidelines. All in-frame indel predictors evaluated here reached multiple evidence levels of pathogenicity and/or benignity, demonstrating measurable clinical value. However, these models consistently exhibited lower performance than missense predictors, highlighting the need for improved computational approaches for indel classification.
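
To make the link between a prior probability of pathogenicity and ACMG/AMP evidence levels concrete, the following is a minimal sketch of how posterior-probability cutoffs can be derived under the general Bayesian formulation of the ACMG/AMP guidelines. It is not the authors' code. The function names are hypothetical, and the values `prior=0.10` and `c=350` are illustrative assumptions taken from that general framework, not the priors estimated in this study. A predictor's local posterior probability at a given score would then be compared against these cutoffs.

```python
# Illustrative sketch only: prior and c are assumed values, not this study's estimates.

# Exponents that scale the "very strong" odds c down to weaker evidence levels.
EXPONENTS = {
    "supporting": 1 / 8,
    "moderate": 1 / 4,
    "strong": 1 / 2,
    "very_strong": 1.0,
}


def posterior_from_lr(lr: float, prior: float) -> float:
    """Convert a positive likelihood ratio and a prior into a posterior probability."""
    return lr * prior / ((lr - 1.0) * prior + 1.0)


def pathogenic_posterior_thresholds(prior: float, c: float) -> dict:
    """Posterior probability a variant must reach for each pathogenic evidence level."""
    return {
        level: posterior_from_lr(c ** exponent, prior)
        for level, exponent in EXPONENTS.items()
    }


def evidence_level(local_posterior: float, thresholds: dict):
    """Return the strongest evidence level whose cutoff is met, or None if none is."""
    for level in ("very_strong", "strong", "moderate", "supporting"):
        if local_posterior >= thresholds[level]:
            return level
    return None


if __name__ == "__main__":
    cutoffs = pathogenic_posterior_thresholds(prior=0.10, c=350.0)
    for level, value in cutoffs.items():
        print(f"{level}: posterior >= {value:.3f}")
    print(evidence_level(0.70, cutoffs))
```

Because the cutoffs depend on the prior, a different prior for insertions, deletions, or in-frame indels overall would shift every threshold, which is why the prior is estimated before score thresholds are set. Benign evidence levels can be handled analogously using the reciprocal likelihood ratios.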

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

1 | Genome Medicine | 154 papers in training set | Top 0.1% | 33.4%
2 | The American Journal of Human Genetics | 206 papers in training set | Top 0.1% | 22.8%
50% of probability mass above
3 | Nature Communications | 4913 papers in training set | Top 35% | 4.4%
4 | PLOS Computational Biology | 1633 papers in training set | Top 9% | 3.6%
5 | PLOS Genetics | 756 papers in training set | Top 4% | 3.6%
6 | Nucleic Acids Research | 1128 papers in training set | Top 7% | 2.6%
7 | Human Genetics | 25 papers in training set | Top 0.1% | 2.4%
8 | Genome Biology | 555 papers in training set | Top 3% | 2.1%
9 | Scientific Reports | 3102 papers in training set | Top 53% | 1.9%
10 | Cell Genomics | 162 papers in training set | Top 4% | 1.3%
11 | European Journal of Human Genetics | 49 papers in training set | Top 0.9% | 1.1%
12 | PLOS ONE | 4510 papers in training set | Top 62% | 1.0%
13 | NAR Genomics and Bioinformatics | 214 papers in training set | Top 3% | 1.0%
14 | Frontiers in Genetics | 197 papers in training set | Top 7% | 1.0%
15 | Bioinformatics | 1061 papers in training set | Top 9% | 0.9%
16 | Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 5% | 0.9%
17 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 7% | 0.9%
18 | eLife | 5422 papers in training set | Top 55% | 0.8%
19 | BMC Bioinformatics | 383 papers in training set | Top 7% | 0.8%
20 | Briefings in Bioinformatics | 326 papers in training set | Top 7% | 0.7%
21 | Communications Biology | 886 papers in training set | Top 28% | 0.7%
22 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 49% | 0.5%
23 | Cell Reports Medicine | 140 papers in training set | Top 10% | 0.5%
24 | Human Genomics | 21 papers in training set | Top 0.6% | 0.5%
25 | BMC Genomics | 328 papers in training set | Top 8% | 0.5%
26 | Genetic Epidemiology | 46 papers in training set | Top 1% | 0.5%
27 | npj Genomic Medicine | 33 papers in training set | Top 1% | 0.5%
28 | Nature Genetics | 240 papers in training set | Top 9% | 0.5%