mehari: high-performance, strict HGVS-first variant effect prediction

Hartmann, T. F.; Zhao, M. X.; Beule, D.; Holtgrewe, M.

2026-05-14 bioinformatics
10.64898/2026.05.12.724271 bioRxiv
Variant annotation requires the precise and consistent computation of Sequence Ontology (SO) terms and Human Genome Variation Society (HGVS) nomenclature. To ensure robust synchronization between these two key facets, we present mehari, a high-performance variant effect predictor implemented in Rust that employs a strict "HGVS-first" approach. By deterministically projecting variants to transcripts before evaluating functional consequences, mehari structurally aligns HGVS notation and SO terms. Benchmarking on ClinVar demonstrates that mehari achieves exceptional processing speeds and high concordance with established tools like Ensembl VEP, while also providing refined handling for complex biological edge cases such as selenoprotein recoding.
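The "HGVS-first" ordering described in the abstract can be illustrated with a toy sketch. This is not mehari's implementation or API: the `Transcript` and `CdsSubstitution` types, the `project` and `consequence` functions, and the accession `TX_TOY.1` are all hypothetical, and the sketch assumes a single-exon, plus-strand transcript, single-nucleotide substitutions only, and the standard genetic code (so no selenoprotein recoding). The point is only the order of operations: the variant is first projected to a transcript-level description, and the SO term is then derived from that description rather than from the genomic coordinates.

```python
from dataclasses import dataclass

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


@dataclass(frozen=True)
class Transcript:
    """Hypothetical toy transcript: one exon, plus strand, CDS only."""
    accession: str
    cds_start: int  # 1-based genomic position of the first CDS base
    cds_seq: str    # coding sequence including the stop codon


@dataclass(frozen=True)
class CdsSubstitution:
    """Transcript-level description of a single-nucleotide substitution."""
    accession: str
    pos: int  # 1-based c. position
    ref: str
    alt: str

    def hgvs(self) -> str:
        return f"{self.accession}:c.{self.pos}{self.ref}>{self.alt}"


def project(tx: Transcript, g_pos: int, ref: str, alt: str) -> CdsSubstitution:
    """Step 1: project a genomic SNV onto the transcript's coding sequence."""
    c_pos = g_pos - tx.cds_start + 1
    if not 1 <= c_pos <= len(tx.cds_seq):
        raise ValueError("variant lies outside the toy CDS")
    if tx.cds_seq[c_pos - 1] != ref:
        raise ValueError("reference allele does not match the transcript")
    return CdsSubstitution(tx.accession, c_pos, ref, alt)


def consequence(tx: Transcript, var: CdsSubstitution) -> str:
    """Step 2: derive the SO term from the projected description only."""
    codon_start = (var.pos - 1) // 3 * 3
    offset = (var.pos - 1) % 3
    ref_codon = tx.cds_seq[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + var.alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODONS[ref_codon], CODONS[alt_codon]
    if ref_aa == alt_aa:
        return "stop_retained_variant" if ref_aa == "*" else "synonymous_variant"
    if codon_start == 0:
        return "start_lost"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense_variant"


if __name__ == "__main__":
    tx = Transcript("TX_TOY.1", cds_start=1000, cds_seq="ATGGCTTGGTAA")
    for g_pos, ref, alt in [(1004, "C", "T"), (1005, "T", "C"), (1008, "G", "A")]:
        var = project(tx, g_pos, ref, alt)
        print(var.hgvs(), consequence(tx, var))
```

Because `consequence` sees only the projected `CdsSubstitution`, the HGVS string and the SO term in this toy are computed from the same object and cannot disagree about position or alleles. A real predictor additionally has to handle introns, strand, UTRs, indels, normalization, and cases such as selenoprotein recoding, none of which are modeled here.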

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

Rank  Journal                                  Papers in training set  Percentile  Probability
1     Bioinformatics                           1061                    Top 2%      14.5%
2     Nature Communications                    4913                    Top 11%     14.2%
3     Nucleic Acids Research                   1128                    Top 2%      10.0%
4     Genome Medicine                          154                     Top 0.5%    10.0%
5     Nature Biotechnology                     147                     Top 1%      6.7%
----- 50% of probability mass above -----
6     Genome Biology                           555                     Top 0.9%    6.7%
7     Nature Methods                           336                     Top 2%      6.3%
8     BMC Bioinformatics                       383                     Top 3%      2.6%
9     The American Journal of Human Genetics   206                     Top 2%      2.3%
10    PLOS ONE                                 4510                    Top 48%     2.1%
11    Genome Research                          409                     Top 2%      2.0%
12    Bioinformatics Advances                  184                     Top 2%      1.9%
13    Cell Systems                             167                     Top 7%      1.7%
14    PLOS Genetics                            756                     Top 10%     1.3%
15    Scientific Reports                       3102                    Top 64%     1.3%
16    Communications Biology                   886                     Top 15%     1.2%
17    NAR Genomics and Bioinformatics          214                     Top 3%      1.1%
18    Science                                  429                     Top 19%     0.9%
19    Nature Genetics                          240                     Top 7%      0.9%
20    PLOS Computational Biology               1633                    Top 22%     0.9%
21    Cell Genomics                            162                     Top 6%      0.8%
22    Nature                                   575                     Top 16%     0.7%
23    Briefings in Bioinformatics              326                     Top 7%      0.7%
24    Nature Computational Science             50                      Top 2%      0.7%