AFQuery: a bitmap-indexed, capture-aware allele frequency engine for clinical genomics cohorts

Santos-Diaz, G.; Toro-Barrios, N.; Carmona, R.; Uria-Regojo, G.; Jimenez-Arias, R.; Gurriaran, X.; Ramilo, P.; Amigo, J.; Minguez, P.; Dopazo, J.; Lopez-Lopez, D.

2026-05-22 genetic and genomic medicine
10.64898/2026.05.15.26353174 medRxiv

Motivation: Allele frequency (AF) is central to clinical variant classification under ACMG/AMP guidelines. Public reference databases offer broad ancestry coverage, but local ancestries, rare-disease enrichment, and institutional case distributions are often underrepresented, so cohort-derived AF is a valuable complement. Computing accurate AF from institutional cohorts is nonetheless error-prone: even successive versions of the same capture kit cover substantially different target regions, and naive methods inflate the allele number (AN) at positions not shared by all kits, deflating AF and biasing ACMG frequency evidence toward pathogenic categories.

Results: We present AFQuery, a bitmap-indexed AF engine that computes capture-aware, ploidy-aware allele frequencies from pre-indexed Roaring Bitmaps in ~14 ms per point query (~34 ms for 1-Mbp region queries), independently of cohort size up to 50,000 samples. In simulated mixed-technology cohorts, capture-aware AN reduced AF mean absolute error 8- to 13-fold and removed the systematic bias toward pathogenic ACMG categories, yielding 10- to 45-fold fewer spurious pathogenic-evidence calls.

Availability: AFQuery is freely available under the MIT licence at https://github.com/babelomics/afquery.
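
To make the AN inflation concrete, the following is a minimal sketch of how a capture-aware AF could be computed from per-position sample bitmaps. It is not AFQuery's implementation or API: the function names, the use of plain Python integers in place of Roaring Bitmaps, and the diploid-only assumption are all illustrative choices made here.

```python
# Illustrative sketch only (hypothetical names, not AFQuery's API).
# Each bitmap is a Python int; bit i set means sample i is a member.
# Assumes every sample is diploid at the queried position.

def popcount(bits: int) -> int:
    """Number of set bits, i.e. number of samples in the bitmap."""
    return bin(bits).count("1")


def allele_frequency(het: int, hom: int, covered: int,
                     n_samples: int, capture_aware: bool = True):
    """Return (AF, AC, AN) for one position.

    het / hom : samples carrying one / two alternate alleles
    covered   : samples whose capture kit targets this position
    """
    # Naive counting treats every sample as callable at the position.
    all_samples = (1 << n_samples) - 1
    eligible = covered if capture_aware else all_samples

    # Alternate alleles can only be observed in covered samples.
    ac = popcount(het & covered) + 2 * popcount(hom & covered)
    an = 2 * popcount(eligible)
    af = ac / an if an else None
    return af, ac, an


# Toy cohort of 8 samples: samples 0-3 use a kit that misses the
# position, samples 4-7 use a kit that covers it.
covered = 0b11110000
het = 0b00110000      # samples 4 and 5 are heterozygous
hom = 0b00000000

naive = allele_frequency(het, hom, covered, 8, capture_aware=False)
aware = allele_frequency(het, hom, covered, 8, capture_aware=True)
print(naive)  # (0.125, 2, 16)
print(aware)  # (0.25, 2, 8)
```

In this toy cohort the naive AN counts 16 alleles although only 8 could have been observed, so the AF is halved, which is the direction of bias the abstract describes.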

Matching journals

The top 4 journals account for 50% of the predicted probability mass.