Global population frequencies of NAT2 star alleles observed in three large biobanks

Sangkuhl, K.; Whirl-Carrillo, M.; Woon, M.; Venkatesh, R.; Keat, K.; Whaley, R.; Ritchie, M. D.; Klein, T. E.

2026-06-11 genetic and genomic medicine
10.64898/2026.06.09.26355281 medRxiv
NAT2 is an important pharmacogene that encodes the N-acetyltransferase 2 enzyme, which is involved in the metabolism of multiple medications; variants in this gene can affect patient response to these medications. CPIC has published a clinical guideline for prescribing hydralazine using NAT2 genotypes. Just prior to the guideline, updated NAT2 star allele numbering and definitions were released, differing somewhat from the historical nomenclature. Clinical pharmacogenomic testing panels often test for the most common star alleles, so knowledge of the most common updated NAT2 star alleles is critical for the implementation of the CPIC NAT2/hydralazine guideline. We first determined NAT2 diplotype frequencies from UK Biobank (UKBB) 200k phased genomes, then analyzed allele, diplotype, and phenotype population frequencies from the All of Us Research Program, Penn Medicine BioBank (PMBB), and UKBB 500k datasets. We found that analyzing NAT2 diplotypes from phased data provides critical information for algorithms designed to predict diplotypes from unphased data. We observed that NAT2*5, *6, and *4 were the most common star alleles, in that order, and that the top 11 most frequent NAT2 star alleles were the same across all biobanks. However, star allele frequencies differed across biogeographical populations. The largest difference led to a higher frequency of NAT2 poor metabolizer phenotypes than of rapid and intermediate metabolizer phenotypes in all global populations except the East Asian (EAS) population, where NAT2 poor metabolizers were in the minority.

Matching journals

The top 10 journals account for 50% of the predicted probability mass.

1 | Genome Medicine | 154 papers in training set | Top 0.4% | 10.2%
2 | Journal of the American Medical Informatics Association | 61 papers in training set | Top 0.4% | 8.5%
3 | The American Journal of Human Genetics | 206 papers in training set | Top 0.5% | 8.5%
4 | Scientific Reports | 3102 papers in training set | Top 23% | 4.9%
5 | Clinical Pharmacology & Therapeutics | 25 papers in training set | Top 0.2% | 3.6%
6 | Nature Communications | 4913 papers in training set | Top 40% | 3.6%
7 | Med | 38 papers in training set | Top 0.1% | 3.6%
8 | Clinical and Translational Science | 21 papers in training set | Top 0.2% | 3.3%
9 | Genetics in Medicine | 69 papers in training set | Top 0.5% | 3.1%
10 | PLOS ONE | 4510 papers in training set | Top 43% | 2.7%
50% of probability mass above
11 | eBioMedicine | 130 papers in training set | Top 1% | 1.8%
12 | eLife | 5422 papers in training set | Top 42% | 1.7%
13 | Translational Psychiatry | 219 papers in training set | Top 3% | 1.7%
14 | Frontiers in Genetics | 197 papers in training set | Top 5% | 1.7%
15 | Frontiers in Pharmacology | 100 papers in training set | Top 2% | 1.5%
16 | BioData Mining | 15 papers in training set | Top 0.4% | 1.5%
17 | Nature Human Behaviour | 85 papers in training set | Top 3% | 1.3%
18 | European Neuropsychopharmacology | 15 papers in training set | Top 0.3% | 1.3%
19 | British Journal of Clinical Pharmacology | 21 papers in training set | Top 0.4% | 1.2%
20 | Communications Medicine | 85 papers in training set | Top 0.5% | 1.2%
21 | Genes | 126 papers in training set | Top 2% | 1.2%
22 | Journal of Translational Medicine | 46 papers in training set | Top 2% | 1.1%
23 | Cell Genomics | 162 papers in training set | Top 5% | 1.0%
24 | GENETICS | 189 papers in training set | Top 1% | 1.0%
25 | Molecular Pharmaceutics | 16 papers in training set | Top 0.4% | 1.0%
26 | Bioinformatics Advances | 184 papers in training set | Top 4% | 0.9%
27 | Science Translational Medicine | 111 papers in training set | Top 5% | 0.9%
28 | JMIRx Med | 31 papers in training set | Top 1% | 0.9%
29 | Frontiers in Molecular Biosciences | 100 papers in training set | Top 4% | 0.8%
30 | npj Digital Medicine | 97 papers in training set | Top 4% | 0.8%