
Global whole-genome phylogenomics of Nakaseomyces glabratus reveals admixture and refines sequence type-based classification

Adamu Bukari, A.-R.; Sidney, B.; Gerstein, A. C.

2026-04-04 evolutionary biology
10.64898/2026.04.03.716392 bioRxiv

Nakaseomyces glabratus is a globally distributed opportunistic fungal pathogen. An ongoing discussion in studies of N. glabratus population structure has been whether genetic clusters are best defined using multilocus sequence typing (MLST) or short-read whole-genome sequencing (WGS). To assess the concordance between MLST- and WGS-based phylogenies, we analyzed a dataset of 548 N. glabratus WGS sequences from 12 countries. Clusters identified from WGS largely recapitulated the MLST-defined sequence type (ST) groups: fourteen WGS clusters were composed of a single MLST ST, and the remaining clusters contained STs with very closely related MLST profiles. We thus propose a pragmatic naming convention, consistent with the system used in other microbial species, which specifies WGS cluster labels based on the primary ST. From the large WGS isolate dataset, we determined the prevalence of admixture and genomic variants. Interestingly, seven of the nine singleton isolates were admixed, in addition to 58 isolates from six different clusters. Aneuploidy was detected in 4% of isolates, most commonly in chrE, which contains ERG11, the gene encoding the enzyme targeted by azole antifungals. Aneuploid chromosomes did not exhibit elevated heterozygosity relative to the sequencing error rate, consistent with instability of extra chromosome copies. Copy number variants (CNVs) were found in 3% of the isolates; some of the CNVs co-occurred with aneuploidies, and they were primarily identified on chrD, chrE, chrI, and chrM. Our findings demonstrate that deep splits between clusters preserve the utility of MLST ST designations for clade-level classification, yet underscore the value of WGS for high-resolution genomic analyses.

Article Summary

There is an ongoing debate in studies of Nakaseomyces glabratus about whether traditional MLST analysis is sufficient to determine population structure, or whether the precision of whole-genome sequencing (WGS) is necessary. We analyzed WGS data from 548 isolates from around the world and found very strong agreement between the two methods. We propose a hybrid naming system, where cluster names are based on the dominant MLST group. We used the WGS data to show that admixed isolates, and those with extra chromosomes or CNVs, are rare (<7% of isolates in each class) and are distributed throughout the phylogeny.
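The proposed naming convention can be illustrated with a minimal sketch. It assumes that the "primary" ST of a cluster is simply the most frequent ST among its isolates, and that ties are broken by the lowest ST number; neither rule is stated in the abstract. The function name `name_clusters`, the cluster ids, the ST numbers, and the `ST<n>` label format are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

def name_clusters(isolates):
    """Map each WGS cluster id to a label built from its most common MLST ST.

    `isolates` is an iterable of (cluster_id, st_number) pairs.
    Assumption: "primary ST" means the most frequent ST in the cluster.
    """
    sts_by_cluster = defaultdict(list)
    for cluster_id, st in isolates:
        sts_by_cluster[cluster_id].append(st)

    labels = {}
    for cluster_id, sts in sts_by_cluster.items():
        counts = Counter(sts)
        # Highest count wins; ties go to the lowest ST number (an assumption).
        primary_st, _ = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        labels[cluster_id] = f"ST{primary_st}"
    return labels

# Hypothetical toy data, not taken from the study.
toy = [("c1", 3), ("c1", 3), ("c1", 10),
       ("c2", 7), ("c2", 7),
       ("c3", 15), ("c3", 22)]
labels = name_clusters(toy)
print(labels)
```

In this toy input, cluster `c2` contains a single ST, mirroring the single-ST clusters described above, while `c1` and `c3` mix STs and take their label from the dominant one.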

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. mSphere: 281 papers in training set; Top 0.1%; 14.3%
2. mBio: 750 papers in training set; Top 1%; 12.3%
3. Microbial Genomics: 204 papers in training set; Top 0.2%; 10.0%
4. Molecular Ecology: 304 papers in training set; Top 1%; 4.8%
5. G3: 33 papers in training set; Top 0.1%; 4.8%
6. Frontiers in Fungal Biology: 10 papers in training set; Top 0.1%; 4.3%

50% of probability mass above

7. Genome Biology and Evolution: 280 papers in training set; Top 0.4%; 3.9%
8. Evolutionary Applications: 91 papers in training set; Top 0.2%; 3.9%
9. Fungal Genetics and Biology: 14 papers in training set; Top 0.1%; 2.4%
10. Frontiers in Microbiology: 375 papers in training set; Top 5%; 1.9%
11. Microbiome: 139 papers in training set; Top 2%; 1.8%
12. mSystems: 361 papers in training set; Top 4%; 1.8%
13. Scientific Reports: 3102 papers in training set; Top 59%; 1.7%
14. BMC Genomics: 328 papers in training set; Top 2%; 1.7%
15. PLOS ONE: 4510 papers in training set; Top 54%; 1.7%
16. PLOS Biology: 408 papers in training set; Top 11%; 1.5%
17. Environmental Microbiology: 119 papers in training set; Top 2%; 1.5%
18. Nature Communications: 4913 papers in training set; Top 54%; 1.5%
19. BMC Biology: 248 papers in training set; Top 2%; 1.2%
20. FEMS Microbiology Ecology: 47 papers in training set; Top 0.3%; 0.9%
21. eLife: 5422 papers in training set; Top 52%; 0.9%
22. Communications Biology: 886 papers in training set; Top 17%; 0.9%
23. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 41%; 0.9%
24. Molecular Biology and Evolution: 488 papers in training set; Top 4%; 0.9%
25. PLOS Genetics: 756 papers in training set; Top 14%; 0.8%
26. Applied and Environmental Microbiology: 301 papers in training set; Top 3%; 0.8%
27. G3 Genes|Genomes|Genetics: 351 papers in training set; Top 2%; 0.8%
28. Molecular Ecology Resources: 161 papers in training set; Top 1%; 0.8%
29. Phytopathology®: 28 papers in training set; Top 0.1%; 0.7%
30. The ISME Journal: 194 papers in training set; Top 3%; 0.7%
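
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum. The sketch below uses the rounded percentages shown in the list for the ten highest-ranked journals, so the totals are approximate; the helper name `journals_to_reach` is hypothetical.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-10, copied from the list above.
# These are rounded display values, so sums are approximate.
probs = [14.3, 12.3, 10.0, 4.8, 4.8, 4.3, 3.9, 3.9, 2.4, 1.9]

def journals_to_reach(probs, threshold):
    """Return the smallest rank whose cumulative probability reaches threshold."""
    for rank, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return rank
    return None  # threshold never reached with the values given

top_n = journals_to_reach(probs, 50.0)
print(top_n)
```

With these values the running total first passes 50% at rank 6 (about 50.5%), which matches the marker placed after Frontiers in Fungal Biology.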