Global whole-genome phylogenomics of Nakaseomyces glabratus reveals admixture and refines sequence type-based classification
Adamu Bukari, A.-R.; Sidney, B.; Gerstein, A. C.
Nakaseomyces glabratus is a globally distributed opportunistic fungal pathogen. An ongoing question in studies of N. glabratus population structure is whether genetic clusters are best defined using multilocus sequence typing (MLST) or short-read whole-genome sequencing (WGS). To assess the concordance between MLST- and WGS-based phylogenies, we analyzed a dataset of 548 N. glabratus WGS sequences from 12 countries. Clusters identified from WGS largely recapitulated the MLST-defined sequence type (ST) groups: fourteen WGS clusters were composed of a single MLST ST, and the remaining clusters contained STs with very closely related MLST profiles. We thus propose a pragmatic naming convention, consistent with the system used in other microbial species, which specifies WGS cluster labels based on the primary ST. From the large WGS isolate dataset, we determined the prevalence of admixture and genomic variants. Interestingly, seven of the nine singleton isolates were admixed, in addition to 58 isolates from six different clusters. Aneuploidy was detected in 4% of isolates, most commonly in chrE, which contains ERG11, the gene encoding the enzyme targeted by azole antifungals. Aneuploid chromosomes did not exhibit elevated heterozygosity relative to the sequencing error rate, consistent with instability of extra chromosome copies. Copy number variants (CNVs) were found in 3% of the isolates; some of the CNVs co-occurred with aneuploidies, and were primarily identified on chrD, chrE, chrI, and chrM. Our findings demonstrate that deep splits between clusters preserve the utility of MLST ST designations for clade-level classification, yet underscore the value of WGS for high-resolution genomic analyses.
Article Summary
There is an ongoing debate in studies of Nakaseomyces glabratus about whether traditional MLST analysis is sufficient to determine population structure, or whether the precision of whole-genome sequencing (WGS) is necessary. We analyzed WGS data from 548 isolates from around the world and found very strong agreement between the two methods. We propose a hybrid naming system in which cluster names are based on the dominant MLST group. We used the WGS data to show that admixed isolates and those with extra chromosomes or CNVs are rare (<7% of isolates in each class) and are distributed throughout the phylogeny.