
PopMAG: A Nextflow pipeline for population genetics analysis based on Metagenome-Assembled Genomes

Sabogal-Rodriguez, D.; Caro-Quintero, A.

2026-01-22 bioinformatics
10.64898/2026.01.21.700883 bioRxiv

Metagenome-assembled genomes (MAGs) are routinely recovered from metagenomic studies, yet the population genetic information embedded within these datasets remains largely underutilized. Analyzing within-species genetic variation can reveal adaptive evolution, selection pressures, and ecological dynamics that are hidden when MAGs are treated as homogeneous entities. Existing tools address individual analysis steps in isolation, requiring manual integration and creating barriers for researchers without extensive bioinformatics expertise. Here we present PopMAG, a Nextflow pipeline and interactive Shiny application that automates population genetics analysis of MAGs. PopMAG integrates quality control, community profiling, competitive read mapping, functional annotation, and microdiversity estimation into a single reproducible workflow. The pipeline calculates key population genetics metrics, including nucleotide diversity (π), pN/pS ratios, fixation index (FST), Levins index, and SNV counts, with results consolidated into an interactive visualization platform for metadata-driven exploration. We demonstrate PopMAG's utility through analysis of longitudinal cystic fibrosis lung metagenomes, where we detect signatures of antibiotic-driven selection in Pseudomonas aeruginosa efflux pump genes coinciding with treatment intervention. Availability and implementation: PopMAG and corresponding documentation are publicly available at https://github.com/daasabogalro/PopMAG.
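To make two of the metrics named in the abstract concrete, the following is a minimal sketch of per-site nucleotide diversity (π) and the Levins index using their common textbook definitions. The abstract does not state which estimators PopMAG uses, so the formulas, the function names `site_nucleotide_diversity` and `levins_index`, and the input layouts are assumptions made for illustration only.

```python
from typing import Sequence


def site_nucleotide_diversity(allele_counts: Sequence[int]) -> float:
    """Per-site pi: chance that two reads drawn without replacement differ.

    allele_counts holds read counts per allele (e.g. A, C, G, T) at one site.
    This is an assumed, textbook-style estimator, not PopMAG's documented one.
    """
    n = sum(allele_counts)
    if n < 2:
        return 0.0  # fewer than two reads: no pair can be compared
    # Ordered pairs of reads that carry the same allele
    same_pairs = sum(c * (c - 1) for c in allele_counts)
    return 1.0 - same_pairs / (n * (n - 1))


def levins_index(abundances: Sequence[float]) -> float:
    """Levins niche breadth B = 1 / sum(p_j^2) across samples.

    abundances holds one genome's abundance in each sample (assumed layout).
    B ranges from 1 (present in one sample) to the number of samples (even).
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    proportions = [a / total for a in abundances]
    return 1.0 / sum(p * p for p in proportions)


if __name__ == "__main__":
    print(site_nucleotide_diversity([5, 5, 0, 0]))
    print(levins_index([1.0, 1.0, 1.0, 1.0]))
```

Under these assumed definitions, a site where all reads agree has π = 0, and a genome that is evenly abundant across all samples has a Levins index equal to the number of samples.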

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Bioinformatics: 1061 papers in training set; Top 2%; 17.1%
2. Bioinformatics Advances: 184 papers in training set; Top 0.3%; 8.2%
3. BMC Bioinformatics: 383 papers in training set; Top 1%; 6.6%
4. NAR Genomics and Bioinformatics: 214 papers in training set; Top 0.2%; 6.6%
5. Genome Biology: 555 papers in training set; Top 1%; 6.2%
6. Microbiome: 139 papers in training set; Top 0.7%; 4.7%
7. Briefings in Bioinformatics: 326 papers in training set; Top 1%; 4.7%
50% of probability mass above
8. GigaScience: 172 papers in training set; Top 0.3%; 4.2%
9. mSystems: 361 papers in training set; Top 3%; 3.5%
10. Genome Medicine: 154 papers in training set; Top 2%; 3.5%
11. Microbial Genomics: 204 papers in training set; Top 0.7%; 3.5%
12. PLOS Computational Biology: 1633 papers in training set; Top 12%; 2.7%
13. Nature Communications: 4913 papers in training set; Top 45%; 2.5%
14. Genome Research: 409 papers in training set; Top 2%; 2.0%
15. Nucleic Acids Research: 1128 papers in training set; Top 10%; 1.8%
16. PLOS ONE: 4510 papers in training set; Top 53%; 1.7%
17. Cell Reports Methods: 141 papers in training set; Top 2%; 1.7%
18. BMC Genomics: 328 papers in training set; Top 3%; 1.7%
19. Methods in Ecology and Evolution: 160 papers in training set; Top 2%; 1.3%
20. Computational and Structural Biotechnology Journal: 216 papers in training set; Top 7%; 1.1%
21. Molecular Biology and Evolution: 488 papers in training set; Top 4%; 0.9%
22. Scientific Reports: 3102 papers in training set; Top 74%; 0.8%
23. Nature Biotechnology: 147 papers in training set; Top 8%; 0.7%
24. Cell Systems: 167 papers in training set; Top 12%; 0.7%
25. Nature Computational Science: 50 papers in training set; Top 2%; 0.7%