
HP2NET: Empowering Efficient Phylogenetic Network Analysis through High-Performance Computing

Terra, R.; Carvalho, D.; Machado, D. J.; Osthoff, C.; Ocana, K.

2026-03-08 bioinformatics
10.64898/2026.03.05.709005 bioRxiv

Advances in High-Performance Computing (HPC) have enabled increasingly complex genomic analyses, including those in phylogenomics. These analyses contribute to understanding the evolution of viruses and pathogens, improving our knowledge of disease transmission, and supporting targeted public health strategies. However, as the number of tools and processing steps grows, executing these analyses manually, step by step, becomes error-prone and inefficient. To address this challenge, we present HP2NET, a robust framework for reproducible, efficient, and scalable phylogenetic network analysis. HP2NET integrates five workflows based on state-of-the-art tools such as PhyloNetworks and PhyloNet, allowing the analysis of multiple datasets and workflows in a single execution. The framework includes features such as task packaging and data reuse to improve performance and resource utilization in HPC environments. We perform a comprehensive performance evaluation of the software used within HP2NET, identifying bottlenecks and analyzing gains from parallel processing. In our experimental environment, data reuse reduced execution time by up to 15.35% for a small dataset, while parallel execution of the five pipelines reduced total runtime by up to 90.96% compared to sequential runs. Finally, we validate HP2NET in a real-world case study by analyzing Dengue virus genomes, demonstrating its applicability to large-scale phylogenetic analyses.
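
The percentage reductions quoted in the abstract can be read as speedup factors. The sketch below assumes each figure is defined as (baseline time - improved time) / baseline time, which the abstract does not state explicitly; the function name `speedup_from_reduction` is hypothetical and not part of HP2NET.

```python
def speedup_from_reduction(reduction_pct: float) -> float:
    """Convert a percentage runtime reduction into a speedup factor.

    Assumes reduction = (t_baseline - t_improved) / t_baseline,
    so speedup = t_baseline / t_improved = 1 / (1 - reduction).
    """
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction_pct must be in [0, 100)")
    return 1.0 / (1.0 - reduction_pct / 100.0)


# Figures quoted in the abstract (both are "up to" values).
parallel_speedup = speedup_from_reduction(90.96)  # five pipelines in parallel
reuse_speedup = speedup_from_reduction(15.35)     # data reuse, small dataset

print(f"parallel execution: about {parallel_speedup:.2f}x")
print(f"data reuse: about {reuse_speedup:.2f}x")
```

Under that assumption, a 90.96% reduction corresponds to roughly an 11x speedup, and a 15.35% reduction to roughly 1.18x.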

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

Rank  Journal                                             Papers in training set  Percentile  Probability
1     Bioinformatics                                      1061                    Top 0.9%    26.1%
2     BMC Bioinformatics                                  383                     Top 0.3%    18.8%
3     GigaScience                                         172                     Top 0.2%    6.4%
(50% of probability mass above)
4     SoftwareX                                           15                      Top 0.1%    3.6%
5     PLOS ONE                                            4510                    Top 42%     3.1%
6     Bioinformatics Advances                             184                     Top 2%      2.1%
7     Nature Communications                               4913                    Top 46%     2.1%
8     Scientific Reports                                  3102                    Top 53%     1.9%
9     PLOS Computational Biology                          1633                    Top 14%     1.9%
10    Nucleic Acids Research                              1128                    Top 9%      1.9%
11    Molecular Biology and Evolution                     488                     Top 2%      1.8%
12    Patterns                                            70                      Top 0.8%    1.7%
13    NAR Genomics and Bioinformatics                     214                     Top 2%      1.7%
14    iScience                                            1063                    Top 17%     1.5%
15    Genome Biology                                      555                     Top 5%      1.5%
16    Journal of Chemical Information and Modeling        207                     Top 2%      1.5%
17    Briefings in Bioinformatics                         326                     Top 5%      1.2%
18    Journal of Open Source Software                     22                      Top 0.2%    0.8%
19    Viruses                                             318                     Top 5%      0.8%
20    Nature Computational Science                        50                      Top 2%      0.8%
21    Journal of Proteome Research                        215                     Top 2%      0.8%
22    Genome Research                                     409                     Top 4%      0.8%
23    Communications Biology                              886                     Top 26%     0.7%
24    Nature Methods                                      336                     Top 6%      0.7%
25    eLife                                               5422                    Top 61%     0.6%
26    Computational and Structural Biotechnology Journal  216                     Top 11%     0.6%
27    Genomics, Proteomics & Bioinformatics               171                     Top 7%      0.6%
28    mSphere                                             281                     Top 7%      0.6%
29    PeerJ                                               261                     Top 19%     0.5%
30    Frontiers in Bioinformatics                         45                      Top 2%      0.5%
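
The statement that the top 3 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch using the five highest-ranked entries copied from the list; the helper `journals_to_reach` is a hypothetical name introduced only for illustration.

```python
# Predicted probabilities (percent) for the five highest-ranked journals,
# copied from the list above.
ranked = [
    ("Bioinformatics", 26.1),
    ("BMC Bioinformatics", 18.8),
    ("GigaScience", 6.4),
    ("SoftwareX", 3.6),
    ("PLOS ONE", 3.1),
]


def journals_to_reach(entries, threshold_pct):
    """Return (count, cumulative_pct) once the running total reaches the threshold."""
    total = 0.0
    for count, (_name, pct) in enumerate(entries, start=1):
        total += pct
        if total >= threshold_pct:
            return count, total
    return None  # threshold not reached with the entries given


count, mass = journals_to_reach(ranked, 50.0)
print(count, round(mass, 1))
```

The first two journals sum to 44.9%, below the threshold, and adding the third brings the total to 51.3%, which is consistent with the 50% marker sitting after rank 3.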