HP2NET: Empowering Efficient Phylogenetic Network Analysis through High-Performance Computing
Terra, R.; Carvalho, D.; Machado, D. J.; Osthoff, C.; Ocana, K.
Advances in High-Performance Computing (HPC) have enabled increasingly complex genomic analyses, including those in phylogenomics. These analyses contribute to understanding the evolution of viruses and pathogens, improving our knowledge of disease transmission, and supporting targeted public health strategies. However, as the number of tools and processing steps grows, executing these analyses manually, step by step, becomes error-prone and inefficient. To address this challenge, we present HP2NET, a robust framework for reproducible, efficient, and scalable phylogenetic network analysis. HP2NET integrates five workflows based on state-of-the-art tools such as PhyloNetworks and PhyloNet, allowing multiple datasets and workflows to be analyzed in a single execution. The framework includes features such as task packaging and data reuse to improve performance and resource utilization in HPC environments. We perform a comprehensive performance evaluation of the software used within HP2NET, identifying bottlenecks and analyzing gains from parallel processing. In our experimental environment, data reuse reduced execution time by up to 15.35% for a small dataset, while parallel execution of the five pipelines reduced total runtime by up to 90.96% compared with sequential runs. Finally, we validate HP2NET in a real-world case study by analyzing Dengue virus genomes, demonstrating its applicability to large-scale phylogenetic analyses.
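To put the reported percentages in more familiar terms, a runtime reduction can be converted into a speedup factor. The short sketch below is illustrative arithmetic only and is not part of HP2NET; the function name `speedup_from_reduction` is hypothetical, and the two input values are the "up to" figures quoted in the abstract.

```python
def speedup_from_reduction(reduction_percent: float) -> float:
    """Convert a percentage runtime reduction into a speedup factor.

    A reduction of r% means the new runtime is (1 - r/100) of the old one,
    so the speedup is old / new = 1 / (1 - r/100).
    """
    if not 0 <= reduction_percent < 100:
        raise ValueError("reduction must be in the range [0, 100)")
    remaining_fraction = 1.0 - reduction_percent / 100.0
    return 1.0 / remaining_fraction


# Upper-bound figures quoted in the abstract (labels are descriptive only).
reported = [("data reuse", 15.35), ("parallel pipelines", 90.96)]

for label, reduction in reported:
    factor = speedup_from_reduction(reduction)
    print(f"{label}: {reduction}% less time is about a {factor:.2f}x speedup")
```

Read this way, the data-reuse gain corresponds to a speedup of roughly 1.18x, and running the five pipelines in parallel corresponds to roughly 11x over sequential execution, both as upper bounds in the authors' experimental environment.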