REBEL, Reproducible Environment Builder for Explicit Library resolution

Martelli, E.; Ratto, M. L.; Nuvolari, B.; Arigoni, M.; Tao, J.; Micocci, F. M. A.; Alessandri, L.

2026-04-07 bioinformatics
10.64898/2026.04.04.716498 bioRxiv

Background: Achieving FAIR-compliant computational research in bioinformatics is systematically undermined by two compounding challenges that existing tools leave unresolved: long-term reproducibility and accessibility. Standard package managers re-download dependencies from live repositories at every build, which leaves environments vulnerable to library disappearance and version drift. Pinning a package version does not pin the versions of its transitive dependencies, so builds performed at different points in time can diverge. In addition, packages from repositories such as CRAN, Bioconductor, and PyPI frequently omit critical system-level dependencies from their installation metadata, leaving users to discover manually which underlying library is missing or which version is required. Beyond these technical failures, constructing a truly reproducible environment demands expertise in containerization, which makes reproducibility in practice a privilege rather than a standard.

Findings: We present REBEL (Reproducible Environment Builder for Explicit Library Resolution), a framework that addresses both challenges through three dependency inference heuristics: (i) Deep Inspection of source code, (ii) Fuzzy Matching against a manually curated knowledge base, and (iii) Conservative Dependency Locking. The resolved dependency stack is then archived into a self-contained local store, enabling offline and deterministic rebuilds at any future time. In a comparison against the standard package manager on 1,000 randomly sampled CRAN packages installed in isolated Docker containers, REBEL resolved 149 of the 328 standard installation failures (45.4%). Moreover, through its DockerBuilder component, REBEL generates fully reproducible Docker images from a plain text requirements file, making deterministic environment construction accessible without expertise in containerization.

Conclusions: REBEL provides a practical foundation for FAIR-compliant, long-term reproducible bioinformatics analyses, making deterministic environment construction accessible to researchers regardless of their technical background. REBEL is freely available at https://github.com/Rebel-Project-Core
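
To make the Fuzzy Matching heuristic more concrete, the following is a minimal sketch of how a library hint found in a package could be matched against a curated knowledge base of system packages. It is not REBEL's implementation: the knowledge base entries, the function name `resolve_system_dependency`, and the similarity cutoff are all assumptions chosen for illustration, and the matching uses Python's standard `difflib` rather than whatever similarity measure REBEL applies.

```python
from difflib import get_close_matches

# Hypothetical knowledge base (illustrative entries only, not REBEL's data):
# maps a library hint to the Debian package that provides it.
KNOWLEDGE_BASE = {
    "libxml2": "libxml2-dev",
    "libcurl": "libcurl4-openssl-dev",
    "openssl": "libssl-dev",
    "zlib": "zlib1g-dev",
}


def resolve_system_dependency(hint, cutoff=0.6):
    """Return the system package whose key is most similar to `hint`.

    Returns None when no key reaches the similarity cutoff, so that an
    uncertain match is reported as unresolved instead of guessed.
    """
    matches = get_close_matches(
        hint.lower(), list(KNOWLEDGE_BASE), n=1, cutoff=cutoff
    )
    return KNOWLEDGE_BASE[matches[0]] if matches else None


if __name__ == "__main__":
    for hint in ("libxml2", "LibXML", "qqqq"):
        print(hint, "->", resolve_system_dependency(hint))
```

Returning `None` below the cutoff is a deliberate choice in this sketch: a wrong system package is harder to diagnose than a clearly reported unresolved hint.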

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Bioinformatics (1061 papers in training set; Top 1%): 23.0%
2. PLOS Computational Biology (1633 papers in training set; Top 2%): 14.7%
3. BMC Bioinformatics (383 papers in training set; Top 0.6%): 12.8%
50% of probability mass above
4. GigaScience (172 papers in training set; Top 0.1%): 10.7%
5. Bioinformatics Advances (184 papers in training set; Top 0.4%): 6.5%
6. Nature Communications (4913 papers in training set; Top 34%): 4.4%
7. Journal of the American Medical Informatics Association (61 papers in training set; Top 1.0%): 2.4%
8. PLOS ONE (4510 papers in training set; Top 47%): 2.1%
9. NAR Genomics and Bioinformatics (214 papers in training set; Top 2%): 1.8%
10. Computational and Structural Biotechnology Journal (216 papers in training set; Top 4%): 1.7%
11. PeerJ (261 papers in training set; Top 10%): 1.3%
12. Scientific Reports (3102 papers in training set; Top 66%): 1.3%
13. Briefings in Bioinformatics (326 papers in training set; Top 6%): 0.9%
14. Journal of Open Source Software (22 papers in training set; Top 0.2%): 0.9%
15. Genomics, Proteomics & Bioinformatics (171 papers in training set; Top 5%): 0.8%
16. Cell Systems (167 papers in training set; Top 12%): 0.8%
17. Nucleic Acids Research (1128 papers in training set; Top 17%): 0.8%
18. Patterns (70 papers in training set; Top 2%): 0.8%
19. Proceedings of the National Academy of Sciences (2130 papers in training set; Top 43%): 0.8%
20. eLife (5422 papers in training set; Top 57%): 0.8%
21. Advanced Science (249 papers in training set; Top 21%): 0.7%
22. SoftwareX (15 papers in training set; Top 0.6%): 0.5%
23. BMC Genomics (328 papers in training set; Top 7%): 0.5%