Three Dimensions of Compounding Neglect: How Biobanks, Clinical Trials, and Scientific Literature Systematically Exclude the Global South

Corpas, M.; Freidin, M. B.; Valdivia-Silva, J.; Baker, S.; Fatumo, S.; Guio, H.

2026-02-11 public and global health
10.64898/2026.02.10.26346004 medRxiv

Global health inequities are widely documented in outcomes. However, the research systems that generate knowledge, trials, and discovery have rarely been evaluated as an integrated structure. We introduce the Health Equity Informative Metrics (HEIM) framework, a three-dimensional audit of discovery (biobank output), translation (clinical trial activity), and knowledge (semantic organisation of the scientific literature). Analysing 70 international biobanks, 563,725 registered clinical trials, 13.1 million PubMed abstracts, and 175 Global Burden of Disease categories, we demonstrate that exclusion compounds systematically for diseases that primarily burden the Global South. No WHO-classified neglected tropical disease has generated a publication from these 70 biobanks. Clinical trial sites concentrate 2.5-fold in high-income countries relative to disease burden. Diseases disproportionately affecting low- and middle-income regions are 44% more semantically isolated from mainstream biomedical research than other conditions (P < 0.0001, Cohen's d = 1.80), limiting cross-disciplinary integration. Nine of the ten most neglected diseases across all dimensions disproportionately affect the Global South, and these disparities show no improvement over 26 years. By contrast, the trajectory of HIV/AIDS demonstrates that sustained, coordinated investment can reverse semantic isolation and integrate a once-marginalised disease into mainstream biomedical networks. HEIM reframes research inequity as a measurable, multi-stage enterprise and establishes a framework for health data accountability.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1 | Nature Medicine | 117 papers in training set | Top 0.1% | 22.1%
2 | eLife | 5422 papers in training set | Top 4% | 12.4%
3 | Nature Communications | 4913 papers in training set | Top 19% | 9.9%
4 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 7% | 9.0%
50% of probability mass above
5 | Nature Human Behaviour | 85 papers in training set | Top 0.6% | 4.8%
6 | Cell | 370 papers in training set | Top 5% | 3.9%
7 | Nature Genetics | 240 papers in training set | Top 3% | 3.2%
8 | Science | 429 papers in training set | Top 12% | 2.3%
9 | Nature Microbiology | 133 papers in training set | Top 2% | 2.0%
10 | Scientific Reports | 3102 papers in training set | Top 56% | 1.7%
11 | The Lancet Infectious Diseases | 71 papers in training set | Top 2% | 1.7%
12 | PLOS Biology | 408 papers in training set | Top 10% | 1.7%
13 | PLOS ONE | 4510 papers in training set | Top 55% | 1.7%
14 | npj Digital Medicine | 97 papers in training set | Top 2% | 1.7%
15 | PLOS Global Public Health | 293 papers in training set | Top 4% | 1.3%
16 | Nature | 575 papers in training set | Top 12% | 1.3%
17 | PLOS Medicine | 98 papers in training set | Top 3% | 1.2%
18 | The Lancet Global Health | 24 papers in training set | Top 0.9% | 1.1%
19 | Science Advances | 1098 papers in training set | Top 25% | 0.9%
20 | Emerging Infectious Diseases | 103 papers in training set | Top 3% | 0.8%
21 | International Journal of Epidemiology | 74 papers in training set | Top 2% | 0.8%
22 | Cell Genomics | 162 papers in training set | Top 6% | 0.8%
23 | Science Translational Medicine | 111 papers in training set | Top 6% | 0.7%
24 | The Lancet Digital Health | 25 papers in training set | Top 1% | 0.7%
25 | Journal of the American Medical Informatics Association | 61 papers in training set | Top 2% | 0.7%
26 | Global Change Biology | 69 papers in training set | Top 2% | 0.7%
27 | Philosophical Transactions of the Royal Society B | 51 papers in training set | Top 7% | 0.6%
28 | Cell Host & Microbe | 113 papers in training set | Top 6% | 0.6%