AI/ML-based prediction of TB treatment failure: A systematic review and meta-analysis

Kamulegeya, R.; Nabatanzi, R.; Semugenze, D.; Mugala, F.; Takuwa, M.; Nasinghe, E.; Musinguzi, D.; Namiiro, S.; Katumba, A.; Ssengooba, W.; Nakatumba-Nabende, J.; Kivunike, F. N.; Kateete, D. P.

2026-04-22 infectious diseases
10.64898/2026.04.16.26350453 medRxiv

Background: Tuberculosis (TB) remains a leading cause of infectious disease mortality worldwide, and treatment failure contributes to ongoing transmission, drug resistance, and poor clinical outcomes. Artificial intelligence and machine learning approaches have attracted growing interest for predicting TB treatment outcomes, but the literature is heterogeneous and lacks a comprehensive synthesis.

Methods: We conducted a systematic review and meta-analysis of studies that developed or validated machine learning models to predict TB treatment failure. We searched PubMed/MEDLINE and Embase from January 2000 to October 2025. Studies were eligible if they developed, validated, or implemented an artificial intelligence or machine learning model for the prediction of TB treatment failure or a closely related poor outcome in patients receiving anti-TB treatment. Risk of bias was assessed using the Prediction model Risk Of Bias Assessment Tool (PROBAST). Random-effects meta-analysis was performed to pool area under the curve values, with subgroup analyses and meta-regression to explore heterogeneity.

Results: Thirty-four studies were included in the systematic review, of which 19 reported area under the curve values suitable for meta-analysis (total participants, 100,790). Studies were published between 2014 and 2025, with 91% published from 2019 onward. Tree-based methods were the most common algorithm family (52.9%), and multimodal models integrating three or more data types were used in 41.2% of studies. The pooled area under the curve was 0.836 (95% confidence interval 0.799-0.868), with substantial heterogeneity (I² = 97.9%). In subgroup analyses, studies including HIV-positive participants showed lower discrimination (pooled area under the curve 0.748) than those excluding them (0.924). Only eight studies (23.5%) performed external validation, and only one study (2.9%) was rated as low risk of bias overall, primarily due to methodological concerns in the analysis domain. Egger's test suggested publication bias (p = 0.024). Major evidence gaps included underrepresentation of high-burden countries, HIV-affected populations, social determinants, pediatric TB, and extrapulmonary disease.

Conclusions: Machine learning models for predicting TB treatment failure show promising discrimination but are not yet ready for routine clinical implementation. Performance varies substantially across populations and settings, and methodological limitations, including inadequate validation, poor calibration assessment, and high risk of bias, limit confidence in current estimates. Future research should prioritize rigorous external validation, calibration assessment, and development in underrepresented populations, particularly HIV-affected and high-burden settings.

Author Summary: TB kills over a million people annually. While curable, treatment failure remains common and drives ongoing transmission and drug resistance. Researchers increasingly use artificial intelligence and machine learning to predict which patients will fail treatment, but it is unclear whether these models are ready for clinical use. We reviewed 34 studies including nearly 1.1 million participants from 22 countries. On average, models correctly distinguished patients who would fail treatment from those who would not 84% of the time, a performance generally considered good. However, this average hid enormous variation. Models developed in populations including HIV-positive people performed substantially worse, suggesting prediction is harder with HIV co-infection. Worryingly, only one study used high-quality methods; 97% had serious flaws in handling missing data, checking calibration, or testing in new populations. Only eight studies validated their models in different settings. To conclude, we found that machine learning is promising in predicting TB treatment failure, but it is not ready for clinical use. Researchers should prioritize validation in high-burden settings, include social determinants, and improve methodological rigor before these tools can help patients.
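The random-effects pooling of area under the curve values named in the Methods can be illustrated with a minimal sketch. The abstract does not state which estimator or transformation the authors used, so this assumes a DerSimonian-Laird estimator applied on the logit scale; the function names and the three study values are hypothetical and are not data from the review.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def pool_auc_random_effects(aucs, ses):
    """Pool AUCs with a DerSimonian-Laird random-effects model (assumed method).

    aucs: per-study AUC values in (0, 1)
    ses:  per-study standard errors of the AUC on the raw scale
    """
    # Work on the logit scale; delta method gives SE(logit AUC) = SE / (AUC * (1 - AUC)).
    y = [logit(a) for a in aucs]
    v = [(s / (a * (1.0 - a))) ** 2 for a, s in zip(aucs, ses)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # Between-study variance (tau^2) and I^2 heterogeneity statistic.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return {
        "auc": inv_logit(mu),
        "ci": (inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se)),
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical inputs for illustration only.
result = pool_auc_random_effects([0.70, 0.85, 0.92], [0.020, 0.015, 0.010])
print(result)
```

Pooling on the logit scale keeps the back-transformed confidence interval inside (0, 1), which is one common reason to prefer it over pooling raw AUC values.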

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | PLOS Global Public Health | 293 | Top 1.0% | 9.8% |
| 2 | PLOS ONE | 4510 | Top 23% | 8.2% |
| 3 | BMJ Global Health | 98 | Top 0.5% | 6.2% |
| 4 | BMC Infectious Diseases | 118 | Top 0.4% | 6.2% |
| 5 | PLOS Digital Health | 91 | Top 0.4% | 6.2% |
| 6 | American Journal of Epidemiology | 57 | Top 0.2% | 6.1% |
| 7 | Clinical Infectious Diseases | 231 | Top 1% | 4.7% |
| 8 | PLOS Medicine | 98 | Top 1% | 3.5% |
| | (50% of probability mass above) | | | |
| 9 | Open Forum Infectious Diseases | 134 | Top 0.6% | 3.0% |
| 10 | Clinical Microbiology and Infection | 60 | Top 0.3% | 2.7% |
| 11 | Scientific Reports | 3102 | Top 46% | 2.5% |
| 12 | The Lancet Microbe | 43 | Top 0.5% | 2.0% |
| 13 | The Lancet Global Health | 24 | Top 0.5% | 2.0% |
| 14 | BMJ Open | 554 | Top 9% | 1.7% |
| 15 | The Journal of Infectious Diseases | 182 | Top 3% | 1.6% |
| 16 | BMC Medical Research Methodology | 43 | Top 0.6% | 1.6% |
| 17 | European Respiratory Journal | 54 | Top 1% | 1.3% |
| 18 | Tropical Medicine and Infectious Disease | 12 | Top 0.2% | 1.3% |
| 19 | International Journal of Infectious Diseases | 126 | Top 2% | 1.3% |
| 20 | Tropical Medicine & International Health | 15 | Top 0.5% | 1.1% |
| 21 | PLOS Neglected Tropical Diseases | 378 | Top 4% | 1.1% |
| 22 | Journal of Clinical Microbiology | 120 | Top 1% | 0.9% |
| 23 | BMC Medicine | 163 | Top 6% | 0.9% |
| 24 | PLOS Computational Biology | 1633 | Top 21% | 0.9% |
| 25 | JAC-Antimicrobial Resistance | 13 | Top 0.4% | 0.8% |
| 26 | Emerging Infectious Diseases | 103 | Top 3% | 0.7% |
| 27 | eLife | 5422 | Top 59% | 0.7% |
| 28 | Epidemiology | 26 | Top 0.6% | 0.7% |
| 29 | JAMA Network Open | 127 | Top 5% | 0.7% |
| 30 | Frontiers in Public Health | 140 | Top 9% | 0.7% |
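
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a cumulative sum over the listed percentages. The sketch below uses only the ten highest values from the list above; the function name `journals_to_reach` is hypothetical, and the listed percentages are rounded, so the total is approximate.

```python
# Predicted probabilities (in percent) for the ten top-ranked journals, as listed above.
top_probs = [9.8, 8.2, 6.2, 6.2, 6.2, 6.1, 4.7, 3.5, 3.0, 2.7]

def journals_to_reach(probs, threshold):
    """Return (rank, cumulative total) at which the running sum first reaches threshold."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None  # threshold not reached within the given list

rank, total = journals_to_reach(top_probs, 50.0)
print(rank, round(total, 1))  # prints: 8 50.9
```

The first seven journals sum to about 47.4%, so the eighth is the one that crosses the 50% mark.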