
Mechanistic learning to predict and understand minimal residual disease

Marzban, S.; Robertson-Tessi, M.; West, J.

2026-04-21 cancer biology
10.64898/2026.04.16.718968 bioRxiv

Mechanistic modeling has long been used to describe the dynamics of biological systems, especially cancer in response to treatment. Its key advantage lies in the interpretability of the relationships between input parameters and outcomes of interest. In contrast, machine learning techniques offer strong predictive performance, especially for the high-dimensional datasets that are common in oncology. Here, we employ a Mechanistic Learning framework that combines the advantages of both approaches by training machine learning models on mechanistic parameters inferred from clinical patient data. The mechanistic model (a Markov chain model) contains sixteen parameters that describe the rates of cell fate transitions that occur in patients with B-cell precursor acute lymphoblastic leukemia. The machine learning model (a ridge logistic regression model) is trained on these parameters to predict two clinically relevant features: BCR::ABL1 fusion gene status (positive or negative) and minimal residual disease (MRD) status (positive or negative) post-induction chemotherapy. Model training is done iteratively to assess which (and how many) parameters are critical to maintaining high predictive performance. Using machine learning models trained directly on the clinical flow-cytometry data, we find that the stem-like cell state alone is the most predictive feature for both BCR::ABL1-positive and MRD-positive disease, with combination scores (defined as the average of accuracy, balanced accuracy, and area under the curve) of 0.80 and 0.67, respectively. By comparison, mechanistic learning achieves comparable or improved combination scores of 0.81 and 0.71, respectively, using only de-differentiation for BCR::ABL1 and primitive-state persistence together with differentiation-directed exit for MRD. Thus, the mechanistic-learning approach not only preserves predictive performance, but also provides a biological hypothesis for why stemness is predictive of these clinically relevant outcomes.
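
The combination score defined in the abstract (the average of accuracy, balanced accuracy, and area under the curve) can be illustrated with a minimal sketch. The function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions made for illustration; the abstract does not state how the authors thresholded predictions or computed the AUC, so this is not their implementation.

```python
# Illustrative sketch of a "combination score": the mean of accuracy,
# balanced accuracy, and ROC AUC. All names here are hypothetical.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    sensitivity = sum(p == 1 for p in pos) / len(pos)
    specificity = sum(p == 0 for p in neg) / len(neg)
    return (sensitivity + specificity) / 2

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def combination_score(y_true, scores, threshold=0.5):
    """Average of the three metrics; the 0.5 threshold is an assumption."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    return (accuracy(y_true, y_pred)
            + balanced_accuracy(y_true, y_pred)
            + roc_auc(y_true, scores)) / 3

# Toy data (made up): 1 = positive status, 0 = negative status.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2, 0.1]
print(round(combination_score(y_true, scores), 3))
```

Averaging the three metrics rewards models that do well on the minority class as well as overall, which matters when positive cases are rarer than negative ones.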

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology | 1633 papers in training set | Top 4% | 8.2%
2. Scientific Reports | 3102 papers in training set | Top 12% | 7.2%
3. npj Precision Oncology | 48 papers in training set | Top 0.1% | 6.8%
4. Clinical Cancer Research | 58 papers in training set | Top 0.2% | 6.8%
5. Nature Communications | 4913 papers in training set | Top 29% | 6.4%
6. Cancers | 200 papers in training set | Top 1% | 4.8%
7. JCO Clinical Cancer Informatics | 18 papers in training set | Top 0.2% | 4.0%
8. Communications Biology | 886 papers in training set | Top 2% | 3.7%
9. Cancer Research | 116 papers in training set | Top 0.8% | 3.7%
50% of probability mass above
10. PLOS ONE | 4510 papers in training set | Top 40% | 3.6%
11. Cancer Research Communications | 46 papers in training set | Top 0.1% | 3.6%
12. Cell Systems | 167 papers in training set | Top 4% | 3.1%
13. npj Systems Biology and Applications | 99 papers in training set | Top 0.8% | 2.1%
14. Cell Reports | 1338 papers in training set | Top 23% | 1.8%
15. Physical Biology | 43 papers in training set | Top 1% | 1.7%
16. Computers in Biology and Medicine | 120 papers in training set | Top 3% | 1.5%
17. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 35% | 1.5%
18. iScience | 1063 papers in training set | Top 19% | 1.3%
19. Patterns | 70 papers in training set | Top 1% | 1.2%
20. Cell Reports Medicine | 140 papers in training set | Top 6% | 0.9%
21. eLife | 5422 papers in training set | Top 52% | 0.9%
22. npj Digital Medicine | 97 papers in training set | Top 3% | 0.9%
23. Bioinformatics | 1061 papers in training set | Top 9% | 0.9%
24. Briefings in Bioinformatics | 326 papers in training set | Top 6% | 0.8%
25. Genome Medicine | 154 papers in training set | Top 7% | 0.8%
26. Frontiers in Genetics | 197 papers in training set | Top 9% | 0.8%
27. BMC Bioinformatics | 383 papers in training set | Top 7% | 0.8%
28. Cell Reports Methods | 141 papers in training set | Top 5% | 0.7%
29. BMC Cancer | 52 papers in training set | Top 3% | 0.7%
30. Cancer Cell | 38 papers in training set | Top 2% | 0.7%