
Hierarchical integration of multimodal clinical data to predict epilepsy surgery outcome

Thomas, J.; Abdallah, C.; Aung, T.; Bosque-Varela, P.; Dolezalova, I.; Parikh, P.; Wadi, L.; Jaber, K.; Kai, Z.; Ho, A.; Moye, M. K.; Minato, E.; Aron, O.; Chabardes, S.; Colnat-Coulbois, S.; Hall, J.; Klimes, P.; Minotti, L.; Dubeau, F.; Southwell, D.; Carlson, D.; Brazdil, M.; Gonzalez-Martinez, J.; Kahane, P.; Maillard, L.; Gotman, J.; Frauscher, B.

2026-05-06 neurology
10.64898/2026.05.05.26352481 medRxiv

Background: Integrating multimodal data into medical artificial intelligence (AI) tools and evaluating whether they outperform human experts remains a critical challenge. Epilepsy surgery offers a unique paradigm for this evaluation, as it provides an expert-independent measure (Engel score) of post-surgical outcome. Currently, evaluation for epilepsy surgery relies on the visual interpretation and human synthesis of multimodal data. While clinical evaluations are individualized and account for complex anatomical variability, integrating these diverse, high-dimensional modalities to generate a probability of surgical success remains challenging. Here, we leverage this objective outcome score to compare a data-driven, phenotype-based model against the current clinical gold standard.

Methods: The evaluation was performed on an epilepsy-type-controlled cohort of 57 patients from six tertiary epilepsy surgery centers who underwent resective/ablative surgery in the mesiotemporal lobe. Multimodal data, namely patient demographics, semiology, invasive electrophysiology monitoring, and neuroimaging, were used. We first estimated how human experts perceive surgery success. Subsequently, we developed a data-driven model integrating these modalities to predict surgery outcomes. The model's performance was compared to the current clinical gold standard (three independent human experts) and to published outcome calculators. Finally, modality-level phenotypes were derived based on the model's predictions.

Results: Predictions by human experts correlated poorly with post-surgical outcomes, and published outcome calculators did not perform better than the experts (DeLong's p = 0.367). Our model incorporating multimodal data achieved an area under the receiver operating characteristic curve (AUROC) of 0.801. It performed significantly better than the best human expert (DeLong's p = 0.043) and achieved a higher AUROC than the best published surgical outcome calculator (0.801 vs. 0.694).

Conclusions: We demonstrate, as a proof of concept, that data-driven multimodal phenotypes can inform personalized surgery planning in epilepsy. Furthermore, we provide a framework for integrating multimodal data and benchmarking medical AI performance against human experts.
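For readers unfamiliar with the AUROC figures quoted in the abstract, the following is a minimal sketch of how the metric can be computed from binary outcomes and predicted scores. It is not the authors' code; the function name `auroc` and the toy labels and scores are hypothetical and chosen only for illustration. It uses the pairwise interpretation of AUROC: the probability that a randomly chosen positive case is scored higher than a randomly chosen negative case, with ties counted as one half.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = favorable outcome, 0 = unfavorable.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.5, 0.2]
toy_auc = auroc(toy_labels, toy_scores)
print(f"toy AUROC = {toy_auc:.3f}")
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. Comparing two AUROCs measured on the same patients, as DeLong's test does in the abstract, additionally requires an estimate of their covariance, which this sketch does not attempt.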

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Epilepsia (49 papers in training set; Top 0.2%; 14.9%)
2. Scientific Reports (3102 papers in training set; Top 12%; 7.3%)
3. BMC Medical Informatics and Decision Making (39 papers in training set; Top 0.4%; 6.9%)
4. Epilepsy Research (12 papers in training set; Top 0.1%; 6.5%)
5. Epilepsy & Behavior (12 papers in training set; Top 0.1%; 4.9%)
6. PLOS ONE (4510 papers in training set; Top 30%; 4.9%)
7. Computers in Biology and Medicine (120 papers in training set; Top 1%; 2.9%)
8. Journal of the American Medical Informatics Association (61 papers in training set; Top 0.9%; 2.9%)

50% of probability mass above

9. Epilepsia Open (14 papers in training set; Top 0.2%; 2.8%)
10. npj Digital Medicine (97 papers in training set; Top 2%; 2.6%)
11. Artificial Intelligence in Medicine (15 papers in training set; Top 0.2%; 2.1%)
12. Neuroinformatics (40 papers in training set; Top 0.4%; 1.8%)
13. iScience (1063 papers in training set; Top 13%; 1.8%)
14. PLOS Computational Biology (1633 papers in training set; Top 16%; 1.7%)
15. Computational and Structural Biotechnology Journal (216 papers in training set; Top 4%; 1.7%)
16. Brain Communications (147 papers in training set; Top 2%; 1.7%)
17. Frontiers in Integrative Neuroscience (12 papers in training set; Top 0.1%; 1.5%)
18. Frontiers in Neuroscience (223 papers in training set; Top 4%; 1.5%)
19. Journal of Neural Engineering (197 papers in training set; Top 1%; 1.3%)
20. Annals of Neurology (57 papers in training set; Top 1%; 1.3%)
21. Journal of Medical Internet Research (85 papers in training set; Top 3%; 1.3%)
22. Clinical Neurophysiology (50 papers in training set; Top 0.4%; 1.3%)
23. Frontiers in Artificial Intelligence (18 papers in training set; Top 0.6%; 0.8%)
24. Cortex (102 papers in training set; Top 0.5%; 0.8%)
25. PLOS Digital Health (91 papers in training set; Top 2%; 0.8%)
26. Frontiers in Neurology (91 papers in training set; Top 5%; 0.7%)
27. BMJ Health & Care Informatics (13 papers in training set; Top 1%; 0.7%)
28. Neurocritical Care (11 papers in training set; Top 0.5%; 0.7%)
29. BMC Neurology (12 papers in training set; Top 1%; 0.7%)
30. Biomedicines (66 papers in training set; Top 4%; 0.7%)
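
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked against the listed percentages with a short sketch like the one below. The helper name `journals_to_reach` is hypothetical, and the listed values are rounded to one decimal place, so the cumulative total is approximate.

```python
from typing import Optional, Sequence, Tuple


def journals_to_reach(
    probs_percent: Sequence[float], threshold: float
) -> Optional[Tuple[int, float]]:
    """Return (count, cumulative %) for the shortest ranked prefix reaching `threshold`.

    Returns None if the listed probabilities never reach the threshold.
    """
    total = 0.0
    for count, p in enumerate(probs_percent, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None


# Predicted probabilities (%) of the ten highest-ranked journals, as listed above.
top_probs = [14.9, 7.3, 6.9, 6.5, 4.9, 4.9, 2.9, 2.9, 2.8, 2.6]

result = journals_to_reach(top_probs, threshold=50.0)
if result is not None:
    count, cumulative = result
    print(f"{count} journals reach {cumulative:.1f}% of the probability mass")
```

With the rounded values listed, the first seven journals sum to about 48.3% and the eighth brings the total to about 51.2%, which is consistent with the marker placed after rank 8.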