The Multimodal Anonymizer: a fully local multi-agent AI system for medical data deidentification

Hirsch, A.; Ten, F. W.; Krueger, K. S.; Geyer, R.; Roeschl, T.; Groeschel, M.; Rostin, P.; Eils, R.; Spott, M.; Prasser, F.; Meyer, A.; Madrid, J.

2026-06-05 health informatics
10.64898/2026.05.28.26353952 medRxiv

Background: Safe reuse of multimodal hospital data for AI development is limited by the absence of reliable, context-aware deidentification across modalities and longitudinal patient data. Existing approaches are largely modality-specific and can indiscriminately remove clinically important information.

Methods: We developed the Multimodal Anonymizer, a modular, locally deployable multi-agent framework integrating multimodal large language models, task-specific neural networks, and rule-based transformations. We evaluated 16 orchestrator model configurations on a benchmark built from publicly available data and hospital data from our institution. The benchmark comprised 250 MIMIC-IV patients with synthetically injected personally identifiable information (PII), supplemented with head CT, face images, handwriting, audio, German clinical-text datasets, and local data. Primary outcomes were deidentification sensitivity and preservation of clinically important content; secondary analyses examined model characteristics, reproducibility, and performance against leading commercial and open-source solutions.

Results: The best local configuration (with Qwen3-VL-235B-A22B-Thinking as orchestrator) achieved near-complete deidentification across all datasets, with a per-patient sensitivity of 98.80% (95% CI 97.20 to 100) and a per-PII sensitivity of 99.82% (95% CI 99.76 to 99.88). Critical clinical preservation was 99.60% (95% CI 98.80 to 100) per patient, and clinical preservation was 99.61% (95% CI 99.51 to 99.71) per file. All modalities achieved a sensitivity of at least 98.30% (lower bound of the 95% CI). On our local data, the system achieved a deidentification sensitivity of 100% both per patient and per PII, a critical clinical preservation of 100% per patient, and a clinical preservation of 99.97% (95% CI 99.91 to 100) per file. When comparing orchestrators, the leading local models were similar to proprietary models (GPT-5.2) in deidentification sensitivity while showing higher deidentification specificity. The Multimodal Anonymizer outperformed previous tools on most modalities.

Conclusion: Near-complete, utility-preserving deidentification of multimodal clinical data is achievable with a unified, locally deployable multi-agent system, enabling safer large-scale reuse of hospital data for research and AI development.
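
The abstract reports sensitivity at two granularities, per PII and per patient, without defining them. The sketch below shows one plausible reading and why the per-patient figure can be lower than the per-PII figure. It assumes that per-PII sensitivity is the share of all PII items removed, and that a patient counts as deidentified only if every one of their PII items was removed. These definitions, the `PatientResult` type, and the toy counts are illustrative assumptions, not the authors' evaluation code or study data.

```python
from dataclasses import dataclass


@dataclass
class PatientResult:
    pii_total: int    # PII items present in this patient's files (hypothetical)
    pii_removed: int  # PII items the system removed (hypothetical)


def per_pii_sensitivity(results):
    """Share of all PII items, pooled over patients, that were removed."""
    total = sum(r.pii_total for r in results)
    removed = sum(r.pii_removed for r in results)
    return removed / total


def per_patient_sensitivity(results):
    """Share of patients for whom every PII item was removed (assumed definition)."""
    fully_clean = sum(1 for r in results if r.pii_removed == r.pii_total)
    return fully_clean / len(results)


# Toy data: one missed item out of 40 leaves one of four patients exposed.
results = [
    PatientResult(10, 10),
    PatientResult(8, 8),
    PatientResult(12, 11),
    PatientResult(10, 10),
]
per_pii = per_pii_sensitivity(results)          # 39 / 40
per_patient = per_patient_sensitivity(results)  # 3 / 4
```

Under these assumptions a single missed item lowers the per-patient value much more than the per-PII value, which is consistent with the per-patient sensitivity being the stricter of the two reported numbers.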

Matching journals

The top 1 journal accounts for 50% of the predicted probability mass.

1. npj Digital Medicine (97 papers in training set; Top 0.1%): 51.4%
50% of probability mass above
2. Journal of the American Medical Informatics Association (61 papers in training set; Top 0.4%): 6.8%
3. JCO Clinical Cancer Informatics (18 papers in training set; Top 0.2%): 4.8%
4. Scientific Reports (3102 papers in training set; Top 31%): 3.9%
5. The Lancet Digital Health (25 papers in training set; Top 0.2%): 3.0%
6. Nature Communications (4913 papers in training set; Top 43%): 3.0%
7. International Journal of Medical Informatics (25 papers in training set; Top 0.9%): 1.7%
8. Frontiers in Digital Health (20 papers in training set; Top 0.7%): 1.7%
9. Journal of Biomedical Informatics (45 papers in training set; Top 0.8%): 1.7%
10. PLOS Digital Health (91 papers in training set; Top 2%): 1.5%
11. Med (38 papers in training set; Top 0.3%): 1.5%
12. Journal of Medical Internet Research (85 papers in training set; Top 3%): 1.3%
13. PLOS ONE (4510 papers in training set; Top 58%): 1.3%
14. Nature Medicine (117 papers in training set; Top 3%): 1.3%
15. Patterns (70 papers in training set; Top 2%): 1.1%
16. JAMIA Open (37 papers in training set; Top 1%): 0.9%
17. BMC Medical Informatics and Decision Making (39 papers in training set; Top 2%): 0.8%
18. JMIR Medical Informatics (17 papers in training set; Top 1%): 0.8%
19. BMJ Health & Care Informatics (13 papers in training set; Top 0.8%): 0.8%
20. iScience (1063 papers in training set; Top 32%): 0.7%
21. Artificial Intelligence in Medicine (15 papers in training set; Top 0.8%): 0.6%
22. IEEE Journal of Biomedical and Health Informatics (34 papers in training set; Top 2%): 0.6%
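
The statement that the top 1 journal accounts for 50% of the predicted probability mass can be checked with a cumulative sum over the ranked probabilities. The following is a minimal sketch using the first five listed percentages; the function name `journals_to_reach` is hypothetical and this is not the recommender's own code.

```python
from itertools import accumulate


def journals_to_reach(probabilities, threshold):
    """Return how many top-ranked entries are needed for the cumulative
    probability to reach the threshold, or None if it is never reached."""
    ranked = sorted(probabilities, reverse=True)
    for count, cumulative in enumerate(accumulate(ranked), start=1):
        if cumulative >= threshold:
            return count
    return None


# Predicted probabilities (in percent) of the five highest-ranked journals above.
top_probabilities = [51.4, 6.8, 4.8, 3.9, 3.0]
n_for_half = journals_to_reach(top_probabilities, 50.0)
```

Because the first entry alone (51.4%) already exceeds the 50% threshold, `n_for_half` is 1, matching the marker placed after the first journal.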