Rank badge and percentage pairs, in page order: Top 0.3%, 17.7%; #1, 17.7%; Top 0.8%, 14.6%; Top 2%, 6.4%; Top 2%, 6.4%; Top 3%, 5.0%; Top 3%, 5.0%; Top 67%, 3.9%; Top 86%, 3.9%; Top 2%, 2.8%; Top 2%, 1.9%; Top 2%, 1.3%; Top 6%, 1.3%; Top 4%, 1.3%; Top 1%, 1.2%; Top 63%, 0.9%; Top 3%, 0.7%; Top 33%, 0.7%; Top 39%, 0.5%; Top 14%, 0.5%
AlignInsight: A Three-Layer Framework for Detecting Deceptive Alignment and Evaluation Awareness in Healthcare AI Systems
2026-01-21
health informatics
Title + abstract only
Source: medRxiv
Importance: Emerging evidence suggests healthcare AI systems may exhibit deceptive alignment (appearing safe during validation while optimizing for misaligned objectives in deployment) and evaluation awareness (detecting and adapting behavior during audits), undermining regulatory validation frameworks.
Objective: To quantify the performance of multi-layer red-teaming approaches in detecting sophisticated healthcare AI safety failures across 10 vulnerability domains.
Design, Setting, and Participa...
Predicted journal destinations
1. npj Digital Medicine: 85 training papers
2. PLOS Digital Health: 88 training papers
3. Journal of the American Medical Informatics Association: 53 training papers
4. JAMIA Open: 35 training papers
5. Journal of Medical Internet Research: 81 training papers
6. BMC Medical Informatics and Decision Making: 36 training papers
7. Journal of Biomedical Informatics: 37 training papers
8. Scientific Reports: 701 training papers
9. PLOS ONE: 1737 training papers
10. International Journal of Medical Informatics: 25 training papers
11. JMIR Medical Informatics: 16 training papers
12. JMIR Formative Research: 31 training papers
13. Computers in Biology and Medicine: 39 training papers
14. BMC Medical Research Methodology: 41 training papers
15. Frontiers in Digital Health: 18 training papers
16. BMJ Open: 553 training papers
17. Patterns: 15 training papers
18. Frontiers in Public Health: 135 training papers
19. JAMA Network Open: 125 training papers
20. JMIR Public Health and Surveillance: 45 training papers