A case report on gendered biases in a Finnish healthcare AI assistant

Luisto, R.; Snell, K.; Vartiainen, V.; Sanmark, E.; Äyrämö, S.

2026-04-14 health informatics

10.64898/2026.04.09.26350383 medRxiv

In this study, we investigate gender bias in a Retrieval-Augmented Generation (RAG)-based AI assistant developed for Finnish wellbeing services counties. We tested the system using 36 clinically relevant queries, each rendered in three gendered variants (male, female, gender-neutral), and evaluated responses using both an LLM-as-a-judge approach and a human expert panel consisting of a physician and a sociologist specializing in ethics. We observed substantial and clinically significant differences across gendered variants, including differential treatment urgency, inappropriate symptom associations, and misidentification of clinical context. Female variants disproportionately framed responses around childcare and reproductive health regardless of clinical relevance, reflecting societal stereotypes rather than medical reasoning. Bias manifested at both the LLM generation stage and the RAG retrieval stage, in several cases causing the model to hallucinate responses entirely. Some bias patterns persisted across repeated runs, while others appeared inconsistently, highlighting the challenge of distinguishing systematic bias from stochastic variation.

Matching journals

● Non-profit ◐ University press ○ Commercial

The top 6 journals account for 50% of the predicted probability mass.

npj Digital Medicine

○ 97 papers in training set

Scientific Reports

○ 3102 papers in training set

Frontiers in Digital Health

○ 20 papers in training set

Philosophical Transactions of the Royal Society B

● 51 papers in training set

Journal of the American Medical Informatics Association

◐ 61 papers in training set

BMJ Health & Care Informatics

● 13 papers in training set

50% of probability mass above

PLOS Digital Health

● 91 papers in training set

● 4510 papers in training set

JMIR Medical Informatics

◐ 17 papers in training set

International Journal of Medical Informatics

○ 25 papers in training set

Journal of Personalized Medicine

○ 28 papers in training set

Computers in Biology and Medicine

○ 120 papers in training set

Journal of Biomedical Informatics

○ 45 papers in training set

IEEE Journal of Biomedical and Health Informatics

● 34 papers in training set

Frontiers in Public Health

○ 140 papers in training set

○ 67 papers in training set

Journal of Medical Internet Research

◐ 85 papers in training set

○ 1063 papers in training set

Nature Medicine

○ 117 papers in training set

Frontiers in Psychiatry

○ 83 papers in training set

○ 16 papers in training set

JCO Clinical Cancer Informatics

● 18 papers in training set

Artificial Intelligence in Medicine

○ 15 papers in training set

BMC Medical Informatics and Decision Making

○ 39 papers in training set

◐ 37 papers in training set

Computer Methods and Programs in Biomedicine

○ 27 papers in training set

The Lancet Digital Health

○ 25 papers in training set

JMIR Formative Research

◐ 32 papers in training set