Towards Superhuman Imitation Learning for Sequential Head-and-Neck Cancer Treatment Decisions

2025-12-15 health informatics Title + abstract only
View on medRxiv

We propose a simulator-driven imitation learning framework for sequential decision-making in head and neck cancer (HNC) treatment. Our method, Superhuman Policy Gradient Optimization (SPGO), integrates inverse reinforcement learning principles with policy gradient updates to derive three-stage treatment policies directly from recorded physician decisions. It leverages a pre-trained clinical simulator, which combines a variational autoencoder with gradient boosting models, to generate complete, tempora...
