Continuous Temporal Difference Learning as a Unifying Theory of Dopamine Function

Garud, S.; Morris, L.

2026-04-08 neuroscience

10.64898/2026.04.07.716149 bioRxiv


Dopamine neurons are thought to signal reward prediction errors phasically and the opportunity cost of time tonically, while also ramping during goal approach and coupling with movement. These are often treated as distinct modes of dopamine function, each requiring its own computational explanation. Here we show that all of them can be unified by casting temporal difference learning in continuous time and assuming that the brain computes value changes through a fast model-based process while maintaining a slower model-free cache. Together, these two ingredients explain phasic responses, tonic modulation between reward contexts, navigation ramps, speed scaling, and the fading of ramps with learning, without invoking separate mechanisms. We confirm these predictions across two independent datasets of dopamine recordings in rodents spanning freely moving and head-fixed paradigms. Continuous temporal difference learning may thus provide a unified theory of dopamine function.
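
For readers unfamiliar with the continuous-time setting, the sketch below illustrates one common textbook form of the continuous temporal difference error, delta(t) = r(t) - V(t)/tau + dV/dt. This is an assumption chosen for illustration and is not taken from the preprint; the function name, the time constant `tau`, the grid spacing `dt`, the linear value profile, and the track length are all hypothetical. The sketch shows two things: the error vanishes when the value estimate is exactly the discounted future reward, and when the value estimate depends on position, the derivative term grows with movement speed.

```python
import numpy as np

def continuous_td_error(reward, value, dt, tau):
    """Assumed form: delta(t) = r(t) - V(t)/tau + dV/dt on a uniform time grid."""
    dV_dt = np.gradient(value, dt)  # finite-difference estimate of dV/dt
    return reward - value / tau + dV_dt

dt, tau, T = 0.01, 1.0, 2.0          # hypothetical grid spacing, time constant, reward time
t = np.arange(0.0, T, dt)            # time before the reward arrives
reward = np.zeros_like(t)            # no reward is delivered before time T

# Case 1: value equals the exactly discounted unit reward at time T.
# The error is then zero up to finite-difference accuracy.
v_exact = np.exp(-(T - t) / tau)
delta_exact = continuous_td_error(reward, v_exact, dt, tau)

# Case 2: a hypothetical value estimate that is linear in position,
# V(x) = x / track_length, read out while moving at constant speed.
track_length = 10.0

def delta_at_speed(speed):
    position = speed * t
    value = position / track_length
    return continuous_td_error(reward, value, dt, tau)

delta_slow = delta_at_speed(1.0)
delta_fast = delta_at_speed(2.0)

print("max |delta| with exact value:", np.abs(delta_exact[1:-1]).max())
print("initial delta, slow vs fast:", delta_slow[0], delta_fast[0])
```

In the second case the value is zero at the start of the track, so the initial error equals the derivative term alone, and doubling the speed doubles it. Whether the preprint's model produces speed scaling in this way is not stated in the abstract.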

