Training · arXiv cs.AI · 10 d ago

Deep Q-Learning on Hölder Spaces

The paper analyzes Q-learning for continuous-time stochastic control in a diffusion setting, focusing on the regularity and approximation complexity of the Bellman optimality target. It shows that Bellman updates preserve Lipschitz dependence on the action while improving regularity in the state variable, and it introduces a tensor-product DeepONet architecture suited to this mixed regularity. The resulting approximation bounds and stiffness-complexity trade-off are relevant to practitioners designing algorithms for continuous environments, although the paper stops short of a full convergence theorem for practical implementations.

deep q-learning · reinforcement learning · continuous time
