Training
Robust $Q$-learning for mean-field control under Wasserstein uncertainty in common noise
The article presents a robust $Q$-learning algorithm for discrete-time mean-field control problems with Wasserstein uncertainty in the common noise. It combines a quantization-and-projection approach with a Wasserstein dual reformulation, and it establishes convergence and finite-time iteration bounds for both synchronous and asynchronous learning. Numerical experiments illustrate the robustness-performance tradeoff and the convergence behavior. The results are relevant to practitioners who must handle model uncertainty in control problems in AI systems.
q-learning, reinforcement-learning, robustness
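To make the idea of a Wasserstein dual reformulation inside a $Q$-learning update concrete, here is a minimal sketch. It is not the article's algorithm: it assumes a finite common-noise support on the real line, an order-1 Wasserstein ball of radius `eps` around a nominal distribution, and an already quantized (tabular) state space. All names (`robust_expectation`, `robust_q_update`, `step`, `lambdas`) are hypothetical. The worst-case expectation is computed with the standard dual $\sup_{\lambda \ge 0} \{-\lambda \varepsilon + \mathbb{E}_{x \sim P}[\min_y (f(y) + \lambda |x - y|)]\}$, with the supremum approximated on a finite grid of $\lambda$ values.

```python
def robust_expectation(values, support, nominal, eps, lambdas):
    """Approximate inf over {Q : W_1(Q, P) <= eps} of E_Q[values] via the dual.

    values[i] is f(support[i]); nominal[i] is P(support[i]).
    The sup over lambda is taken on a finite grid, so the result is a
    lower bound on the exact dual value (assumption of this sketch).
    """
    best = float("-inf")
    for lam in lambdas:
        total = 0.0
        for x, p in zip(support, nominal):
            # Inner minimization: cheapest point y to move mass from x to.
            inner = min(v + lam * abs(x - y) for y, v in zip(support, values))
            total += p * inner
        best = max(best, -lam * eps + total)
    return best


def robust_q_update(Q, s, a, step, support, nominal, eps, lambdas,
                    gamma=0.9, alpha=0.5):
    """One tabular robust Q-learning update for the pair (s, a).

    step(s, a, z) -> (reward, next_state) is a hypothetical simulator that
    takes a common-noise realization z; Q is a list of per-state action lists.
    """
    targets = []
    for z in support:
        reward, s_next = step(s, a, z)
        targets.append(reward + gamma * max(Q[s_next]))
    # Replace the nominal expectation by its worst case over the ball.
    worst = robust_expectation(targets, support, nominal, eps, lambdas)
    Q[s][a] = (1 - alpha) * Q[s][a] + alpha * worst
    return Q[s][a]
```

In this sketch, `eps = 0` recovers the nominal expectation as long as the grid contains a large enough `lam`, and a large `eps` drives the target toward the worst value over the support. This is one way to picture the robustness-performance tradeoff mentioned above.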