Training · arXiv cs.AI · 21 h ago

Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning

The paper introduces Bootstrapped Flow Q-Learning (BFQ), a framework for offline reinforcement learning that generates accurate actions in a single step, without auxiliary networks or policy distillation. BFQ takes a divide-and-conquer approach: it learns short-range displacements, which enable a direct noise-to-action mapping and remove the computational overhead of multi-step denoising. Evaluations on the D4RL benchmark indicate that BFQ improves performance while substantially reducing computational cost compared with multi-step diffusion methods.
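The contrast between multi-step denoising and a single noise-to-action jump can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a hand-written linear velocity field whose exact displacement is known in closed form, and every name in it (`velocity`, `displacement`, `bootstrap_target`, `K_RATE`, `TARGET`) is hypothetical. It only shows the general idea that two short displacements compose into one longer displacement, which is the kind of consistency a learned one-step sampler could be trained against.

```python
import numpy as np

# Toy setup (assumption): a linear flow that pulls any point toward TARGET.
K_RATE = 3.0
TARGET = np.array([0.5, -0.25])


def velocity(x):
    """Instantaneous velocity of the toy flow at point x."""
    return -K_RATE * (x - TARGET)


def multi_step_sample(noise, num_steps):
    """Multi-step sampling: integrate the flow with many small Euler steps."""
    x = noise.copy()
    dt = 1.0 / num_steps
    for _ in range(num_steps):
        x = x + dt * velocity(x)  # one velocity evaluation per step
    return x


def displacement(x, d):
    """Exact displacement of the toy flow over a time span d.

    In a learned method this would be a network; here it is closed-form.
    """
    return (TARGET - x) * (1.0 - np.exp(-K_RATE * d))


def bootstrap_target(x, d):
    """Build a span-2d displacement from two consecutive span-d displacements."""
    first = displacement(x, d)
    second = displacement(x + first, d)
    return first + second


def one_step_sample(noise):
    """Single-step sampling: one displacement call covers the whole span."""
    return noise + displacement(noise, 1.0)


rng = np.random.default_rng(0)
noise = rng.standard_normal(2)

slow = multi_step_sample(noise, num_steps=1000)  # 1000 velocity evaluations
fast = one_step_sample(noise)                    # 1 displacement evaluation
print("multi-step:", slow)
print("one-step:  ", fast)
```

In this toy setting the two samplers agree up to Euler discretization error, while the one-step version needs a single function evaluation instead of one per denoising step.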

offline-reinforcement-learning · q-learning · bootstrapped
