Training
Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning
The paper introduces Bootstrapped Flow Q-Learning (BFQ), a framework for offline reinforcement learning that generates accurate actions in a single step, without auxiliary networks or policy distillation. BFQ uses a divide-and-conquer approach: it learns short-range displacements, which enables a direct noise-to-action mapping and removes the computational overhead of multi-step denoising. On the D4RL benchmark, the paper reports that BFQ improves performance while substantially reducing computational cost compared to multi-step diffusion methods.
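The divide-and-conquer idea can be illustrated with a toy sketch. It assumes a shortcut-style self-consistency rule, in which a displacement over a span of 2d is supervised by chaining two displacements over spans of d; the paper's actual objective may differ. The names `displacement`, `bootstrap_target`, `multi_step_sample`, and `single_step_sample` are hypothetical, and an analytic flow stands in for the learned network so that the identity can be checked exactly.

```python
import numpy as np

# Toy stand-in for a learned displacement model (hypothetical, not from the paper).
# It returns the exact displacement of the flow dx/dt = target - x over a span d.
def displacement(x, d, target=0.5):
    return (target - x) * (1.0 - np.exp(-d))

def bootstrap_target(disp_fn, x, d):
    """Build a regression target for a span of 2*d by chaining two spans of d."""
    first = disp_fn(x, d)
    second = disp_fn(x + first, d)
    return first + second

def multi_step_sample(disp_fn, noise, n_steps, horizon=1.0):
    """Reference sampler: apply n_steps short displacements in sequence."""
    x = noise
    d = horizon / n_steps
    for _ in range(n_steps):
        x = x + disp_fn(x, d)
    return x

def single_step_sample(disp_fn, noise, horizon=1.0):
    """One model call: map noise directly to an action."""
    return noise + disp_fn(noise, horizon)

rng = np.random.default_rng(0)
noise = rng.standard_normal((4, 2))  # batch of 4 two-dimensional "actions"

# For an exact displacement model, the chained target equals the long-span output,
# and one long step lands where eight short steps do.
short_chain = bootstrap_target(displacement, noise, 0.25)
long_span = displacement(noise, 0.5)
one_step = single_step_sample(displacement, noise)
eight_steps = multi_step_sample(displacement, noise, 8)
```

With a trained network in place of `displacement`, the gap between `short_chain` and `long_span` would serve as a training signal rather than an identity. Once that gap is small, `single_step_sample` replaces the loop in `multi_step_sample` with a single forward pass.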
offline-reinforcement-learning, q-learning, bootstrapped