Training
Temporal Self-Imitation Learning
Temporal Self-Imitation Learning (TSIL) is a reinforcement learning framework that improves the efficiency of long-horizon robot manipulation policies by using temporally efficient successful trajectories as self-supervision. TSIL combines two components: configuration-conditioned adaptive temporal targets and efficiency-weighted self-imitation. Across 15 manipulation tasks, it improves both learning efficiency and task-completion efficiency, and it increases robustness to unstable training conditions. These results suggest that the temporal structure of successful behaviors can serve as a scalable self-supervisory signal, reducing reliance on manually engineered rewards.
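To make the two components concrete, the following is a minimal sketch of one way they could fit together. It is not the authors' implementation: the abstract gives no formulas, so the quantile-based target, the exponential weighting, and every name here (`TemporalTargets`, `efficiency_weight`, `weighted_imitation_loss`, `quantile`, `beta`) are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


class TemporalTargets:
    """Per-configuration record of how long successful episodes took.

    Assumption: the adaptive target is a low quantile of past successful
    episode lengths for the same task configuration.
    """

    def __init__(self, quantile=0.25):
        self.quantile = quantile          # hypothetical hyperparameter
        self.lengths = defaultdict(list)  # config_id -> successful lengths

    def record_success(self, config_id, num_steps):
        self.lengths[config_id].append(num_steps)

    def target(self, config_id):
        """Return the current target length, or None if no success yet."""
        seen = sorted(self.lengths.get(config_id, []))
        if not seen:
            return None
        index = int(self.quantile * (len(seen) - 1))
        return seen[index]


def efficiency_weight(num_steps, target_steps, beta=2.0):
    """Weight in (0, 1]: 1 at or below the target, decaying with excess time.

    Assumption: exponential decay in the relative excess over the target.
    """
    if target_steps is None:
        return 1.0
    excess = max(0, num_steps - target_steps) / target_steps
    return math.exp(-beta * excess)


def weighted_imitation_loss(log_probs, weights):
    """Weighted negative log-likelihood over successful trajectories.

    log_probs[i] is the policy's log-probability of the actions taken in
    trajectory i; weights[i] is that trajectory's efficiency weight.
    """
    total = sum(weights)
    return -sum(w * lp for w, lp in zip(weights, log_probs)) / total


# Usage: three successes on the same (hypothetical) configuration "A".
targets = TemporalTargets()
for steps in (100, 80, 120):
    targets.record_success("A", steps)

goal = targets.target("A")
weights = [efficiency_weight(steps, goal) for steps in (100, 80, 120)]
```

Under these assumptions, faster successful trajectories dominate the imitation loss, and the target tightens as the policy finds quicker solutions for a given configuration. Conditioning the target on the configuration keeps inherently longer task setups from being penalized against shorter ones.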
reinforcement learning, self-imitation, robotics