Beyond Single-Model Optimization: Preserving Plasticity in Continual Reinforcement Learning
The paper introduces \textsc{TeLAPA} (Transfer-Enabled Latent-Aligned Policy Archives), a continual reinforcement learning framework that organizes behaviorally diverse policy neighborhoods into per-task archives while maintaining a shared latent space. In experiments in the MiniGrid CL setting, \textsc{TeLAPA} achieved higher task learning success, faster recovery on revisited tasks, and higher overall performance across the task sequence than traditional single-model preservation methods. The results suggest that maintaining multiple skill-aligned policies, rather than preserving a single model, improves the adaptability and plasticity of lifelong learning agents.
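
To make the idea of per-task archives concrete, the following is a minimal sketch and not the paper's implementation. It assumes each archive keeps the best-scoring policy per discretized behavior cell, in the style of a quality-diversity archive; the summary above does not specify this detail. The names `Entry`, `TaskArchive`, `ArchiveStore`, the `policy_id` handles, and the task label `"task_a"` are all hypothetical, and the shared latent space is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumption: a behavior is summarized by a discretized descriptor (a grid cell).
Cell = Tuple[int, ...]


@dataclass
class Entry:
    policy_id: str   # hypothetical handle to stored policy parameters
    fitness: float   # task return achieved by this policy


@dataclass
class TaskArchive:
    """Keeps the best-scoring policy per behavior cell for one task."""
    cells: Dict[Cell, Entry] = field(default_factory=dict)

    def add(self, cell: Cell, entry: Entry) -> bool:
        """Insert entry if its cell is empty or it beats the current occupant."""
        current = self.cells.get(cell)
        if current is None or entry.fitness > current.fitness:
            self.cells[cell] = entry
            return True
        return False

    def best(self) -> Optional[Entry]:
        """Highest-fitness entry in the archive, or None if it is empty."""
        return max(self.cells.values(), key=lambda e: e.fitness, default=None)


@dataclass
class ArchiveStore:
    """One archive per task, so several distinct policies survive per task."""
    archives: Dict[str, TaskArchive] = field(default_factory=dict)

    def archive_for(self, task: str) -> TaskArchive:
        """Return the archive for a task, creating it on first use."""
        return self.archives.setdefault(task, TaskArchive())

    def candidates(self, task: str) -> List[Entry]:
        """Policies to try when a task is revisited, best first."""
        archive = self.archives.get(task)
        if archive is None:
            return []
        return sorted(archive.cells.values(), key=lambda e: e.fitness, reverse=True)


store = ArchiveStore()
archive = store.archive_for("task_a")
archive.add((0, 1), Entry("p0", 0.4))
archive.add((0, 1), Entry("p2", 0.9))  # replaces p0 in the same cell
archive.add((1, 0), Entry("p3", 0.5))  # different behavior, kept alongside p2
```

Under these assumptions, revisiting a task can start from several behaviorally distinct candidates returned by `candidates`, whereas a single-model method would have only one preserved policy to resume from.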