ai-digest.dev
Research · arXiv cs.AI · 15 d ago

Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training

Curiosity-Critic is an intrinsic reward for world model training that scores exploration by the improvement in cumulative prediction error rather than by the error on the current transition alone. A learned critic estimates the asymptotic error baseline, which lets the agent tell learnable transitions apart from inherently stochastic ones. In experiments on a stochastic grid world, the authors report faster training and higher final model accuracy than with conventional curiosity-based rewards.
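The baseline idea can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not the paper's formula: the `intrinsic_reward` function, the dictionary standing in for the learned critic, and all numeric values are hypothetical. It only shows why subtracting an estimated asymptotic error removes the reward for transitions whose error cannot be reduced.

```python
def intrinsic_reward(current_error: float, asymptotic_error: float) -> float:
    """Reward only the part of the prediction error estimated to be reducible.

    Assumption: the reward is the current error minus the critic's
    estimate of the error floor, clipped at zero. The paper's actual
    reward may be defined differently.
    """
    return max(0.0, current_error - asymptotic_error)


# Hypothetical critic outputs: the error each transition type is expected
# to settle at after training. A real critic would be a learned model.
critic_baseline = {"learnable": 0.25, "stochastic": 0.75}

# Hypothetical current world-model errors. Both transitions look equally
# "surprising" if only the raw prediction error is considered.
current_error = {"learnable": 0.75, "stochastic": 0.75}

rewards = {
    name: intrinsic_reward(current_error[name], critic_baseline[name])
    for name in current_error
}
print(rewards)  # {'learnable': 0.5, 'stochastic': 0.0}
```

With these made-up numbers, a raw-error curiosity signal would rate both transitions equally, while the baseline-corrected signal rewards only the learnable one.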

Tags: curiosity, reinforcement-learning, world-model