Agents · arXiv cs.AI · 8 d ago

RAMAC: Multimodal Risk-Aware Offline Reinforcement Learning and the Role of Behavior Regularization

The paper introduces the Risk-Aware Multimodal Actor-Critic (RAMAC), a model-free offline reinforcement learning framework that pairs an expressive generative actor with a distributional critic. RAMAC optimizes a composite objective that combines Conditional Value-at-Risk (CVaR) with behavioral cloning to make learning risk-sensitive in multimodal environments, so the policy stays close to the dataset's actions and avoids out-of-distribution actions that can lead to failures. On Stochastic-D4RL, the reported experiments show consistent improvements in CVaR alongside strong returns, which suggests the approach may suit safety-critical domains.
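
As a rough illustration of the composite objective described above, the sketch below estimates CVaR from a distributional critic's return quantiles and adds a behavioral cloning penalty. It is not the paper's implementation: the function names, the equally weighted quantile representation, the squared-error cloning term (a generative actor would more likely use its own generative loss), and the `alpha` and `bc_weight` parameters are all assumptions made for illustration.

```python
import numpy as np


def cvar_from_quantiles(quantiles, alpha):
    """Estimate CVaR_alpha as the mean of the worst alpha-fraction of
    equally weighted return quantiles (assumed critic output format)."""
    q = np.sort(np.asarray(quantiles, dtype=float), axis=-1)
    # Number of lowest quantiles that fall in the alpha tail (at least one).
    k = max(1, int(np.ceil(alpha * q.shape[-1])))
    return q[..., :k].mean(axis=-1)


def composite_loss(quantiles, policy_actions, dataset_actions,
                   alpha=0.1, bc_weight=1.0):
    """Hypothetical actor loss: maximize tail return (CVaR) while
    penalizing deviation from the actions found in the dataset."""
    cvar = cvar_from_quantiles(quantiles, alpha)  # shape (batch,)
    diff = np.asarray(policy_actions, dtype=float) - np.asarray(dataset_actions, dtype=float)
    bc = np.mean(diff ** 2, axis=-1)  # squared-error cloning term, shape (batch,)
    return float(np.mean(-cvar + bc_weight * bc))
```

A larger `bc_weight` keeps the policy nearer the behavior data, while a smaller `alpha` focuses the objective on a narrower slice of worst-case returns.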

reinforcement-learning · risk
