Agents
RAMAC: Multimodal Risk-Aware Offline Reinforcement Learning and the Role of Behavior Regularization
The paper introduces the Risk-Aware Multimodal Actor-Critic (RAMAC), a model-free offline reinforcement learning framework that pairs an expressive generative actor with a distributional critic. RAMAC optimizes a composite objective that combines Conditional Value-at-Risk (CVaR) with behavioral cloning, so that the policy learns risk-sensitive behavior in multimodal settings while staying close to the dataset and avoiding out-of-distribution actions that can lead to failures. Experiments on Stochastic-D4RL show consistent improvements in CVaR alongside strong returns, which suggests the method may suit safety-critical domains.
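To make the composite objective described above concrete, here is a minimal NumPy sketch, not the paper's implementation. It assumes that the distributional critic yields a set of return samples for a state-action pair, that CVaR is taken over the lower tail of those samples, and that the behavioral cloning term can be stood in for by a mean squared error between the policy action and the dataset action (a generative actor would normally use its own training loss here). The names `cvar_lower_tail`, `composite_actor_loss`, `alpha`, and `bc_weight` are hypothetical.

```python
import numpy as np


def cvar_lower_tail(return_samples, alpha):
    """Empirical CVaR: mean of the worst ceil(alpha * n) return samples."""
    samples = np.sort(np.asarray(return_samples, dtype=float))
    # Keep at least one sample so the mean is always defined.
    k = max(1, int(np.ceil(alpha * samples.size)))
    return float(samples[:k].mean())


def composite_actor_loss(return_samples, policy_action, dataset_action,
                         alpha=0.1, bc_weight=1.0):
    """Illustrative loss: maximize lower-tail CVaR, penalize leaving the data.

    Assumption: MSE replaces the generative actor's behavioral cloning loss.
    """
    risk_term = -cvar_lower_tail(return_samples, alpha)
    diff = (np.asarray(policy_action, dtype=float)
            - np.asarray(dataset_action, dtype=float))
    bc_term = float(np.mean(diff ** 2))
    return risk_term + bc_weight * bc_term
```

In this sketch a larger `bc_weight` pulls the policy toward dataset actions, while a smaller `alpha` focuses the risk term on a narrower slice of the worst outcomes.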
reinforcement-learning, risk