Training · arXiv cs.CL · 2 d ago

Representation-Aware Advantage Estimation: Your Reward Model Provides More Than A Scalar Output

The paper introduces Representation-Aware Advantage Estimation (RAAE), instantiated as Graph-based Advantage Estimation (GraphAE), which uses the hidden states of reward models (RMs), not only their scalar outputs, to improve advantage estimation in reinforcement learning from human feedback (RLHF). GraphAE models each sampled group of responses as a graph: nodes are responses, and edges encode similarity in the RM's hidden space, so contextual information can propagate between related responses. The authors report significant gains across multiple benchmarks, suggesting that RM representations can improve sample efficiency and robustness in RLHF.
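The summary does not spell out the graph construction or the propagation rule, so the following is only a rough sketch of the general idea, not the paper's method. It assumes cosine similarity for edge weights, a softmax over neighbours, a single propagation step mixed in with weight `alpha`, and a group-normalized baseline over the smoothed rewards; the function names and the `alpha` and `temperature` parameters are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def graph_smoothed_advantages(hidden_states, rewards, alpha=0.5,
                              temperature=0.1, eps=1e-8):
    """Hypothetical sketch: smooth rewards over a similarity graph, then
    normalize within the group.

    hidden_states: one RM hidden vector per sampled response (assumed input).
    rewards: one scalar RM reward per sampled response.
    alpha: assumed mixing weight between a response's own reward and the
        similarity-weighted mean reward of the other responses.
    """
    n = len(rewards)
    if n != len(hidden_states):
        raise ValueError("need one hidden state per reward")

    smoothed = []
    for i in range(n):
        others = [j for j in range(n) if j != i]  # no self-loops
        if not others:
            smoothed.append(rewards[i])
            continue
        # Edge weights: softmax over cosine similarity to the other nodes.
        logits = [
            cosine_similarity(hidden_states[i], hidden_states[j]) / temperature
            for j in others
        ]
        peak = max(logits)
        weights = [math.exp(x - peak) for x in logits]
        total = sum(weights)
        neighbour_mean = sum(
            w * rewards[j] for w, j in zip(weights, others)
        ) / total
        # One propagation step: blend own reward with the neighbours' rewards.
        smoothed.append((1.0 - alpha) * rewards[i] + alpha * neighbour_mean)

    # Group-normalized advantages computed on the smoothed rewards.
    mean = sum(smoothed) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in smoothed) / n)
    return [(s - mean) / (std + eps) for s in smoothed]
```

With `alpha=0` the sketch reduces to plain group normalization of the scalar rewards. With `alpha>0`, a response whose hidden state lies close to a highly rewarded response receives a higher advantage than an equally scored response that is dissimilar to it.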

reinforcement learning · human feedback · advantage estimation
