Training · arXiv cs.AI · 4 d ago

The Role of Feedback Alignment in Self-Distillation

The paper studies self-distillation in language models, focusing on how well the feedback used during training is aligned with the model's own reasoning. It compares three feedback conditions: binary rewards, reference solutions, and step-by-step critiques. Step-aligned critiques perform best, outperforming binary rewards by 16.11 points and reference solutions by 5.27 points on average. The results point to context design as a key factor in self-distillation: targeted feedback helps the model retain effective reasoning while minimizing unnecessary changes to outputs that are already correct, which matters for practitioners aiming to improve the robustness of LLMs.
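As a rough illustration of how the three feedback conditions differ as distillation context, here is a minimal Python sketch. The `Attempt` fields, the `build_context` helper, and the prompt layout are hypothetical and are not taken from the paper. The sketch only shows where each kind of feedback could sit relative to the model's own reasoning steps.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attempt:
    """One model attempt plus whatever feedback is available (hypothetical schema)."""
    problem: str
    steps: List[str]                        # the model's own reasoning steps
    is_correct: bool                        # outcome-level signal
    reference: Optional[str] = None         # reference solution, if any
    critiques: Optional[List[str]] = None   # one note per step; "" means the step is fine


def build_context(attempt: Attempt, condition: str) -> str:
    """Assemble a feedback-conditioned context string for one of three conditions.

    Assumed condition names: "binary", "reference", "step_critique".
    """
    lines = [f"Problem: {attempt.problem}", "Attempt:"]

    if condition == "step_critique":
        # Feedback is interleaved with the steps, so each note sits next to
        # the step it refers to and correct steps are left uncommented.
        critiques = attempt.critiques or [""] * len(attempt.steps)
        if len(critiques) != len(attempt.steps):
            raise ValueError("need exactly one critique entry per step")
        for i, (step, note) in enumerate(zip(attempt.steps, critiques), 1):
            lines.append(f"{i}. {step}")
            if note:
                lines.append(f"   Critique: {note}")
        return "\n".join(lines)

    # The other two conditions append feedback after the whole attempt.
    lines.extend(f"{i}. {step}" for i, step in enumerate(attempt.steps, 1))
    if condition == "binary":
        lines.append("Reward: " + ("1" if attempt.is_correct else "0"))
    elif condition == "reference":
        if attempt.reference is None:
            raise ValueError("reference condition requires a reference solution")
        lines.append(f"Reference solution: {attempt.reference}")
    else:
        raise ValueError(f"unknown condition: {condition!r}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = Attempt(
        problem="Compute 2 + 3 * 4",
        steps=["3 * 4 = 12", "2 + 12 = 15"],
        is_correct=False,
        reference="14",
        critiques=["", "2 + 12 is 14, not 15"],
    )
    for name in ("binary", "reference", "step_critique"):
        print(f"--- {name} ---")
        print(build_context(demo, name))
```

In this layout only the step-critique context attaches feedback to the specific step that went wrong, which is one plausible reading of "step-aligned". The reported margins also imply that reference solutions beat binary rewards by roughly 16.11 − 5.27 = 10.84 points on average, assuming both gaps are averaged over the same benchmarks.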

Tags: self-distillation, feedback, context
