Oops, Wait: Discourse Tokens Matter in Reasoning Model
The paper examines how discourse tokens, such as "wait", relate to the reasoning capabilities of large language models (LLMs) trained with data-efficient methods. It identifies observable token-level patterns that correlate with correct responses. It also finds that data-efficient supervised fine-tuning (SFT) can replicate some discourse-token behaviors, but does not align them with high-confidence answers as well as large-scale post-training does. The findings suggest that using discourse tokens effectively during training could improve reasoning accuracy.
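As a rough illustration of what a token-level pattern might look like, the sketch below compares how often discourse tokens appear in correct versus incorrect responses. It is not the paper's method: the token list (beyond "wait"), the function names, and the simple word-level rate are all assumptions made for illustration.

```python
import re

# Hypothetical discourse-token list; only "wait" is named in the summary.
DISCOURSE_TOKENS = {"wait", "hmm", "oops"}


def discourse_rate(text: str) -> float:
    """Fraction of word tokens in `text` that are discourse tokens."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in DISCOURSE_TOKENS)
    return hits / len(words)


def mean_rate_by_correctness(samples):
    """Average discourse-token rate for correct vs. incorrect responses.

    `samples` is an iterable of (response_text, is_correct) pairs,
    where `is_correct` is a bool.
    """
    totals = {True: 0.0, False: 0.0}
    counts = {True: 0, False: 0}
    for text, is_correct in samples:
        totals[is_correct] += discourse_rate(text)
        counts[is_correct] += 1
    # Groups with no samples report a rate of 0.0.
    return {k: (totals[k] / counts[k] if counts[k] else 0.0) for k in totals}
```

Calling `mean_rate_by_correctness` on a labeled set of model responses returns a dictionary keyed by `True` and `False`; a gap between the two rates would be one simple, word-level signal of the kind of correlation the summary describes.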
reasoning, tokens, post-training