Models
VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models
The article introduces VibeThinker-3B, a 3-billion-parameter dense model built to push verifiable reasoning in small language models. Its Spectrum-to-Signal post-training paradigm combines curriculum-based supervised fine-tuning, multi-domain reinforcement learning, and offline self-distillation. The model scores 94.3 on AIME26 and 80.2 Pass@1 on LiveCodeBench v6. These results suggest that compact models can reach performance comparable to that of larger models on these benchmarks, pointing to a way of building efficient yet capable AI systems.
verifiable reasoning, small models