Training
VibeThinker scales from 1.5B to 3B and reaches frontier math and coding performance
VibeThinker-3B has been released as a scaled-up successor to VibeThinker-1.5B, with significant gains in mathematical reasoning and coding. It scored 94.3 on AIME'26, 80.2 on LiveCodeBench v6, 76.4 on IMO-AnswerBench, and 93.4 on IFEval, and passed 123 of 128 Python submissions (about 96%) on recent LeetCode contests. The results suggest that small models can reach frontier-level reasoning in parameter-dense domains and offer a viable alternative to larger models there, although they still face challenges in broader applications.
vibethinker, scaling, performance