MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling
MaxProof introduces a population-level test-time scaling framework for mathematical proof generation, built on a generative-verifier architecture that combines proof generation, verification, and critique-conditioned repair. The M3 model, designed for a low false-positive rate, scores 35/42 on IMO 2025 and 36/42 on USAMO 2026, surpassing human gold-medal thresholds. The results indicate that generative-verifier RL combined with population-level test-time scaling can bring LLMs to competition-level performance in mathematical proof.
mathematical proof · generative verifier · RL · M3
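
The summary describes a loop of proof generation, verification, and critique-conditioned repair run over a population of candidates. The sketch below is one generic way such a loop could be organized, not MaxProof's actual algorithm. Every name in it (`Candidate`, `generate`, `verify`, `repair`, `population_search`) and every parameter (`population`, `rounds`, `accept`) is hypothetical, and the model calls are replaced by random stubs so the control flow can be run on its own.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Candidate:
    proof: str
    score: Optional[float] = None  # None until the verifier has scored it
    critique: str = ""


def generate(problem: str, rng: random.Random) -> Candidate:
    # Hypothetical stand-in for a proof-generating model call.
    return Candidate(proof=f"draft proof of {problem} #{rng.randint(0, 9999)}")


def verify(cand: Candidate, rng: random.Random, accept: float) -> Tuple[float, str]:
    # Hypothetical stand-in for a generative verifier: a score in [0, 1)
    # plus a textual critique when the proof is not accepted.
    score = rng.random()
    critique = "" if score >= accept else "step unclear"
    return score, critique


def repair(cand: Candidate) -> Candidate:
    # Hypothetical stand-in for critique-conditioned repair: the new
    # candidate is derived from the old proof and its critique.
    return Candidate(proof=f"{cand.proof} [revised: {cand.critique}]")


def population_search(problem: str, population: int = 8, rounds: int = 3,
                      accept: float = 0.9, seed: int = 0) -> Candidate:
    rng = random.Random(seed)
    pool = [generate(problem, rng) for _ in range(population)]
    for round_index in range(rounds):
        # Score only the candidates that have not been verified yet.
        for cand in pool:
            if cand.score is None:
                cand.score, cand.critique = verify(cand, rng, accept)
        pool.sort(key=lambda c: c.score, reverse=True)
        if pool[0].score >= accept or round_index == rounds - 1:
            break
        # Keep the better half; repair the rest using their critiques.
        half = population // 2
        pool = pool[:half] + [repair(c) for c in pool[half:]]
    return pool[0]  # highest-scoring verified candidate


if __name__ == "__main__":
    best = population_search("P1", population=4, rounds=2, seed=1)
    print(best.proof, best.score)
```

In this sketch, spending more test-time compute means raising `population` or `rounds`. A verifier with a low false-positive rate matters because the loop stops as soon as any candidate clears `accept`, so a wrongly accepted proof would be returned as the final answer.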