Training · arXiv cs.AI · 14 d ago

Process-Verified Reinforcement Learning for Theorem Proving via Lean

This paper proposes a reinforcement learning approach to theorem proving in which the Lean proof assistant supplies fine-grained, verified rewards during training. Proof attempts are parsed into tactic sequences, and Lean's elaboration provides dense, tactic-level credit signals. The authors train with a GRPO-style objective and report improved performance on benchmarks such as MiniF2F and ProofNet. The work argues that symbolic proof assistants can serve as process-level reward oracles, connecting scalable language models with reliable symbolic verification in formal reasoning tasks.
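To make the idea of tactic-level credit concrete, here is a minimal Python sketch of one possible scheme. It is not the authors' implementation. It assumes each proof attempt has already been split into tactics and that a checker such as Lean has returned one pass/fail flag per tactic. The function names `tactic_rewards` and `group_relative_advantages`, and the bonus values `step_bonus` and `final_bonus`, are hypothetical choices made for illustration. Rewards are normalized across a group of sampled attempts, in the spirit of a GRPO-style objective, and each tactic's advantage is the sum of the normalized rewards from that tactic onward.

```python
from statistics import mean, pstdev
from typing import List


def tactic_rewards(verified: List[bool], proof_complete: bool,
                   step_bonus: float = 0.1,
                   final_bonus: float = 1.0) -> List[float]:
    """Assign a reward to each tactic of one proof attempt.

    `verified[i]` is assumed to say whether tactic i was accepted by the
    checker. Tactics after the first failure earn nothing. The bonus values
    are illustrative assumptions, not values from the paper.
    """
    rewards: List[float] = []
    alive = True
    for ok in verified:
        alive = alive and ok
        rewards.append(step_bonus if alive else 0.0)
    # Extra reward on the last tactic only if the whole proof closed.
    if proof_complete and alive and rewards:
        rewards[-1] += final_bonus
    return rewards


def group_relative_advantages(group_rewards: List[List[float]],
                              eps: float = 1e-8) -> List[List[float]]:
    """Normalize tactic rewards across a group of attempts at one theorem,
    then give each tactic the sum of normalized rewards from that tactic on.
    """
    flat = [r for rewards in group_rewards for r in rewards]
    if not flat:
        return [[] for _ in group_rewards]
    mu = mean(flat)
    sigma = pstdev(flat)

    advantages: List[List[float]] = []
    for rewards in group_rewards:
        normed = [(r - mu) / (sigma + eps) for r in rewards]
        running = 0.0
        suffix: List[float] = []
        for value in reversed(normed):
            running += value
            suffix.append(running)
        suffix.reverse()
        advantages.append(suffix)
    return advantages


if __name__ == "__main__":
    # Two sampled attempts at the same theorem: one succeeds, one fails
    # at its second tactic.
    group = [
        tactic_rewards([True, True], proof_complete=True),
        tactic_rewards([True, False], proof_complete=False),
    ]
    for attempt in group_relative_advantages(group):
        print([round(a, 3) for a in attempt])
```

In this sketch the successful attempt receives positive advantages on both tactics and the failed attempt receives negative ones, so the policy gradient would favor the tactics that the checker accepted. A real system would also need to map tactic-level advantages onto token-level log-probabilities, which is omitted here.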

reinforcement-learning · theorem-proving
