ai-digest.dev
last updated 2 h ago
Research · arXiv cs.AI · 8 d ago

The Faithfulness Gap: Certifying Semantic Equivalence Between Natural-Language and Formal Mathematical Statements

The paper introduces Bidirectional Provability Fingerprinting (BPF), a framework for certifying the faithfulness of autoformalization: it assesses whether a natural-language mathematical statement and its formal counterpart are semantically equivalent. BPF combines four components: Counterfactual Probe Generation (CPG), an Equivalence Spectrum that scores faithfulness on a continuous scale, Adaptive Probe Budget Allocation (APBA), and Faithfulness-Guided Decoding (FGD). Together these improve the detection of semantic drift in formalizations. The framework reaches an 89.6% drift detection rate at a 3.0% false-positive rate, reported as significantly outperforming traditional methods. The authors also release the driftbench benchmark of 2,183 pairs to support further research, which makes the work relevant to practitioners in formal verification and mathematical proof automation.
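To make the headline metric concrete, here is a minimal sketch of how a "detection rate at a fixed false-positive rate" can be computed from continuous faithfulness scores. It is not the paper's evaluation code. The function name `detection_rate_at_fpr`, the convention that higher scores mean more faithful, and the toy scores are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def detection_rate_at_fpr(
    faithful_scores: Sequence[float],
    drifted_scores: Sequence[float],
    max_fpr: float = 0.03,
) -> Tuple[float, float, float]:
    """Pick a threshold for flagging drift under a false-positive budget.

    Assumption (not from the source): a pair is flagged as drifted when
    its faithfulness score is <= threshold, so higher means more faithful.
    Returns (threshold, detection_rate, false_positive_rate).
    """
    # Default: flag nothing, which never violates the budget.
    best = (float("-inf"), 0.0, 0.0)
    # Every distinct score is a candidate threshold.
    for t in sorted(set(faithful_scores) | set(drifted_scores)):
        # Faithful pairs wrongly flagged as drifted.
        fpr = sum(s <= t for s in faithful_scores) / len(faithful_scores)
        # Drifted pairs correctly flagged.
        tpr = sum(s <= t for s in drifted_scores) / len(drifted_scores)
        if fpr <= max_fpr and tpr >= best[1]:
            best = (t, tpr, fpr)
    return best


# Hypothetical toy scores, not data from the paper or from driftbench.
faithful = [0.9, 0.8, 0.95, 0.7]
drifted = [0.1, 0.2, 0.75, 0.3]
threshold, detection_rate, fpr = detection_rate_at_fpr(faithful, drifted, max_fpr=0.0)
```

With a zero false-positive budget, the toy data yields a threshold of 0.3 and flags three of the four drifted pairs. Loosening the budget lets the threshold rise and the detection rate increase, which is why the two numbers are always quoted together.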

autoformalization · mathematics · faithfulness · relevance 0.00 · engagement 0.00