The Faithfulness Gap: Certifying Semantic Equivalence Between Natural-Language and Formal Mathematical Statements
The article introduces Bidirectional Provability Fingerprinting (BPF), a framework for certifying the faithfulness of autoformalization, that is, for assessing whether a natural-language mathematical statement and its formal counterpart are semantically equivalent. Its key innovations are Counterfactual Probe Generation (CPG), an Equivalence Spectrum that scores faithfulness on a continuous scale, Adaptive Probe Budget Allocation (APBA), and Faithfulness-Guided Decoding (FGD), which together improve the detection of semantic drift in formalizations. The framework achieves an 89.6% drift detection rate at a 3.0% false-positive rate, significantly outperforming traditional methods. The accompanying driftbench benchmark, with 2,183 pairs, is released to support further research and to serve practitioners in formal verification and mathematical proof automation.
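
The headline result pairs a detection rate with a fixed false-positive rate. The sketch below shows one common way to compute such a figure from continuous faithfulness scores. It assumes that higher scores mean a more faithful formalization and that a pair is flagged as drifted when its score falls below a threshold. The function names and the toy scores are hypothetical and are not taken from BPF or driftbench.

```python
import math


def threshold_at_fpr(faithful_scores, max_fpr):
    """Pick a threshold t so that at most max_fpr of faithful pairs score below t.

    Assumption: higher score = more faithful; a pair is flagged when score < t.
    """
    ordered = sorted(faithful_scores)
    # Number of faithful pairs we may wrongly flag (small epsilon guards
    # against floating-point results such as 2.9999999999999996).
    allowed = math.floor(max_fpr * len(ordered) + 1e-9)
    if allowed >= len(ordered):
        return math.inf  # every faithful pair may be flagged
    # Only scores strictly below ordered[allowed] are flagged: at most `allowed`.
    return ordered[allowed]


def detection_rate_at_fpr(faithful_scores, drifted_scores, max_fpr):
    """Return (threshold, detection_rate, realized_false_positive_rate)."""
    t = threshold_at_fpr(faithful_scores, max_fpr)
    detected = sum(1 for s in drifted_scores if s < t)
    false_pos = sum(1 for s in faithful_scores if s < t)
    return t, detected / len(drifted_scores), false_pos / len(faithful_scores)


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    faithful = [0.95, 0.91, 0.88, 0.97, 0.62, 0.93, 0.90, 0.99, 0.85, 0.96]
    drifted = [0.20, 0.45, 0.70, 0.30, 0.89]
    t, det, fpr = detection_rate_at_fpr(faithful, drifted, max_fpr=0.1)
    print(f"threshold={t}, detection rate={det:.1%}, false-positive rate={fpr:.1%}")
```

Fixing the false-positive budget first and then reading off the detection rate keeps methods comparable: each is judged on how much drift it catches for the same tolerated rate of wrongly flagged faithful formalizations.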