Research
SAFE: An LLM-as-Verifier Framework for Evidence-Grounded Multi-Hop Reasoning
The paper introduces SAFE, an LLM-as-verifier framework that improves evidence-grounded multi-hop question answering (QA) by checking each intermediate reasoning step against the provided passages and the prior reasoning. SAFE uses Knowledge Graph (KG) triples to decompose reasoning into atomic units, which makes it possible to construct reliable verifier training data and to verify reasoning step by step during inference. The framework improves accuracy by an average of 8.8 percentage points across three multi-hop QA benchmarks, underscoring the importance of validating the reasoning process in LLMs, not only the final answer, to mitigate spurious correctness.
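The stepwise verification idea can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `Triple`, `toy_verifier`, and `verify_chain` are hypothetical, the example passages are illustrative data, and a simple string-matching check stands in for the LLM verifier. It only shows the control flow of checking each triple against the passages and the previously verified steps, and stopping at the first unsupported step.

```python
from typing import Callable, List, Optional, Tuple

# A reasoning step expressed as a KG-style triple: (subject, relation, object).
Triple = Tuple[str, str, str]
Verifier = Callable[[Triple, List[str], List[Triple]], bool]


def toy_verifier(triple: Triple, passages: List[str], prior: List[Triple]) -> bool:
    """Hypothetical stand-in for an LLM verifier (assumption, not SAFE's model).

    Accepts a step if it was already verified, or if a single passage
    mentions both the subject and the object of the triple.
    """
    if triple in prior:
        return True
    subject, _relation, obj = triple
    return any(
        subject.lower() in p.lower() and obj.lower() in p.lower()
        for p in passages
    )


def verify_chain(
    steps: List[Triple],
    passages: List[str],
    verifier: Verifier = toy_verifier,
) -> Tuple[List[Triple], Optional[int]]:
    """Verify steps in order; return (verified steps, index of first failure or None)."""
    verified: List[Triple] = []
    for index, step in enumerate(steps):
        if not verifier(step, passages, verified):
            return verified, index  # stop at the first unsupported step
        verified.append(step)
    return verified, None


# Illustrative data only (not taken from the paper's benchmarks).
passages = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the capital of Poland.",
]
steps = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
]

verified, first_failure = verify_chain(steps, passages)
print(verified, first_failure)
```

In a real system the `verifier` argument would wrap a call to a trained verifier model; keeping it as a plain callable makes that substitution straightforward.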
llm, multi-hop, qa, verification