Research · arXiv cs.AI · 21 h ago

Benchmarking Knowledge Editing using Logical Rules

A new benchmark for knowledge editing in large language models (LLMs) evaluates whether a single-fact edit propagates to its logical consequences, rather than only whether the model can recall the edited fact. The benchmark uses logical rules extracted from knowledge graphs to generate multi-hop questions. It finds that existing methods such as ROME and fine-tuning (FT) often fail to inject the entailed knowledge, with a performance gap of up to 24%. The result points to a need for semantics-aware evaluation frameworks for knowledge editing in LLMs.
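
To make the idea of rule-derived multi-hop questions concrete, here is a minimal sketch under stated assumptions. It is not the benchmark's actual pipeline: the `Triple` and `Rule` types, the two-atom rule shape, the question template, and the fictional entities (Alice, Springfield, Freedonia) are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


class Rule(NamedTuple):
    # Hypothetical rule shape: body1(x, y) AND body2(y, z) => head(x, z)
    body1: str
    body2: str
    head: str


def entailed_facts(edit: Triple, kg: set[Triple], rules: list[Rule]) -> list[Triple]:
    """Derive facts that follow from a single edited fact via two-hop rules."""
    derived = []
    for rule in rules:
        if rule.body1 != edit.relation:
            continue
        # Look for a second hop that starts at the edited fact's new object.
        for fact in sorted(kg):
            if fact.relation == rule.body2 and fact.subject == edit.obj:
                derived.append(Triple(edit.subject, rule.head, fact.obj))
    return derived


def to_question(fact: Triple, templates: dict[str, str]) -> tuple[str, str]:
    """Turn a derived fact into a (question, expected answer) pair."""
    return templates[fact.relation].format(subject=fact.subject), fact.obj


# Toy knowledge graph with fictional entities (assumption, not benchmark data).
kg = {
    Triple("Springfield", "located_in", "Freedonia"),
    Triple("Shelbyville", "located_in", "Sylvania"),
}
rules = [Rule(body1="born_in", body2="located_in", head="birth_country")]
templates = {"birth_country": "In which country was {subject} born?"}

# The single-fact edit injected into the model.
edit = Triple("Alice", "born_in", "Springfield")

derived = entailed_facts(edit, kg, rules)
question, answer = to_question(derived[0], templates)
print(question, "->", answer)
```

In this toy setup, an editing method would be scored not only on recalling that Alice was born in Springfield, but also on answering the derived question with the entailed country.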

llm · knowledge-editing · benchmark
