Training · arXiv cs.CL · 14 d ago

PreUnlearn: Auditing Collateral Knowledge Damage Before Large Language Model Unlearning

The paper introduces PreUnlearn, a framework for auditing collateral knowledge damage before unlearning is applied to a large language model (LLM). It quantifies how unlearning effects propagate and finds a decay pattern: collateral damage is strongest near the forget set and diminishes with semantic distance, yet persists across domain boundaries. The study frames forget-set auditing as a predictive task and uses interaction features to identify potential risks in the unlearning process, which matters to practitioners who need unlearning in LLMs to be effective and reliable.

llm · unlearning · knowledge · relevance 0.00 · engagement 0.00
