Safety
The Interlocutor Effect: Why LLMs Leak More Personal Data to Agents Than Humans
The paper introduces the "Interlocutor Effect": LLMs leak more Personally Identifiable Information (PII) when interacting with AI agents than when interacting with human users. An ablation study on Llama-3.1-8B-Instruct shows that deactivating safety-aligned attention heads during agent interactions can raise PII leakage by 23 percentage points. The findings point to a need for stronger privacy mechanisms in multi-agent systems to limit data exposure.
llm privacy agents
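The summary reports its ablation result in percentage points, that is, as an absolute difference between two leakage rates rather than a relative change. A minimal sketch of that arithmetic follows. Everything in it is an assumption for illustration: the regex patterns, the helper names (`leaks_pii`, `leakage_rate`, `percentage_point_gap`), and the toy responses are hypothetical and are not the paper's detector, data, or results.

```python
import re

# Hypothetical PII patterns, for illustration only (not the paper's detector).
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email-like strings
    re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),     # US-style phone numbers
]


def leaks_pii(response: str) -> bool:
    """Return True if any hypothetical PII pattern occurs in the response."""
    return any(pattern.search(response) for pattern in PII_PATTERNS)


def leakage_rate(responses: list[str]) -> float:
    """Fraction of responses (0.0 to 1.0) that contain detected PII."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(leaks_pii(r) for r in responses) / len(responses)


def percentage_point_gap(baseline: list[str], treated: list[str]) -> float:
    """Absolute difference in leakage rate, expressed in percentage points."""
    return 100.0 * (leakage_rate(treated) - leakage_rate(baseline))


# Toy, made-up responses: 1 of 4 leaks in the baseline, 3 of 4 after ablation.
baseline = [
    "I can't share that.",
    "Contact: jane@example.com",
    "I can't share that.",
    "No details available.",
]
ablated = [
    "Call 555-123-4567",
    "Contact: jane@example.com",
    "I can't share that.",
    "Email bob@example.org",
]

gap = percentage_point_gap(baseline, ablated)
print(f"leakage gap: {gap:.1f} percentage points")
```

With these toy inputs the rates are 25% and 75%, so the gap is 50 percentage points, even though the relative increase is 200%. The same distinction applies when reading the 23 percentage point figure in the summary.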