Safety
From Prompts to Responses: Dual-Sided Data Leakage and Defense in Split Large Language Models
This paper examines privacy vulnerabilities in Split Large Language Models (Split-LLMs). It first introduces Patched Model Inversion with Dual-Sided Initialization (PIDI), a two-stage attack that targets both input prompts and generated responses. As a countermeasure, the authors propose the Adapter-based DualGuard with Mutual Information Defense (ADMI), which combines a local warmup strategy with mutual information regularization to strengthen privacy protection while maintaining task performance. For practitioners deploying LLMs in privacy-sensitive applications, the work shows that data can leak on both the prompt side and the response side of a split model, and it offers a defense against that leakage.
privacy, data leakage, LLM