Safety
Privacy-Preserving Text Sanitization for Distributed Agents Collaboration via Disentangled Representations
The article introduces DiSan (Disentangled Sanitization), a framework for privacy-preserving text sanitization in multi-agent collaboration, integrated into the Intern-Shannon system. DiSan uses a two-stream encoder to separate task semantics from source-identifying stylistic features, which enables federated training without centralizing sensitive text. In the reported experiments, DiSan reduces personally identifiable information (PII) exposure by a factor of 20 while maintaining 83% answer faithfulness, and it lowers stylometric attribution by more than 70%. Together these results address two privacy concerns in distributed AI systems: PII leakage and source identification through writing style.
privacy, disentangled representations, collaboration