Agents
GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents
GateMem is a new benchmark for evaluating memory governance in multi-principal shared-memory agents, a setting that existing benchmarks, which focus on single-user scenarios, do not cover. It assesses three capabilities: utility on long-horizon requests, access control, and active forgetting. Evaluation spans domains including healthcare and education, and uses long-form multi-party episodes with structured judging. The findings indicate that long-context prompting can yield high governance scores, but current methods still struggle to balance utility, access control, and reliable forgetting, which highlights the need for better memory management in shared institutional applications.
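To make the three evaluated capabilities concrete, here is a minimal sketch of a shared memory store with several principals. It is an illustration only, not GateMem's code or API: the `SharedMemory` class, its `write`, `recall`, and `forget` methods, the owner-only deletion rule, and the example principals are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryRecord:
    """One stored memory (hypothetical schema, not taken from GateMem)."""
    record_id: int
    owner: str
    content: str
    readers: set = field(default_factory=set)


class SharedMemory:
    """Toy multi-principal memory: utility, access control, forgetting."""

    def __init__(self):
        self._records = {}
        self._next_id = 0

    def write(self, owner, content, readers=()):
        """Store a memory; the owner can always read their own record."""
        record_id = self._next_id
        self._next_id += 1
        self._records[record_id] = MemoryRecord(
            record_id, owner, content, {owner, *readers}
        )
        return record_id

    def recall(self, principal, keyword):
        """Utility plus access control: return only matching records
        that this principal is allowed to read."""
        needle = keyword.lower()
        return [
            record.content
            for record in self._records.values()
            if principal in record.readers and needle in record.content.lower()
        ]

    def forget(self, principal, record_id):
        """Active forgetting: assumed policy is that only the owner may
        delete a record. Returns True if the record was removed."""
        record = self._records.get(record_id)
        if record is None or record.owner != principal:
            return False
        del self._records[record_id]
        return True


# Hypothetical healthcare episode with three principals.
memory = SharedMemory()
note_id = memory.write(
    "dr_lee", "Patient A allergy: penicillin", readers={"nurse_kim"}
)
authorized = memory.recall("nurse_kim", "allergy")   # reader sees the note
unauthorized = memory.recall("billing", "allergy")   # non-reader sees nothing
denied = memory.forget("nurse_kim", note_id)         # non-owner cannot delete
removed = memory.forget("dr_lee", note_id)           # owner deletes the note
after_forget = memory.recall("dr_lee", "allergy")    # nothing left to recall
```

A real agent would replace keyword matching with retrieval over long multi-party histories, which is where the tension between utility, access control, and reliable forgetting described above arises.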
memory, governance, shared-memory