Safety
Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation
The paper examines the security landscape of large language model (LLM) agents as they move from conversational interfaces to more autonomous software components. Synthesizing findings from 247 studies, it identifies key threats such as prompt injection and tool-mediated control-flow hijacking, and highlights emerging concerns such as persistent state corruption. The authors advocate security frameworks built around trust boundaries, privilege control, and realistic evaluation practices, all of which are relevant to practitioners building secure LLM applications.
LLM agents, security, attacks