Agents · arXiv cs.AI · 7 d ago

The Emergence of Autonomous Penetration Capabilities in Large Language Model-Powered AI Systems

The paper presents a new evaluation framework for autonomous penetration capabilities in LLM-powered AI systems, designed to address limitations in existing methodologies. The framework comprises two tiers of target server environments and uses a general-purpose agent architecture equipped with cybersecurity tools to assess 19 open-weight and proprietary LLMs. Penetration success rates range from 10.7% to 69.3%, and the results indicate that advances in LLM capability correlate with better autonomous penetration performance, a trend that matters for understanding AI's role in cybersecurity.

autonomous AI · cybersecurity · LLM · penetration testing
