ai-digest.dev
Research · arXiv cs.CL · 2 d ago

Benchmarking and Exploring the Capabilities of LLMs for Attack Investigations

The paper introduces AuditBench, a benchmark dataset for evaluating LLMs on security-related investigations of system audit logs. It covers more than 50 scenarios spanning both benign and malicious activity. The authors assess five leading LLMs on four common log-investigation tasks and show how model size, data representation, and prompt construction affect results and error profiles. For practitioners, the work offers a structured framework for assessing LLM capabilities in security operations and identifies areas for improvement in future model development.

llm · audit logs · benchmarking