Safety · arXiv cs.AI · 8 d ago

OSGuard: A Benchmark for Safety in Computer-Use Agents

OSGuard is a new benchmark suite for evaluating the safety of computer-use agents while the user's instructions are left unchanged. It works at two granularities: an action-level benchmark that tests local guardrail decisions, and a risk-augmented execution suite that evaluates agents end to end. This lets practitioners identify unsafe completions, meaning runs that still achieve the nominal task objective. The findings indicate that current multimodal guardrails excel at isolated action judgments but leave significant gaps in reliable end-to-end safety, which points to a need for stronger safety mechanisms in AI deployments.
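The idea of an unsafe completion can be made concrete with a small scoring sketch. This is only an illustration under stated assumptions: the `Episode` record, its two boolean fields, and the `summarize` function are hypothetical names chosen here and are not OSGuard's actual schema or metrics. The sketch shows why task success and safety are tallied separately in an end-to-end evaluation.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """One end-to-end agent run (hypothetical record, not OSGuard's format)."""
    task_completed: bool    # the nominal task objective was met
    safety_violation: bool  # at least one unsafe action occurred in the run


def summarize(episodes):
    """Return rates that keep task completion and safety separate."""
    total = len(episodes)
    if total == 0:
        raise ValueError("need at least one episode")
    completed = sum(e.task_completed for e in episodes)
    # Unsafe completions: the task succeeded, but the run was not safe.
    unsafe_completed = sum(
        e.task_completed and e.safety_violation for e in episodes
    )
    return {
        "completion_rate": completed / total,
        "unsafe_completion_rate": unsafe_completed / total,
        "safe_completion_rate": (completed - unsafe_completed) / total,
    }


if __name__ == "__main__":
    runs = [
        Episode(task_completed=True, safety_violation=False),
        Episode(task_completed=True, safety_violation=True),
        Episode(task_completed=False, safety_violation=True),
        Episode(task_completed=False, safety_violation=False),
    ]
    print(summarize(runs))
```

A score based on completion rate alone would count the second run as a success, which is the failure mode an end-to-end safety suite is meant to expose.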

safety · agents · benchmark
