ai-digest.dev
Safety · arXiv cs.AI · 4 d ago

Risk Under Pressure: Compute-Aware Evaluation of Adversarial Robustness in Language Models

The paper presents a compute-aware framework for evaluating the adversarial robustness of large language models (LLMs), one that accounts for the computational cost of different attack strategies. It introduces risk-compute curves, which relate an attacker's compute budget to attack risk. These curves show that alignment training affects compute-space robustness non-monotonically, and that scaling model size can reduce the effectiveness of gradient-based attacks while having limited impact on template-based attacks. The framework is publicly available and lets practitioners gauge the true cost of adversarial attacks, which can inform strategies for improving model security.
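As a rough illustration of what a risk-compute curve could look like, the sketch below treats "risk at budget B" as the fraction of test prompts for which an attack succeeded at a compute cost of at most B. This definition, the function name `risk_compute_curve`, and the sample numbers are all assumptions made for illustration; the summary does not state the paper's exact formulation, and this is not the paper's released code.

```python
from typing import List, Optional, Sequence


def risk_compute_curve(
    min_cost_to_break: Sequence[Optional[float]],
    budgets: Sequence[float],
) -> List[float]:
    """Return, for each budget, the fraction of prompts broken within it.

    min_cost_to_break[i] is the (hypothetical) smallest compute cost at which
    an attack succeeded on prompt i, or None if no attack ever succeeded.
    """
    n = len(min_cost_to_break)
    if n == 0:
        raise ValueError("need at least one prompt")
    curve = []
    for budget in budgets:
        # Count prompts whose cheapest successful attack fits in the budget.
        broken = sum(
            1 for cost in min_cost_to_break if cost is not None and cost <= budget
        )
        curve.append(broken / n)
    return curve


# Made-up example data: costs in FLOPs, None means the prompt was never broken.
costs = [1e12, 5e12, None, 2e13, None]
budgets = [1e12, 1e13, 1e14]
curve = risk_compute_curve(costs, budgets)
print(curve)  # [0.2, 0.4, 0.6]
```

Under this reading the curve is non-decreasing in the budget, so two models or two attack families can be compared by how quickly their curves rise as compute grows.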

adversarial robustness · llm · evaluation