ai-digest.dev
Training · arXiv cs.AI · 7 d ago

SkillAudit: Ground-Truth-Free Skill Evolution via Paired Trajectory Auditing

SkillAudit is a framework for evolving the skills of large language model (LLM) agents without ground-truth feedback. It audits paired trajectories to assess how a skill changes agent behavior, then uses Process-Aligned Contrastive Evaluation (PACE) to turn the behavioral differences into actionable edit guidance. Across 89 containerized tasks spanning several professional domains, SkillAudit reached an average task reward of 73.9%, outperforming both agents without skills and agents given static expert skills. The result suggests practitioners could improve agent performance in real-world settings without external validation data.
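
To make the idea of auditing a trajectory pair concrete, here is a minimal sketch. It is not the paper's implementation. It assumes that a pair consists of one run without the skill and one run with it, and that a trajectory can be reduced to a list of action labels. The names `Trajectory`, `AuditFinding`, `first_divergence`, `audit_pair`, and `toy_agent` are all hypothetical, and the PACE step that converts differences into edit guidance is not modeled, since the summary does not describe it.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption: a trajectory is just the ordered action labels an agent emitted.
Trajectory = List[str]


@dataclass
class AuditFinding:
    step: int            # index of the first differing step, or -1 if identical
    without_skill: str   # action taken at that step without the skill
    with_skill: str      # action taken at that step with the skill


def first_divergence(base: Trajectory, skilled: Trajectory) -> AuditFinding:
    """Locate the first step where the two runs differ (no ground truth needed)."""
    for i, (a, b) in enumerate(zip(base, skilled)):
        if a != b:
            return AuditFinding(i, a, b)
    if len(base) != len(skilled):
        # One run is a strict prefix of the other; mark the missing side.
        i = min(len(base), len(skilled))
        a = base[i] if i < len(base) else "<end>"
        b = skilled[i] if i < len(skilled) else "<end>"
        return AuditFinding(i, a, b)
    return AuditFinding(-1, "", "")


def audit_pair(run_agent: Callable[[str, bool], Trajectory], task: str) -> AuditFinding:
    """Run the same task with and without the skill and compare the trajectories."""
    base = run_agent(task, False)
    skilled = run_agent(task, True)
    return first_divergence(base, skilled)


def toy_agent(task: str, use_skill: bool) -> Trajectory:
    """Hypothetical stand-in for a real agent: the skill adds a validation step."""
    steps = ["read_task", "open_file"]
    steps.append("validate_schema" if use_skill else "write_output")
    steps.append("submit")
    return steps


finding = audit_pair(toy_agent, "demo")
print(finding)
```

The sketch only reports where behavior changes, not whether the change is an improvement. Deciding that, and turning it into a skill edit, is the part the summary attributes to PACE.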

agent skills · trajectory auditing