Training · arXiv cs.AI · 7 d ago

SkillAudit: Ground-Truth-Free Skill Evolution via Paired Trajectory Auditing

SkillAudit is a framework for evolving the skills of large language model (LLM) agents without ground-truth feedback. It uses paired trajectory auditing to assess how a skill changes agent behavior, and applies Process-Aligned Contrastive Evaluation (PACE) to turn the observed behavioral differences into actionable edit guidance for the skill. Across 89 containerized tasks spanning several professional domains, SkillAudit reached an average task reward of 73.9%, outperforming both agents without skills and agents using static expert skills. The result suggests practitioners could improve agent performance in real-world settings without external validation data.
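To make the idea of auditing a pair of trajectories concrete, here is a minimal sketch. It is not SkillAudit's implementation or the PACE procedure, neither of which the summary describes in detail. It assumes, purely for illustration, that a trajectory is a list of action labels, and it uses Python's standard `difflib` to find where a run with the skill diverges from a run without it. The names `Trajectory` and `audit_pair` are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Trajectory:
    """Ordered action labels from one agent run (hypothetical format)."""
    task_id: str
    actions: list[str]


def audit_pair(with_skill: Trajectory, without_skill: Trajectory) -> list[str]:
    """Describe where the skill changed behavior on a single task.

    Only the two trajectories are compared; no ground-truth reward is used.
    This is a plain sequence diff standing in for a real contrastive audit.
    """
    if with_skill.task_id != without_skill.task_id:
        raise ValueError("trajectories must come from the same task")

    matcher = SequenceMatcher(
        a=without_skill.actions, b=with_skill.actions, autojunk=False
    )
    notes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical stretch: the skill had no visible effect here
        removed = without_skill.actions[i1:i2]
        added = with_skill.actions[j1:j2]
        if tag == "insert":
            notes.append(f"skill added steps {added}")
        elif tag == "delete":
            notes.append(f"skill dropped steps {removed}")
        else:  # tag == "replace"
            notes.append(f"skill replaced {removed} with {added}")
    return notes


if __name__ == "__main__":
    baseline = Trajectory("task-1", ["read_file", "edit", "submit"])
    skilled = Trajectory("task-1", ["read_file", "run_tests", "edit", "submit"])
    for note in audit_pair(skilled, baseline):
        print(note)
```

In a real system the divergence notes would presumably be interpreted by a model or a rubric before being turned into skill edits; the diff here only shows the kind of paired signal that is available without labeled outcomes.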

agent skills · trajectory auditing
