Training
SkillAudit: Ground-Truth-Free Skill Evolution via Paired Trajectory Auditing
SkillAudit is a framework for evolving agent skills in large language models (LLMs) without relying on ground-truth feedback. It audits paired trajectories to assess how a skill changes agent behavior, then uses Process-Aligned Contrastive Evaluation (PACE) to translate the behavioral differences into actionable edit guidance. Across 89 containerized tasks spanning several professional domains, SkillAudit achieved an average task reward of 73.9%, significantly outperforming both agents without skills and agents using static expert skills. The results suggest that practitioners can improve LLM agent performance in real-world applications without external validation data.
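To make the idea of auditing a pair of trajectories concrete, here is a minimal sketch that compares one run without a skill against one run with it and lists the steps that differ. It is not SkillAudit's implementation or the PACE procedure: the `Trajectory` type, the `audit_pair` and `edit_hints` functions, and the step labels are all hypothetical, and a plain sequence diff stands in for whatever comparison the framework actually performs. Note that no reward or reference answer is consulted, only the two behaviors.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Trajectory:
    """Hypothetical record of one agent run: an ordered list of step labels."""
    steps: list[str]


def audit_pair(without_skill: Trajectory, with_skill: Trajectory) -> dict[str, list[str]]:
    """Diff two runs of the same task and report steps the skill added or dropped."""
    matcher = SequenceMatcher(a=without_skill.steps, b=with_skill.steps, autojunk=False)
    added: list[str] = []
    dropped: list[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            dropped.extend(without_skill.steps[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(with_skill.steps[j1:j2])
    return {"added": added, "dropped": dropped}


def edit_hints(diff: dict[str, list[str]]) -> list[str]:
    """Turn a behavioral diff into plain-text review hints (illustrative wording only)."""
    hints = [f"skill introduced step '{s}': confirm it is intended" for s in diff["added"]]
    hints += [f"skill suppressed step '{s}': check whether it was needed" for s in diff["dropped"]]
    return hints


# Assumed example data: the same task run without and with a skill.
baseline = Trajectory(["read_task", "write_code", "submit"])
skilled = Trajectory(["read_task", "write_code", "run_tests", "submit"])

diff = audit_pair(baseline, skilled)
hints = edit_hints(diff)
for hint in hints:
    print(hint)
```

In this toy pair the only behavioral difference is the extra `run_tests` step, so the audit yields a single hint about it. A real auditor would presumably need richer step representations than string labels.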
agent skills, trajectory auditing