Trace2Policy: From Expert Behavior Traces to Self-Evolving Decision Agents
Trace2Policy introduces Error-driven Iterative Skill Refinement (EISR), a mechanism that systematically improves decision-making rules derived from expert behavior in compliance tasks. After eight EISR iterations, accuracy rose from approximately 70% to 79.6% across five LLMs. Compiling the rules into deterministic Python gave a notable improvement, a 9.8 percentage point gain over LLM prompts. The method emphasizes rule quality over model capability, offers a cost-effective alternative to expert-driven processes, and demonstrates significant transferability to various benchmarks without re-engineering, which is crucial for practitioners building compliance-sensitive AI applications.
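
The idea of refining a deterministic rule from its errors on expert traces can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Rule` class, the `amount` feature, the `refine` heuristic, and the toy traces are hypothetical and are not taken from Trace2Policy. In particular, `refine` is a simple stand-in for whatever refinement step EISR actually uses.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One expert trace: the case features plus the expert's decision (hypothetical schema).
Trace = Tuple[dict, str]


@dataclass(frozen=True)
class Rule:
    """A toy compliance rule compiled to deterministic Python (illustrative only)."""
    threshold: float

    def decide(self, case: dict) -> str:
        # Same input always yields the same decision; no LLM call is involved.
        return "flag" if case["amount"] >= self.threshold else "approve"


def accuracy(rule: Rule, traces: List[Trace]) -> float:
    hits = sum(rule.decide(case) == label for case, label in traces)
    return hits / len(traces)


def collect_errors(rule: Rule, traces: List[Trace]) -> List[Trace]:
    # Cases where the rule disagrees with the expert.
    return [(case, label) for case, label in traces if rule.decide(case) != label]


def refine(rule: Rule, errors: List[Trace]) -> Rule:
    # Assumed stand-in for the refinement step: lower the threshold to the
    # smallest amount the expert flagged but the rule approved.
    missed = [case["amount"] for case, label in errors if label == "flag"]
    return Rule(threshold=min(missed)) if missed else rule


def refine_iteratively(rule: Rule, traces: List[Trace], iterations: int = 8):
    """Error-driven loop: evaluate, collect errors, refine, keep only improvements."""
    history = [accuracy(rule, traces)]
    for _ in range(iterations):
        errors = collect_errors(rule, traces)
        if not errors:
            break
        candidate = refine(rule, errors)
        if accuracy(candidate, traces) > history[-1]:
            rule = candidate
        history.append(accuracy(rule, traces))
    return rule, history


# Toy expert traces (fabricated for the example, not real data).
TRACES: List[Trace] = [
    ({"amount": 100}, "approve"),
    ({"amount": 500}, "approve"),
    ({"amount": 900}, "flag"),
    ({"amount": 1200}, "flag"),
    ({"amount": 2000}, "flag"),
]

final_rule, history = refine_iteratively(Rule(threshold=1500.0), TRACES)
print(final_rule, history)
```

Because the rule is plain Python, each candidate can be re-scored against every trace cheaply and reproducibly, which is what makes an accept-only-if-better loop like this practical.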