Agents
Regimes: An Auditable, Held-Out-Gated Improvement Loop Demonstrated on LongMemEval with ActiveGraph
The article introduces "Regimes," an auditable, held-out-gated improvement loop for autonomous agents that pairs a controlled workflow with an append-only event log. Demonstrated on the event-sourced ActiveGraph agent runtime, the loop diagnoses evaluation failures and proposes repairs, with each repair gated on held-out performance. It reports held-out accuracy improvements of +0.05 to +0.10 on LongMemEval-S across multiple splits. Because every step is recorded in the log, practitioners can audit how a given improvement was reached and validate it before trusting it.
Tags: autonomous improvement, event sourcing, agent runtime
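
To make the idea of a held-out gate over an append-only log concrete, here is a minimal sketch in Python. It is not the ActiveGraph API or the article's implementation: `EventLog`, `gated_step`, `accuracy`, `toy_predict`, and the toy data are all hypothetical names invented for illustration, and the acceptance rule (accept only if held-out accuracy rises by more than `min_gain`) is an assumed reading of "held-out-gated."

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (question, expected answer)


@dataclass(frozen=True)
class Event:
    seq: int                 # position in the log, assigned on append
    kind: str                # e.g. "proposal_evaluated"
    payload: Dict[str, Any]


class EventLog:
    """Hypothetical append-only log: events can be added and read, never edited."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, kind: str, payload: Dict[str, Any]) -> Event:
        event = Event(len(self._events), kind, dict(payload))
        self._events.append(event)
        return event

    def events(self) -> Tuple[Event, ...]:
        # Return an immutable snapshot so callers cannot rewrite history.
        return tuple(self._events)


def accuracy(config: Any, examples: List[Example],
             predict: Callable[[Any, str], Any]) -> float:
    correct = sum(1 for question, answer in examples
                  if predict(config, question) == answer)
    return correct / len(examples)


def gated_step(config: Any, candidate: Any, held_out: List[Example],
               predict: Callable[[Any, str], Any], log: EventLog,
               min_gain: float = 0.0) -> Any:
    """Keep `candidate` only if it beats `config` on held-out data (assumed rule)."""
    baseline_acc = accuracy(config, held_out, predict)
    candidate_acc = accuracy(candidate, held_out, predict)
    gain = candidate_acc - baseline_acc
    log.append("proposal_evaluated",
               {"baseline": baseline_acc, "candidate": candidate_acc})
    if gain > min_gain:
        log.append("proposal_accepted", {"gain": gain})
        return candidate
    log.append("proposal_rejected", {"gain": gain})
    return config


# Toy stand-ins: a "config" is just a lookup table of answers.
def toy_predict(config: Dict[str, str], question: str) -> Any:
    return config.get(question)


held_out = [("q1", "a"), ("q2", "b"), ("q3", "c"), ("q4", "d")]
baseline = {"q1": "a", "q2": "b"}
better = {"q1": "a", "q2": "b", "q3": "c"}
worse = {"q1": "a"}

log = EventLog()
current = gated_step(baseline, better, held_out, toy_predict, log)  # accepted
current = gated_step(current, worse, held_out, toy_predict, log)    # rejected
```

After both steps, `log.events()` holds the full decision trail (evaluation, acceptance, evaluation, rejection) in order, which is the property that would make such a loop auditable: the accepted configuration can be traced back to the held-out numbers that justified it.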