Inference · arXiv cs.AI · 10 d ago

Early Diagnosis of Wasted Computation in Multi-Agent LLM Systems via Failure-Aware Observability

The paper presents a framework for diagnosing wasted computation in multi-agent LLM systems, studied on a three-agent architecture: an orchestrator, a search agent, and an execution agent. It uses trace-based, failure-aware observability to produce online signals about computation loops and information gain, complemented by offline semantic grounding metrics. In failed runs, 58.1% of tokens are spent after the initial warning. In a pilot study, warning-based interventions reduced unnecessary token usage from 63.8% to 30.4%, which suggests the framework can help cut wasted compute in such systems.
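To make the headline metric concrete, here is a minimal sketch of how a share of tokens spent after a first warning could be computed from a trace. It is not the paper's implementation: the `Step` record, the repeated-action loop heuristic, the `max_repeats` threshold, and the choice to count only steps strictly after the warning step are all assumptions made for illustration, and the trace values are invented.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Step:
    """One trace entry (hypothetical schema, not taken from the paper)."""
    agent: str    # e.g. "orchestrator", "search", "execution"
    action: str   # normalized description of what the agent did
    tokens: int   # tokens consumed by this step


def first_loop_warning(steps: Sequence[Step], max_repeats: int = 2) -> Optional[int]:
    """Return the index of the first step whose (agent, action) pair has
    occurred more than `max_repeats` times, or None if nothing is flagged.
    This is a simple stand-in for an online loop signal."""
    seen = {}
    for i, step in enumerate(steps):
        key = (step.agent, step.action)
        seen[key] = seen.get(key, 0) + 1
        if seen[key] > max_repeats:
            return i
    return None


def post_warning_token_share(steps: Sequence[Step], warning_index: Optional[int]) -> float:
    """Fraction of all tokens spent strictly after the warning step.
    Returns 0.0 when there is no warning or the trace has no tokens."""
    total = sum(s.tokens for s in steps)
    if warning_index is None or total == 0:
        return 0.0
    after = sum(s.tokens for s in steps[warning_index + 1:])
    return after / total


# Invented example trace: the search agent repeats the same query.
trace = [
    Step("orchestrator", "plan", 100),
    Step("search", "query:foo", 200),
    Step("search", "query:foo", 200),
    Step("search", "query:foo", 200),   # third repeat triggers the warning
    Step("execution", "run", 150),
    Step("search", "query:foo", 150),
]

warning_at = first_loop_warning(trace)
share = post_warning_token_share(trace, warning_at)
print(f"warning at step {warning_at}, post-warning token share {share:.1%}")
```

An intervention in this picture would stop or redirect the run at `warning_at`, so the post-warning share is an upper bound on what such an intervention could save for that run.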

observability · multi-agent · llm
